All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] nvme: block ioctls if controller not in a live state
@ 2018-05-07 22:55 James Smart
  2018-05-08  7:15 ` Johannes Thumshirn
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: James Smart @ 2018-05-07 22:55 UTC (permalink / raw)


Rather than allow ioctl-based admin cmds to get intermixed on the admin
queue with commands being used to initialize a controller or io commands
to go to a controller in reconnect thus possibly hanging, reject them
if the controller isn't in the LIVE state. Reject with an -EAGAIN status
so that the app knows it could retry.

Signed-off-by: James Smart <james.smart at broadcom.com>
---
 drivers/nvme/host/core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index a3771c5729f5..cfa03b68d2b8 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1041,6 +1041,8 @@ static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio)
 		return -EFAULT;
 	if (io.flags)
 		return -EINVAL;
+	if (ns->ctrl->state != NVME_CTRL_LIVE)
+		return -EAGAIN;
 
 	switch (io.opcode) {
 	case nvme_cmd_write:
@@ -1174,6 +1176,8 @@ static int nvme_user_cmd(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 		return -EFAULT;
 	if (cmd.flags)
 		return -EINVAL;
+	if (ctrl->state != NVME_CTRL_LIVE)
+		return -EAGAIN;
 
 	memset(&c, 0, sizeof(c));
 	c.common.opcode = cmd.opcode;
-- 
2.13.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH] nvme: block ioctls if controller not in a live state
  2018-05-07 22:55 [PATCH] nvme: block ioctls if controller not in a live state James Smart
@ 2018-05-08  7:15 ` Johannes Thumshirn
  2018-05-09 23:20   ` James Smart
  2018-05-09  5:11 ` Christoph Hellwig
  2018-05-14 15:23 ` Keith Busch
  2 siblings, 1 reply; 14+ messages in thread
From: Johannes Thumshirn @ 2018-05-08  7:15 UTC (permalink / raw)


On Mon, May 07, 2018@03:55:58PM -0700, James Smart wrote:
> Rather than allow ioctl-based admin cmds to get intermixed on the admin
> queue with commands being used to initialize a controller or io commands
> to go to a controller in reconnect thus possibly hanging, reject them
> if the controller isn't in the LIVE state. Reject with an -EAGAIN status
> so that the app knows it could retry.

Shouldn't the nvmf_check_if_read() calls in the transport drivers'
->queue_rq() callouts handle this?

-- 
Johannes Thumshirn                                          Storage
jthumshirn at suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: Felix Imend?rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N?rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] nvme: block ioctls if controller not in a live state
  2018-05-07 22:55 [PATCH] nvme: block ioctls if controller not in a live state James Smart
  2018-05-08  7:15 ` Johannes Thumshirn
@ 2018-05-09  5:11 ` Christoph Hellwig
  2018-05-09 16:29   ` James Smart
  2018-05-14 15:23 ` Keith Busch
  2 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2018-05-09  5:11 UTC (permalink / raw)


On Mon, May 07, 2018@03:55:58PM -0700, James Smart wrote:
> Rather than allow ioctl-based admin cmds to get intermixed on the admin
> queue with commands being used to initialize a controller or io commands
> to go to a controller in reconnect thus possibly hanging, reject them
> if the controller isn't in the LIVE state. Reject with an -EAGAIN status
> so that the app knows it could retry.

-EAGAIN implies that you can poll for the fd to be ready, so I think
this is the wrong error code here.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] nvme: block ioctls if controller not in a live state
  2018-05-09  5:11 ` Christoph Hellwig
@ 2018-05-09 16:29   ` James Smart
  2018-05-14 14:17     ` Christoph Hellwig
  0 siblings, 1 reply; 14+ messages in thread
From: James Smart @ 2018-05-09 16:29 UTC (permalink / raw)




On 5/8/2018 10:11 PM, Christoph Hellwig wrote:
> On Mon, May 07, 2018@03:55:58PM -0700, James Smart wrote:
>> Rather than allow ioctl-based admin cmds to get intermixed on the admin
>> queue with commands being used to initialize a controller or io commands
>> to go to a controller in reconnect thus possibly hanging, reject them
>> if the controller isn't in the LIVE state. Reject with an -EAGAIN status
>> so that the app knows it could retry.
> -EAGAIN implies that you can poll for the fd to be ready, so I think
> this is the wrong error code here.

That is the exact intent.? I do want the app to trigger off -EAGAIN in 
those two states and try again after a delay./? Perhaps I should change 
from !LIVE to NEW || RESETTING || CONNECTING to be more explicit.? I 
assumed !LIVE was sufficient as the other states should transition to 
LIVE or the delete path is taken which would remove the fd all together.

-- james

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] nvme: block ioctls if controller not in a live state
  2018-05-08  7:15 ` Johannes Thumshirn
@ 2018-05-09 23:20   ` James Smart
  2018-05-11  5:13     ` Johannes Thumshirn
  0 siblings, 1 reply; 14+ messages in thread
From: James Smart @ 2018-05-09 23:20 UTC (permalink / raw)


On 5/8/2018 12:15 AM, Johannes Thumshirn wrote:
> On Mon, May 07, 2018@03:55:58PM -0700, James Smart wrote:
>> Rather than allow ioctl-based admin cmds to get intermixed on the admin
>> queue with commands being used to initialize a controller or io commands
>> to go to a controller in reconnect thus possibly hanging, reject them
>> if the controller isn't in the LIVE state. Reject with an -EAGAIN status
>> so that the app knows it could retry.
> 
> Shouldn't the nvmf_check_if_read() calls in the transport drivers'
> ->queue_rq() callouts handle this?
> 


yes - but if things reach that layer, the return status for the ioctl 
can be quite different.  (if I read things right) The core routines are 
coded so that if there's a hard error, a -Exxx status will be returned. 
Otherwise, it returns the nvme_req(req)->status value (assumed positive 
big value). Which means, if the if_ready checks reject it, it'll go back 
with an NVME_SC_ABORT_REQ status. Which means the cli gets a convoluted 
error response. For the cases where it would be blocked by controller 
state, I want an -EAGAIN status so that the app knows it can retry and 
it should succeed sometime in the future.  Any other status - the app 
gives up and accepts the status value.

-- james

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] nvme: block ioctls if controller not in a live state
  2018-05-09 23:20   ` James Smart
@ 2018-05-11  5:13     ` Johannes Thumshirn
  0 siblings, 0 replies; 14+ messages in thread
From: Johannes Thumshirn @ 2018-05-11  5:13 UTC (permalink / raw)


On Wed, May 09, 2018@04:20:19PM -0700, James Smart wrote:
> yes - but if things reach that layer, the return status for the ioctl can be
> quite different.  (if I read things right) The core routines are coded so
> that if there's a hard error, a -Exxx status will be returned. Otherwise, it
> returns the nvme_req(req)->status value (assumed positive big value). Which
> means, if the if_ready checks reject it, it'll go back with an
> NVME_SC_ABORT_REQ status. Which means the cli gets a convoluted error
> response. For the cases where it would be blocked by controller state, I
> want an -EAGAIN status so that the app knows it can retry and it should
> succeed sometime in the future.  Any other status - the app gives up and
> accepts the status value.

Ah OK, thanks for the clarification.

Reviewed-by: Johannes Thumshirn <jthumshirn at suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn at suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: Felix Imend?rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N?rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] nvme: block ioctls if controller not in a live state
  2018-05-09 16:29   ` James Smart
@ 2018-05-14 14:17     ` Christoph Hellwig
  2018-05-14 14:47       ` James Smart
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2018-05-14 14:17 UTC (permalink / raw)


On Wed, May 09, 2018@09:29:13AM -0700, James Smart wrote:
> 
> That is the exact intent.? I do want the app to trigger off -EAGAIN in those
> two states and try again after a delay./? Perhaps I should change from !LIVE
> to NEW || RESETTING || CONNECTING to be more explicit.? I assumed !LIVE was
> sufficient as the other states should transition to LIVE or the delete path
> is taken which would remove the fd all together.

The problem is that userspace reasonably expects to be able to call
poll/select when used on a character device and it gets -EAGAIN.

Maybe the right solution here is to just do a wait_event_interruptible to
wait for the controller being live.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] nvme: block ioctls if controller not in a live state
  2018-05-14 14:17     ` Christoph Hellwig
@ 2018-05-14 14:47       ` James Smart
  2018-05-14 14:53         ` Christoph Hellwig
  0 siblings, 1 reply; 14+ messages in thread
From: James Smart @ 2018-05-14 14:47 UTC (permalink / raw)




On 5/14/2018 7:17 AM, Christoph Hellwig wrote:
> On Wed, May 09, 2018@09:29:13AM -0700, James Smart wrote:
>> That is the exact intent.? I do want the app to trigger off -EAGAIN in those
>> two states and try again after a delay./? Perhaps I should change from !LIVE
>> to NEW || RESETTING || CONNECTING to be more explicit.? I assumed !LIVE was
>> sufficient as the other states should transition to LIVE or the delete path
>> is taken which would remove the fd all together.
> The problem is that userspace reasonably expects to be able to call
> poll/select when used on a character device and it gets -EAGAIN.

What's stopping us from adding poll support to the dev node ?


>
> Maybe the right solution here is to just do a wait_event_interruptible to
> wait for the controller being live.

That's doable. I don't like hanging the app up for what could be 60s or 
so though (controller loss timeout).

--james

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] nvme: block ioctls if controller not in a live state
  2018-05-14 14:47       ` James Smart
@ 2018-05-14 14:53         ` Christoph Hellwig
  2018-06-12 23:11           ` James Smart
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2018-05-14 14:53 UTC (permalink / raw)


On Mon, May 14, 2018@07:47:46AM -0700, James Smart wrote:
> 
> 
> On 5/14/2018 7:17 AM, Christoph Hellwig wrote:
> > On Wed, May 09, 2018@09:29:13AM -0700, James Smart wrote:
> > > That is the exact intent.? I do want the app to trigger off -EAGAIN in those
> > > two states and try again after a delay./? Perhaps I should change from !LIVE
> > > to NEW || RESETTING || CONNECTING to be more explicit.? I assumed !LIVE was
> > > sufficient as the other states should transition to LIVE or the delete path
> > > is taken which would remove the fd all together.
> > The problem is that userspace reasonably expects to be able to call
> > poll/select when used on a character device and it gets -EAGAIN.
> 
> What's stopping us from adding poll support to the dev node ?

A volunteer to write the code.

> 
> > 
> > Maybe the right solution here is to just do a wait_event_interruptible to
> > wait for the controller being live.
> 
> That's doable. I don't like hanging the app up for what could be 60s or so
> though (controller loss timeout).

Then you need to poll.  We should only do that if opened with O_NONBLOCK
and otherwise wait for the event, similar to the pattern used in a lot
of character devices.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] nvme: block ioctls if controller not in a live state
  2018-05-07 22:55 [PATCH] nvme: block ioctls if controller not in a live state James Smart
  2018-05-08  7:15 ` Johannes Thumshirn
  2018-05-09  5:11 ` Christoph Hellwig
@ 2018-05-14 15:23 ` Keith Busch
  2018-05-14 15:48   ` James Smart
  2 siblings, 1 reply; 14+ messages in thread
From: Keith Busch @ 2018-05-14 15:23 UTC (permalink / raw)


On Mon, May 07, 2018@03:55:58PM -0700, James Smart wrote:
> Rather than allow ioctl-based admin cmds to get intermixed on the admin
> queue with commands being used to initialize a controller or io commands
> to go to a controller in reconnect thus possibly hanging, reject them
> if the controller isn't in the LIVE state. Reject with an -EAGAIN status
> so that the app knows it could retry.

The admin ioctl is fine to use in ADMIN_ONLY state as well.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] nvme: block ioctls if controller not in a live state
  2018-05-14 15:23 ` Keith Busch
@ 2018-05-14 15:48   ` James Smart
  0 siblings, 0 replies; 14+ messages in thread
From: James Smart @ 2018-05-14 15:48 UTC (permalink / raw)


On 5/14/2018 8:23 AM, Keith Busch wrote:
> On Mon, May 07, 2018@03:55:58PM -0700, James Smart wrote:
>> Rather than allow ioctl-based admin cmds to get intermixed on the admin
>> queue with commands being used to initialize a controller or io commands
>> to go to a controller in reconnect thus possibly hanging, reject them
>> if the controller isn't in the LIVE state. Reject with an -EAGAIN status
>> so that the app knows it could retry.
> 
> The admin ioctl is fine to use in ADMIN_ONLY state as well.
> 

Yep, I would need to take that state into account. For now, looks like 
I'll be adding poll support and reposting.

-- james

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] nvme: block ioctls if controller not in a live state
  2018-05-14 14:53         ` Christoph Hellwig
@ 2018-06-12 23:11           ` James Smart
  2018-06-13  8:09             ` Christoph Hellwig
  0 siblings, 1 reply; 14+ messages in thread
From: James Smart @ 2018-06-12 23:11 UTC (permalink / raw)


On 5/14/2018 7:53 AM, Christoph Hellwig wrote:
> On Mon, May 14, 2018@07:47:46AM -0700, James Smart wrote:
>>
>> On 5/14/2018 7:17 AM, Christoph Hellwig wrote:
>>> On Wed, May 09, 2018@09:29:13AM -0700, James Smart wrote:
>>>> That is the exact intent.? I do want the app to trigger off -EAGAIN in those
>>>> two states and try again after a delay./? Perhaps I should change from !LIVE
>>>> to NEW || RESETTING || CONNECTING to be more explicit.? I assumed !LIVE was
>>>> sufficient as the other states should transition to LIVE or the delete path
>>>> is taken which would remove the fd all together.
>>> The problem is that userspace reasonably expects to be able to call
>>> poll/select when used on a character device and it gets -EAGAIN.
>> What's stopping us from adding poll support to the dev node ?
> A volunteer to write the code.
>
>>> Maybe the right solution here is to just do a wait_event_interruptible to
>>> wait for the controller being live.
>> That's doable. I don't like hanging the app up for what could be 60s or so
>> though (controller loss timeout).
> Then you need to poll.  We should only do that if opened with O_NONBLOCK
> and otherwise wait for the event, similar to the pattern used in a lot
> of character devices.

Christoph,

I was going to address this by adding the poll support.?? However, after 
reviewing the if_ready changes, they will fall into the USERCMD cases 
which will return the BLK_STST_IOERR and NVME_SC_ABORT_REQ.

In the nvme cli app, rather than keying retry off the EAGAIN, we could 
key off the ABORT_REQ status. I didn't like that initially as it may 
correspond to a real scenario that wasn't one of the 
reject-as-not-in-a-state-for-an-ioctl cases.

What do you think - continue with the poll interface and the EAGAIN 
block status, or revise the app EAGAIN check to NVME_SC_ABORT_REQ ?

-- james

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] nvme: block ioctls if controller not in a live state
  2018-06-12 23:11           ` James Smart
@ 2018-06-13  8:09             ` Christoph Hellwig
  2018-06-13 14:56               ` James Smart
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2018-06-13  8:09 UTC (permalink / raw)


On Tue, Jun 12, 2018@04:11:14PM -0700, James Smart wrote:
> I was going to address this by adding the poll support.?? However, after
> reviewing the if_ready changes, they will fall into the USERCMD cases which
> will return the BLK_STST_IOERR and NVME_SC_ABORT_REQ.
> 
> In the nvme cli app, rather than keying retry off the EAGAIN, we could key
> off the ABORT_REQ status. I didn't like that initially as it may correspond
> to a real scenario that wasn't one of the
> reject-as-not-in-a-state-for-an-ioctl cases.
> 
> What do you think - continue with the poll interface and the EAGAIN block
> status, or revise the app EAGAIN check to NVME_SC_ABORT_REQ ?

How about just having an option in the ioctl interface to not mark
the I/O failfast?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] nvme: block ioctls if controller not in a live state
  2018-06-13  8:09             ` Christoph Hellwig
@ 2018-06-13 14:56               ` James Smart
  0 siblings, 0 replies; 14+ messages in thread
From: James Smart @ 2018-06-13 14:56 UTC (permalink / raw)


On 6/13/2018 1:09 AM, Christoph Hellwig wrote:
> On Tue, Jun 12, 2018@04:11:14PM -0700, James Smart wrote:
>> I was going to address this by adding the poll support.?? However, after
>> reviewing the if_ready changes, they will fall into the USERCMD cases which
>> will return the BLK_STST_IOERR and NVME_SC_ABORT_REQ.
>>
>> In the nvme cli app, rather than keying retry off the EAGAIN, we could key
>> off the ABORT_REQ status. I didn't like that initially as it may correspond
>> to a real scenario that wasn't one of the
>> reject-as-not-in-a-state-for-an-ioctl cases.
>>
>> What do you think - continue with the poll interface and the EAGAIN block
>> status, or revise the app EAGAIN check to NVME_SC_ABORT_REQ ?
> 
> How about just having an option in the ioctl interface to not mark
> the I/O failfast?
> 

with the gist being you want to suspend the ioctl ?

I've already seen enough user feedback to say the ioctl shouldn't hang. 
So I agree with them being marked failfast.

-- james

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-06-13 14:56 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-07 22:55 [PATCH] nvme: block ioctls if controller not in a live state James Smart
2018-05-08  7:15 ` Johannes Thumshirn
2018-05-09 23:20   ` James Smart
2018-05-11  5:13     ` Johannes Thumshirn
2018-05-09  5:11 ` Christoph Hellwig
2018-05-09 16:29   ` James Smart
2018-05-14 14:17     ` Christoph Hellwig
2018-05-14 14:47       ` James Smart
2018-05-14 14:53         ` Christoph Hellwig
2018-06-12 23:11           ` James Smart
2018-06-13  8:09             ` Christoph Hellwig
2018-06-13 14:56               ` James Smart
2018-05-14 15:23 ` Keith Busch
2018-05-14 15:48   ` James Smart

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.