* [PATCH] nvme: block ioctls if controller not in a live state
@ 2018-05-07 22:55 James Smart
2018-05-08 7:15 ` Johannes Thumshirn
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: James Smart @ 2018-05-07 22:55 UTC (permalink / raw)
Rather than allow ioctl-based admin cmds to get intermixed on the admin
queue with commands being used to initialize a controller or io commands
to go to a controller in reconnect thus possibly hanging, reject them
if the controller isn't in the LIVE state. Reject with an -EAGAIN status
so that the app knows it could retry.
Signed-off-by: James Smart <james.smart at broadcom.com>
---
drivers/nvme/host/core.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index a3771c5729f5..cfa03b68d2b8 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1041,6 +1041,8 @@ static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio)
return -EFAULT;
if (io.flags)
return -EINVAL;
+ if (ns->ctrl->state != NVME_CTRL_LIVE)
+ return -EAGAIN;
switch (io.opcode) {
case nvme_cmd_write:
@@ -1174,6 +1176,8 @@ static int nvme_user_cmd(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
return -EFAULT;
if (cmd.flags)
return -EINVAL;
+ if (ctrl->state != NVME_CTRL_LIVE)
+ return -EAGAIN;
memset(&c, 0, sizeof(c));
c.common.opcode = cmd.opcode;
--
2.13.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH] nvme: block ioctls if controller not in a live state
2018-05-07 22:55 [PATCH] nvme: block ioctls if controller not in a live state James Smart
@ 2018-05-08 7:15 ` Johannes Thumshirn
2018-05-09 23:20 ` James Smart
2018-05-09 5:11 ` Christoph Hellwig
2018-05-14 15:23 ` Keith Busch
2 siblings, 1 reply; 14+ messages in thread
From: Johannes Thumshirn @ 2018-05-08 7:15 UTC (permalink / raw)
On Mon, May 07, 2018@03:55:58PM -0700, James Smart wrote:
> Rather than allow ioctl-based admin cmds to get intermixed on the admin
> queue with commands being used to initialize a controller or io commands
> to go to a controller in reconnect thus possibly hanging, reject them
> if the controller isn't in the LIVE state. Reject with an -EAGAIN status
> so that the app knows it could retry.
Shouldn't the nvmf_check_if_read() calls in the transport drivers'
->queue_rq() callouts handle this?
--
Johannes Thumshirn Storage
jthumshirn at suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: Felix Imend?rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N?rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] nvme: block ioctls if controller not in a live state
2018-05-07 22:55 [PATCH] nvme: block ioctls if controller not in a live state James Smart
2018-05-08 7:15 ` Johannes Thumshirn
@ 2018-05-09 5:11 ` Christoph Hellwig
2018-05-09 16:29 ` James Smart
2018-05-14 15:23 ` Keith Busch
2 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2018-05-09 5:11 UTC (permalink / raw)
On Mon, May 07, 2018@03:55:58PM -0700, James Smart wrote:
> Rather than allow ioctl-based admin cmds to get intermixed on the admin
> queue with commands being used to initialize a controller or io commands
> to go to a controller in reconnect thus possibly hanging, reject them
> if the controller isn't in the LIVE state. Reject with an -EAGAIN status
> so that the app knows it could retry.
-EAGAIN implies that you can poll for the fd to be ready, so I think
this is the wrong error code here.
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] nvme: block ioctls if controller not in a live state
2018-05-09 5:11 ` Christoph Hellwig
@ 2018-05-09 16:29 ` James Smart
2018-05-14 14:17 ` Christoph Hellwig
0 siblings, 1 reply; 14+ messages in thread
From: James Smart @ 2018-05-09 16:29 UTC (permalink / raw)
On 5/8/2018 10:11 PM, Christoph Hellwig wrote:
> On Mon, May 07, 2018@03:55:58PM -0700, James Smart wrote:
>> Rather than allow ioctl-based admin cmds to get intermixed on the admin
>> queue with commands being used to initialize a controller or io commands
>> to go to a controller in reconnect thus possibly hanging, reject them
>> if the controller isn't in the LIVE state. Reject with an -EAGAIN status
>> so that the app knows it could retry.
> -EAGAIN implies that you can poll for the fd to be ready, so I think
> this is the wrong error code here.
That is the exact intent.? I do want the app to trigger off -EAGAIN in
those two states and try again after a delay./? Perhaps I should change
from !LIVE to NEW || RESETTING || CONNECTING to be more explicit.? I
assumed !LIVE was sufficient as the other states should transition to
LIVE or the delete path is taken which would remove the fd all together.
-- james
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] nvme: block ioctls if controller not in a live state
2018-05-08 7:15 ` Johannes Thumshirn
@ 2018-05-09 23:20 ` James Smart
2018-05-11 5:13 ` Johannes Thumshirn
0 siblings, 1 reply; 14+ messages in thread
From: James Smart @ 2018-05-09 23:20 UTC (permalink / raw)
On 5/8/2018 12:15 AM, Johannes Thumshirn wrote:
> On Mon, May 07, 2018@03:55:58PM -0700, James Smart wrote:
>> Rather than allow ioctl-based admin cmds to get intermixed on the admin
>> queue with commands being used to initialize a controller or io commands
>> to go to a controller in reconnect thus possibly hanging, reject them
>> if the controller isn't in the LIVE state. Reject with an -EAGAIN status
>> so that the app knows it could retry.
>
> Shouldn't the nvmf_check_if_read() calls in the transport drivers'
> ->queue_rq() callouts handle this?
>
yes - but if things reach that layer, the return status for the ioctl
can be quite different. (if I read things right) The core routines are
coded so that if there's a hard error, a -Exxx status will be returned.
Otherwise, it returns the nvme_req(req)->status value (assumed positive
big value). Which means, if the if_ready checks reject it, it'll go back
with an NVME_SC_ABORT_REQ status. Which means the cli gets a convoluted
error response. For the cases where it would be blocked by controller
state, I want an -EAGAIN status so that the app knows it can retry and
it should succeed sometime in the future. Any other status - the app
gives up and accepts the status value.
-- james
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] nvme: block ioctls if controller not in a live state
2018-05-09 23:20 ` James Smart
@ 2018-05-11 5:13 ` Johannes Thumshirn
0 siblings, 0 replies; 14+ messages in thread
From: Johannes Thumshirn @ 2018-05-11 5:13 UTC (permalink / raw)
On Wed, May 09, 2018@04:20:19PM -0700, James Smart wrote:
> yes - but if things reach that layer, the return status for the ioctl can be
> quite different. (if I read things right) The core routines are coded so
> that if there's a hard error, a -Exxx status will be returned. Otherwise, it
> returns the nvme_req(req)->status value (assumed positive big value). Which
> means, if the if_ready checks reject it, it'll go back with an
> NVME_SC_ABORT_REQ status. Which means the cli gets a convoluted error
> response. For the cases where it would be blocked by controller state, I
> want an -EAGAIN status so that the app knows it can retry and it should
> succeed sometime in the future. Any other status - the app gives up and
> accepts the status value.
Ah OK, thanks for the clarification.
Reviewed-by: Johannes Thumshirn <jthumshirn at suse.de>
--
Johannes Thumshirn Storage
jthumshirn at suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: Felix Imend?rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N?rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] nvme: block ioctls if controller not in a live state
2018-05-09 16:29 ` James Smart
@ 2018-05-14 14:17 ` Christoph Hellwig
2018-05-14 14:47 ` James Smart
0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2018-05-14 14:17 UTC (permalink / raw)
On Wed, May 09, 2018@09:29:13AM -0700, James Smart wrote:
>
> That is the exact intent.? I do want the app to trigger off -EAGAIN in those
> two states and try again after a delay./? Perhaps I should change from !LIVE
> to NEW || RESETTING || CONNECTING to be more explicit.? I assumed !LIVE was
> sufficient as the other states should transition to LIVE or the delete path
> is taken which would remove the fd all together.
The problem is that userspace reasonably expects to be able to call
poll/select when used on a character device and it gets -EAGAIN.
Maybe the right solution here is to just do a wait_event_interruptible to
wait for the controller being live.
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] nvme: block ioctls if controller not in a live state
2018-05-14 14:17 ` Christoph Hellwig
@ 2018-05-14 14:47 ` James Smart
2018-05-14 14:53 ` Christoph Hellwig
0 siblings, 1 reply; 14+ messages in thread
From: James Smart @ 2018-05-14 14:47 UTC (permalink / raw)
On 5/14/2018 7:17 AM, Christoph Hellwig wrote:
> On Wed, May 09, 2018@09:29:13AM -0700, James Smart wrote:
>> That is the exact intent.? I do want the app to trigger off -EAGAIN in those
>> two states and try again after a delay./? Perhaps I should change from !LIVE
>> to NEW || RESETTING || CONNECTING to be more explicit.? I assumed !LIVE was
>> sufficient as the other states should transition to LIVE or the delete path
>> is taken which would remove the fd all together.
> The problem is that userspace reasonably expects to be able to call
> poll/select when used on a character device and it gets -EAGAIN.
What's stopping us from adding poll support to the dev node ?
>
> Maybe the right solution here is to just do a wait_event_interruptible to
> wait for the controller being live.
That's doable. I don't like hanging the app up for what could be 60s or
so though (controller loss timeout).
--james
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] nvme: block ioctls if controller not in a live state
2018-05-14 14:47 ` James Smart
@ 2018-05-14 14:53 ` Christoph Hellwig
2018-06-12 23:11 ` James Smart
0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2018-05-14 14:53 UTC (permalink / raw)
On Mon, May 14, 2018@07:47:46AM -0700, James Smart wrote:
>
>
> On 5/14/2018 7:17 AM, Christoph Hellwig wrote:
> > On Wed, May 09, 2018@09:29:13AM -0700, James Smart wrote:
> > > That is the exact intent.? I do want the app to trigger off -EAGAIN in those
> > > two states and try again after a delay./? Perhaps I should change from !LIVE
> > > to NEW || RESETTING || CONNECTING to be more explicit.? I assumed !LIVE was
> > > sufficient as the other states should transition to LIVE or the delete path
> > > is taken which would remove the fd all together.
> > The problem is that userspace reasonably expects to be able to call
> > poll/select when used on a character device and it gets -EAGAIN.
>
> What's stopping us from adding poll support to the dev node ?
A volunteer to write the code.
>
> >
> > Maybe the right solution here is to just do a wait_event_interruptible to
> > wait for the controller being live.
>
> That's doable. I don't like hanging the app up for what could be 60s or so
> though (controller loss timeout).
Then you need to poll. We should only do that if opened with O_NONBLOCK
and otherwise wait for the event, similar to the pattern used in a lot
of character devices.
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] nvme: block ioctls if controller not in a live state
2018-05-07 22:55 [PATCH] nvme: block ioctls if controller not in a live state James Smart
2018-05-08 7:15 ` Johannes Thumshirn
2018-05-09 5:11 ` Christoph Hellwig
@ 2018-05-14 15:23 ` Keith Busch
2018-05-14 15:48 ` James Smart
2 siblings, 1 reply; 14+ messages in thread
From: Keith Busch @ 2018-05-14 15:23 UTC (permalink / raw)
On Mon, May 07, 2018@03:55:58PM -0700, James Smart wrote:
> Rather than allow ioctl-based admin cmds to get intermixed on the admin
> queue with commands being used to initialize a controller or io commands
> to go to a controller in reconnect thus possibly hanging, reject them
> if the controller isn't in the LIVE state. Reject with an -EAGAIN status
> so that the app knows it could retry.
The admin ioctl is fine to use in ADMIN_ONLY state as well.
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] nvme: block ioctls if controller not in a live state
2018-05-14 15:23 ` Keith Busch
@ 2018-05-14 15:48 ` James Smart
0 siblings, 0 replies; 14+ messages in thread
From: James Smart @ 2018-05-14 15:48 UTC (permalink / raw)
On 5/14/2018 8:23 AM, Keith Busch wrote:
> On Mon, May 07, 2018@03:55:58PM -0700, James Smart wrote:
>> Rather than allow ioctl-based admin cmds to get intermixed on the admin
>> queue with commands being used to initialize a controller or io commands
>> to go to a controller in reconnect thus possibly hanging, reject them
>> if the controller isn't in the LIVE state. Reject with an -EAGAIN status
>> so that the app knows it could retry.
>
> The admin ioctl is fine to use in ADMIN_ONLY state as well.
>
Yep, I would need to take that state into account. For now, looks like
I'll be adding poll support and reposting.
-- james
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] nvme: block ioctls if controller not in a live state
2018-05-14 14:53 ` Christoph Hellwig
@ 2018-06-12 23:11 ` James Smart
2018-06-13 8:09 ` Christoph Hellwig
0 siblings, 1 reply; 14+ messages in thread
From: James Smart @ 2018-06-12 23:11 UTC (permalink / raw)
On 5/14/2018 7:53 AM, Christoph Hellwig wrote:
> On Mon, May 14, 2018@07:47:46AM -0700, James Smart wrote:
>>
>> On 5/14/2018 7:17 AM, Christoph Hellwig wrote:
>>> On Wed, May 09, 2018@09:29:13AM -0700, James Smart wrote:
>>>> That is the exact intent.? I do want the app to trigger off -EAGAIN in those
>>>> two states and try again after a delay./? Perhaps I should change from !LIVE
>>>> to NEW || RESETTING || CONNECTING to be more explicit.? I assumed !LIVE was
>>>> sufficient as the other states should transition to LIVE or the delete path
>>>> is taken which would remove the fd all together.
>>> The problem is that userspace reasonably expects to be able to call
>>> poll/select when used on a character device and it gets -EAGAIN.
>> What's stopping us from adding poll support to the dev node ?
> A volunteer to write the code.
>
>>> Maybe the right solution here is to just do a wait_event_interruptible to
>>> wait for the controller being live.
>> That's doable. I don't like hanging the app up for what could be 60s or so
>> though (controller loss timeout).
> Then you need to poll. We should only do that if opened with O_NONBLOCK
> and otherwise wait for the event, similar to the pattern used in a lot
> of character devices.
Christoph,
I was going to address this by adding the poll support.?? However, after
reviewing the if_ready changes, they will fall into the USERCMD cases
which will return the BLK_STST_IOERR and NVME_SC_ABORT_REQ.
In the nvme cli app, rather than keying retry off the EAGAIN, we could
key off the ABORT_REQ status. I didn't like that initially as it may
correspond to a real scenario that wasn't one of the
reject-as-not-in-a-state-for-an-ioctl cases.
What do you think - continue with the poll interface and the EAGAIN
block status, or revise the app EAGAIN check to NVME_SC_ABORT_REQ ?
-- james
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] nvme: block ioctls if controller not in a live state
2018-06-12 23:11 ` James Smart
@ 2018-06-13 8:09 ` Christoph Hellwig
2018-06-13 14:56 ` James Smart
0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2018-06-13 8:09 UTC (permalink / raw)
On Tue, Jun 12, 2018@04:11:14PM -0700, James Smart wrote:
> I was going to address this by adding the poll support.?? However, after
> reviewing the if_ready changes, they will fall into the USERCMD cases which
> will return the BLK_STST_IOERR and NVME_SC_ABORT_REQ.
>
> In the nvme cli app, rather than keying retry off the EAGAIN, we could key
> off the ABORT_REQ status. I didn't like that initially as it may correspond
> to a real scenario that wasn't one of the
> reject-as-not-in-a-state-for-an-ioctl cases.
>
> What do you think - continue with the poll interface and the EAGAIN block
> status, or revise the app EAGAIN check to NVME_SC_ABORT_REQ ?
How about just having an option in the ioctl interface to not mark
the I/O failfast?
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] nvme: block ioctls if controller not in a live state
2018-06-13 8:09 ` Christoph Hellwig
@ 2018-06-13 14:56 ` James Smart
0 siblings, 0 replies; 14+ messages in thread
From: James Smart @ 2018-06-13 14:56 UTC (permalink / raw)
On 6/13/2018 1:09 AM, Christoph Hellwig wrote:
> On Tue, Jun 12, 2018@04:11:14PM -0700, James Smart wrote:
>> I was going to address this by adding the poll support.?? However, after
>> reviewing the if_ready changes, they will fall into the USERCMD cases which
>> will return the BLK_STST_IOERR and NVME_SC_ABORT_REQ.
>>
>> In the nvme cli app, rather than keying retry off the EAGAIN, we could key
>> off the ABORT_REQ status. I didn't like that initially as it may correspond
>> to a real scenario that wasn't one of the
>> reject-as-not-in-a-state-for-an-ioctl cases.
>>
>> What do you think - continue with the poll interface and the EAGAIN block
>> status, or revise the app EAGAIN check to NVME_SC_ABORT_REQ ?
>
> How about just having an option in the ioctl interface to not mark
> the I/O failfast?
>
with the gist being you want to suspend the ioctl ?
I've already seen enough user feedback to say the ioctl shouldn't hang.
So I agree with them being marked failfast.
-- james
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2018-06-13 14:56 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-07 22:55 [PATCH] nvme: block ioctls if controller not in a live state James Smart
2018-05-08 7:15 ` Johannes Thumshirn
2018-05-09 23:20 ` James Smart
2018-05-11 5:13 ` Johannes Thumshirn
2018-05-09 5:11 ` Christoph Hellwig
2018-05-09 16:29 ` James Smart
2018-05-14 14:17 ` Christoph Hellwig
2018-05-14 14:47 ` James Smart
2018-05-14 14:53 ` Christoph Hellwig
2018-06-12 23:11 ` James Smart
2018-06-13 8:09 ` Christoph Hellwig
2018-06-13 14:56 ` James Smart
2018-05-14 15:23 ` Keith Busch
2018-05-14 15:48 ` James Smart
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.