* [PATCH] nvme: do not ignore nvme status in nvme_set_queue_count()
@ 2021-01-21 9:50 Hannes Reinecke
2021-01-21 20:03 ` Chaitanya Kulkarni
2021-01-21 20:14 ` Keith Busch
0 siblings, 2 replies; 7+ messages in thread
From: Hannes Reinecke @ 2021-01-21 9:50 UTC (permalink / raw)
To: Sagi Grimberg; +Cc: linux-nvme, Christoph Hellwig, Keith Busch, Hannes Reinecke
If the call to nvme_set_queue_count() fails with a status, we should
not ignore it but rather pass it on to the caller.
It's then up to the transport to decide whether to ignore it
(like PCI does) or to reset the connection (as would be appropriate
for fabrics).
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
drivers/nvme/host/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index ce1b61519441..ddf32f5b4534 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1486,7 +1486,7 @@ int nvme_set_queue_count(struct nvme_ctrl *ctrl, int *count)
*count = min(*count, nr_io_queues);
}
- return 0;
+ return status;
}
EXPORT_SYMBOL_GPL(nvme_set_queue_count);
--
2.26.2
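For context, the function being patched looks roughly like this. This is a simplified, self-contained mock for illustration: nvme_set_features() is stubbed out, the helpers and field encodings are reduced to plain C, and details of the real drivers/nvme/host/core.c may differ.

```c
#include <stdio.h>

/*
 * Simplified mock of nvme_set_queue_count(). The real function issues a
 * Set Features (Number of Queues) admin command; completion dword 0 encodes
 * (allocated submission queues - 1) in bits 15:0 and (allocated completion
 * queues - 1) in bits 31:16. nvme_set_features() below is a stub so the
 * example is runnable on its own.
 */
static int mock_status;          /* stubbed outcome: <0 transport error,
                                    >0 NVMe status, 0 success */
static unsigned int mock_result; /* stubbed completion dword 0 */

static int nvme_set_features(unsigned int q_count, unsigned int *result)
{
	(void)q_count;
	*result = mock_result;
	return mock_status;
}

static int nvme_set_queue_count(int *count)
{
	unsigned int q_count = (*count - 1) | ((*count - 1) << 16);
	unsigned int result;
	int status, nr_io_queues;

	status = nvme_set_features(q_count, &result);
	if (status < 0)
		return status;

	if (status > 0) {
		/* Degraded controller: keep it reachable via the admin
		 * queue, but with zero I/O queues. */
		*count = 0;
	} else {
		/* Usable I/O queue count is min(allocated SQs, allocated CQs). */
		nr_io_queues = (result & 0xffff) < (result >> 16) ?
				(int)(result & 0xffff) : (int)(result >> 16);
		nr_io_queues += 1;
		if (nr_io_queues < *count)
			*count = nr_io_queues;
	}
	/* The one-line patch: propagate the NVMe status instead of
	 * returning 0, so a caller can tell "degraded, 0 queues"
	 * apart from plain success. */
	return status;
}
```

With the pre-patch `return 0`, a caller that only checks the return value sees success and silently continues with `*count == 0`.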
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
* Re: [PATCH] nvme: do not ignore nvme status in nvme_set_queue_count()
2021-01-21 9:50 [PATCH] nvme: do not ignore nvme status in nvme_set_queue_count() Hannes Reinecke
@ 2021-01-21 20:03 ` Chaitanya Kulkarni
2021-01-21 20:14 ` Keith Busch
1 sibling, 0 replies; 7+ messages in thread
From: Chaitanya Kulkarni @ 2021-01-21 20:03 UTC (permalink / raw)
To: Hannes Reinecke, Sagi Grimberg; +Cc: Keith Busch, Christoph Hellwig, linux-nvme
On 1/21/21 1:54 AM, Hannes Reinecke wrote:
> If the call to nvme_set_queue_count() fails with a status we should
> not ignore it but rather pass it on to the caller.
> It's then up to the transport to decide whether to ignore it
> (like PCI does) or to reset the connection (as would be appropriate
> for fabrics).
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
> drivers/nvme/host/core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index ce1b61519441..ddf32f5b4534 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1486,7 +1486,7 @@ int nvme_set_queue_count(struct nvme_ctrl *ctrl, int *count)
> *count = min(*count, nr_io_queues);
> }
>
> - return 0;
> + return status;
> }
> EXPORT_SYMBOL_GPL(nvme_set_queue_count);
>
Looks good.
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
* Re: [PATCH] nvme: do not ignore nvme status in nvme_set_queue_count()
2021-01-21 9:50 [PATCH] nvme: do not ignore nvme status in nvme_set_queue_count() Hannes Reinecke
2021-01-21 20:03 ` Chaitanya Kulkarni
@ 2021-01-21 20:14 ` Keith Busch
2021-01-22 16:35 ` Hannes Reinecke
1 sibling, 1 reply; 7+ messages in thread
From: Keith Busch @ 2021-01-21 20:14 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: Keith Busch, Sagi Grimberg, linux-nvme, Christoph Hellwig
On Thu, Jan 21, 2021 at 10:50:21AM +0100, Hannes Reinecke wrote:
> If the call to nvme_set_queue_count() fails with a status we should
> not ignore it but rather pass it on to the caller.
> It's then up to the transport to decide whether to ignore it
> (like PCI does) or to reset the connection (as would be appropriate
> for fabrics).
Instead of checking the error, wouldn't checking the number of created
queues be sufficient? What handling difference do you expect to occur
between getting a success with 0 queues, vs getting an error?
* Re: [PATCH] nvme: do not ignore nvme status in nvme_set_queue_count()
2021-01-21 20:14 ` Keith Busch
@ 2021-01-22 16:35 ` Hannes Reinecke
2021-01-22 16:44 ` Keith Busch
0 siblings, 1 reply; 7+ messages in thread
From: Hannes Reinecke @ 2021-01-22 16:35 UTC (permalink / raw)
To: Keith Busch; +Cc: Keith Busch, Sagi Grimberg, linux-nvme, Christoph Hellwig
On 1/21/21 9:14 PM, Keith Busch wrote:
> On Thu, Jan 21, 2021 at 10:50:21AM +0100, Hannes Reinecke wrote:
>> If the call to nvme_set_queue_count() fails with a status we should
>> not ignore it but rather pass it on to the caller.
>> It's then up to the transport to decide whether to ignore it
>> (like PCI does) or to reset the connection (as would be appropriate
>> for fabrics).
>
> Instead of checking the error, wouldn't checking the number of created
> queues be sufficient? What handling difference do you expect to occur
> between getting a success with 0 queues, vs getting an error?
>
The difference is that an error will (re-)start recovery; 0 queues won't.
But the problem here is that nvme_set_queue_count() is being called
during reconnection, i.e. during the recovery process itself.
And this command returned with a timeout, which in any other case is
treated as a fatal error. Plus we have been sending this command
on the admin queue, so a timeout on the admin queue pretty much _is_ a
fatal error. So we should be terminating the current recovery and
reconnect. None of that will happen if we return '0' queues.
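(For illustration, the behavioural difference can be sketched in plain C. All names below are invented stand-ins for a fabrics-style reconnect path, not the actual kernel code:)

```c
#include <stdio.h>

/* Hypothetical mock of the two signalling styles under discussion:
 * an error return aborts the reconnect attempt and re-arms recovery,
 * while "success with 0 queues" leaves the controller limping along
 * with only the admin queue. */

static int nvme_set_queue_count_mock(int nvme_status, int *count)
{
	if (nvme_status > 0)
		*count = 0;
	return nvme_status;	/* pre-patch this was: return 0; */
}

/* Returns 0 when the reconnect "succeeds", nonzero to restart recovery. */
static int reconnect_io_queues(int nvme_status, int *nr_queues)
{
	int ret = nvme_set_queue_count_mock(nvme_status, nr_queues);

	if (ret)
		return ret;	/* tear down, schedule another reconnect */
	if (*nr_queues == 0)
		return 0;	/* admin-queue-only: no recovery triggered */
	return 0;		/* ... go on to set up I/O queues ... */
}
```

With the pre-patch behaviour the first `if (ret)` branch is unreachable for NVMe-level failures, so a timed-out Set Features command never restarts recovery.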
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
* Re: [PATCH] nvme: do not ignore nvme status in nvme_set_queue_count()
2021-01-22 16:35 ` Hannes Reinecke
@ 2021-01-22 16:44 ` Keith Busch
2021-01-26 15:25 ` Hannes Reinecke
0 siblings, 1 reply; 7+ messages in thread
From: Keith Busch @ 2021-01-22 16:44 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: Keith Busch, Sagi Grimberg, linux-nvme, Christoph Hellwig
On Fri, Jan 22, 2021 at 05:35:35PM +0100, Hannes Reinecke wrote:
> On 1/21/21 9:14 PM, Keith Busch wrote:
> > On Thu, Jan 21, 2021 at 10:50:21AM +0100, Hannes Reinecke wrote:
> > > If the call to nvme_set_queue_count() fails with a status we should
> > > not ignore it but rather pass it on to the caller.
> > > It's then up to the transport to decide whether to ignore it
> > > (like PCI does) or to reset the connection (as would be appropriate
> > > for fabrics).
> >
> > Instead of checking the error, wouldn't checking the number of created
> > queues be sufficient? What handling difference do you expect to occur
> > between getting a success with 0 queues, vs getting an error?
> >
> The difference is that an error will (re-)start recovery, 0 queues won't.
> But the problem here is that nvme_set_queue_count() is being called during
> reconnection, ie during the recovery process itself.
> And this command is returned with a timeout, which in any other case is
> being treated as a fatal error. Plus we have been sending this command on
> the admin queue, so a timeout on the admin queue pretty much _is_ a fatal
> error. So we should be terminating the current recovery and reconnect. None
> of that will happen if we return '0' queues.
You should already be getting an error return status if a timeout occurs
for nvme_set_queue_count(), specifically -EINTR. Are you getting success
for some reason?
* Re: [PATCH] nvme: do not ignore nvme status in nvme_set_queue_count()
2021-01-22 16:44 ` Keith Busch
@ 2021-01-26 15:25 ` Hannes Reinecke
2021-01-26 19:06 ` Keith Busch
0 siblings, 1 reply; 7+ messages in thread
From: Hannes Reinecke @ 2021-01-26 15:25 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-nvme, Sagi Grimberg, Keith Busch, Christoph Hellwig
On 1/22/21 5:44 PM, Keith Busch wrote:
> On Fri, Jan 22, 2021 at 05:35:35PM +0100, Hannes Reinecke wrote:
>> On 1/21/21 9:14 PM, Keith Busch wrote:
>>> On Thu, Jan 21, 2021 at 10:50:21AM +0100, Hannes Reinecke wrote:
>>>> If the call to nvme_set_queue_count() fails with a status we should
>>>> not ignore it but rather pass it on to the caller.
>>>> It's then up to the transport to decide whether to ignore it
>>>> (like PCI does) or to reset the connection (as would be appropriate
>>>> for fabrics).
>>>
>>> Instead of checking the error, wouldn't checking the number of created
>>> queues be sufficient? What handling difference do you expect to occur
>>> between getting a success with 0 queues, vs getting an error?
>>>
>> The difference is that an error will (re-)start recovery, 0 queues won't.
>> But the problem here is that nvme_set_queue_count() is being called during
>> reconnection, ie during the recovery process itself.
>> And this command is returned with a timeout, which in any other case is
>> being treated as a fatal error. Plus we have been sending this command on
>> the admin queue, so a timeout on the admin queue pretty much _is_ a fatal
>> error. So we should be terminating the current recovery and reconnect. None
>> of that will happen if we return '0' queues.
>
> You should already be getting an error return status if a timeout occurs
> for nvme_set_queue_count(), specifically -EINTR. Are you getting success
> for some reason?
>
-EINTR (which is returned when 'nvme_req(req)->flags & NVME_REQ_CANCELLED'
is set) will only ever be returned on PCI; fabrics doesn't set this flag,
so we're never getting an -EINTR.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
* Re: [PATCH] nvme: do not ignore nvme status in nvme_set_queue_count()
2021-01-26 15:25 ` Hannes Reinecke
@ 2021-01-26 19:06 ` Keith Busch
0 siblings, 0 replies; 7+ messages in thread
From: Keith Busch @ 2021-01-26 19:06 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: linux-nvme, Sagi Grimberg, Keith Busch, Christoph Hellwig
On Tue, Jan 26, 2021 at 04:25:11PM +0100, Hannes Reinecke wrote:
> On 1/22/21 5:44 PM, Keith Busch wrote:
> > On Fri, Jan 22, 2021 at 05:35:35PM +0100, Hannes Reinecke wrote:
> > > On 1/21/21 9:14 PM, Keith Busch wrote:
> > > > On Thu, Jan 21, 2021 at 10:50:21AM +0100, Hannes Reinecke wrote:
> > > > > If the call to nvme_set_queue_count() fails with a status we should
> > > > > not ignore it but rather pass it on to the caller.
> > > > > It's then up to the transport to decide whether to ignore it
> > > > > (like PCI does) or to reset the connection (as would be appropriate
> > > > > for fabrics).
> > > >
> > > > Instead of checking the error, wouldn't checking the number of created
> > > > queues be sufficient? What handling difference do you expect to occur
> > > > between getting a success with 0 queues, vs getting an error?
> > > >
> > > The difference is that an error will (re-)start recovery, 0 queues won't.
> > > But the problem here is that nvme_set_queue_count() is being called during
> > > reconnection, ie during the recovery process itself.
> > > And this command is returned with a timeout, which in any other case is
> > > being treated as a fatal error. Plus we have been sending this command on
> > > the admin queue, so a timeout on the admin queue pretty much _is_ a fatal
> > > error. So we should be terminating the current recovery and reconnect. None
> > > of that will happen if we return '0' queues.
> >
> > You should already be getting an error return status if a timeout occurs
> > for nvme_set_queue_count(), specifically -EINTR. Are you getting success
> > for some reason?
> >
> -EINTR (which translates to 'nvme_req(req)->flags & NVME_REQ_CANCELLED')
> will only ever be returned on pci; fabrics doesn't set this flag, so we're
> never getting an -EINTR.
Sounds like that's the problem that needs to be fixed.