From: Hannes Reinecke <hare@suse.de>
To: Sagi Grimberg <sagi@grimberg.me>, Christoph Hellwig <hch@lst.de>
Cc: Keith Busch <kbusch@kernel.org>, linux-nvme@lists.infradead.org
Subject: Re: [PATCH 1/3] nvme-tcp: spurious I/O timeout under high load
Date: Tue, 24 May 2022 10:08:50 +0200	[thread overview]
Message-ID: <76475e4f-13c7-2e0c-8584-f46918f5cefa@suse.de> (raw)
In-Reply-To: <02805f44-6f2d-b12e-c224-d44616332d5a@grimberg.me>

On 5/24/22 09:57, Sagi Grimberg wrote:
> 
> >>>>>> I'm open to discussing what we should be doing when the request is
> >>>>>> in the process of being sent. But when it never had a chance to be
> >>>>>> sent because we just overloaded our internal queuing, we shouldn't
> >>>>>> be sending timeouts.
>>>>>
> >>>>> As mentioned above, what happens if that same reporter opens
> >>>>> another bug saying that the same phenomenon happens with soft-iwarp?
> >>>>> What would you tell him/her?
>>>>
>>>> Nope. It's a HW appliance. Not a chance to change that.
>>>
>>> It was just a theoretical question.
>>>
> >>> Do note that I'm not against solving a problem for anyone; I'm just
> >>> questioning whether making the io_timeout effectively unbounded when
> >>> the network is congested is the right solution for everyone, rather
> >>> than a particular case that can easily be solved with a udev rule
> >>> setting io_timeout as high as needed.
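
(For reference, such a udev rule might look roughly like the following;
the match pattern and the value are only an illustration, not a tested
rule. /sys/block/<disk>/queue/io_timeout is in milliseconds.)

  # Illustration only: raise the per-namespace I/O timeout to 10 minutes
  # for all NVMe disks; pick whatever value fits the deployment.
  ACTION=="add|change", SUBSYSTEM=="block", KERNEL=="nvme*", \
      ENV{DEVTYPE}=="disk", ATTR{queue/io_timeout}="600000"
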
>>>
> >>> One can argue that this patchset makes nvme-tcp basically
> >>> ignore the device io_timeout in certain cases.
>>
>> Oh, yes, sure, that will happen.
>> What I'm actually arguing about is the imprecise distinction between
>> returning BLK_STS_AGAIN / BLK_STS_RESOURCE from ->queue_rq() and
>> letting commands time out when the driver implementing ->queue_rq()
>> runs into a resource constraint.
>>
>> If there is a resource constraint the driver is free to return
>> BLK_STS_RESOURCE (in which case you wouldn't see a timeout) or accept
>> the request (in which case there will be a timeout).
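
(To illustrate the first option, here is a hand-wavy sketch with made-up
names, not the actual nvme-tcp code: if ->queue_rq() notices that its
internal send path is congested, it can push back instead of accepting
the request, and blk-mq will requeue the request and re-run the hw queue
later instead of arming a timeout.)

  /* Sketch only; nvme_tcp_queue_congested() is a hypothetical helper. */
  static blk_status_t nvme_tcp_queue_rq(struct blk_mq_hw_ctx *hctx,
                                        const struct blk_mq_queue_data *bd)
  {
          struct nvme_tcp_queue *queue = hctx->driver_data;

          /* Push back: no timeout is armed, blk-mq retries later. */
          if (nvme_tcp_queue_congested(queue))
                  return BLK_STS_RESOURCE;

          /* ... normal command setup and queueing of bd->rq as today ... */
          return BLK_STS_OK;
  }
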
> 
> There is no resource constraint. The driver sizes up the resources
> to be able to queue all the requests it is getting.
> 
>> I could live with a timeout if that would just result in the command 
>> being retried. But in the case of nvme it results in a connection 
>> reset to boot, making customers really nervous that their system is 
>> broken.
> 
> But how does the driver know that it is running in an environment that
> is completely congested? What I'm saying is that this is a specific use
> case, and the solution can have negative side-effects for other, more
> common use-cases, because it is beyond the scope of the driver to handle.
> 
> We can also trigger this condition with nvme-rdma.
> 
> We could stay with this patch, but I'd argue that this might be the
> wrong thing to do in certain use-cases.
> 
Right, okay.

Arguably this is a workload corner case, and we might not want to fix 
this in the driver.

_However_: do we need to do a controller reset in this case?
Shouldn't it be sufficient to just complete the command with a timeout
error and be done with it?
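
Something along these lines, perhaps; a rough, untested sketch, not the
current nvme_tcp_timeout() code, and nvme_tcp_request_sent() is a made-up
helper meaning "the request actually went out on the wire":

  static enum blk_eh_timer_return nvme_tcp_timeout(struct request *rq,
                                                   bool reserved)
  {
          struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);

          /*
           * If the request never made it onto the wire, there is nothing
           * for the controller to abort: fail the command locally instead
           * of escalating to error recovery and a controller reset.
           */
          if (!nvme_tcp_request_sent(req)) {
                  nvme_req(rq)->status = NVME_SC_HOST_ABORTED_CMD;
                  nvme_req(rq)->flags |= NVME_REQ_CANCELLED;
                  blk_mq_complete_request(rq);
                  return BLK_EH_DONE;
          }

          /* Request is in flight: keep the current error recovery path. */
          nvme_tcp_error_recovery(&req->queue->ctrl->ctrl);
          return BLK_EH_RESET_TIMER;
  }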

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



Thread overview: 24+ messages
2022-05-19  6:26 [PATCH 0/3] nvme-tcp: queue stalls under high load Hannes Reinecke
2022-05-19  6:26 ` [PATCH 1/3] nvme-tcp: spurious I/O timeout " Hannes Reinecke
2022-05-20  9:05   ` Sagi Grimberg
2022-05-23  8:42     ` Hannes Reinecke
2022-05-23 13:36       ` Sagi Grimberg
2022-05-23 14:01         ` Hannes Reinecke
2022-05-23 15:05           ` Sagi Grimberg
2022-05-23 16:07             ` Hannes Reinecke
2022-05-24  7:57               ` Sagi Grimberg
2022-05-24  8:08                 ` Hannes Reinecke [this message]
2022-05-24  8:53                   ` Sagi Grimberg
2022-05-24  9:34                     ` Hannes Reinecke
2022-05-24  9:58                       ` Sagi Grimberg
2022-05-19  6:26 ` [PATCH 2/3] nvme-tcp: Check for write space before queueing requests Hannes Reinecke
2022-05-20  9:17   ` Sagi Grimberg
2022-05-20 10:05     ` Hannes Reinecke
2022-05-21 20:01       ` Sagi Grimberg
2022-05-19  6:26 ` [PATCH 3/3] nvme-tcp: send quota for nvme_tcp_send_all() Hannes Reinecke
2022-05-20  9:19   ` Sagi Grimberg
2022-05-20  9:59     ` Hannes Reinecke
2022-05-21 20:02       ` Sagi Grimberg
2022-05-20  9:20 ` [PATCH 0/3] nvme-tcp: queue stalls under high load Sagi Grimberg
2022-05-20 10:01   ` Hannes Reinecke
2022-05-21 20:03     ` Sagi Grimberg
