From: Keith Busch <kbusch@kernel.org>
To: Samuel Jones <sjones@kalrayinc.com>
Cc: linux-nvme <linux-nvme@lists.infradead.org>
Subject: Re: Command timeouts with NVMe TCP kernel driver
Date: Mon, 30 Aug 2021 19:37:12 -0700
Message-ID: <20210831023712.GA4046@dhcp-10-100-145-180.wdc.com>
In-Reply-To: <1058345072.5039165.1630330229081.JavaMail.zimbra@kalray.eu>

On Mon, Aug 30, 2021 at 03:30:29PM +0200, Samuel Jones wrote:
> Hi all, 
> 
> I'm experiencing command timeouts with recent versions of the nvme-tcp driver. I have a setup with a VM running 5.8 which is OK, but the same test load (100% read, 16K blocks) on a VM running 5.11 or later, including 5.14-rc7, shows the issue. The initiator complains as follows:
> 
> Aug 30 14:58:05 centos7 kernel: nvme nvme0: queue 7: timeout request 0x10 type 4 
> Aug 30 14:58:05 centos7 kernel: nvme nvme0: starting error recovery 
> 
> My target is a Kalray target, but I have no trace of any outstanding commands when this situation occurs. Quite the opposite: the Kalray board observes that the initiator stops sending new requests on the queue some time before this command times out.
> 
> I don't have any similar issues using SPDK as an initiator. I made the following modification to the nvme-tcp code and my problem disappeared:
> 
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -287,6 +287,7 @@ static inline void nvme_tcp_queue_request(struct nvme_tcp_request *req,
>  	 * directly, otherwise queue io_work. Also, only do that if we
>  	 * are on the same cpu, so we don't introduce contention.
>  	 */
> +#if 0
>  	if (queue->io_cpu == raw_smp_processor_id() &&
>  	    sync && empty && mutex_trylock(&queue->send_mutex)) {
>  		queue->more_requests = !last;
> @@ -296,6 +297,9 @@ static inline void nvme_tcp_queue_request(struct nvme_tcp_request *req,
>  	} else if (last) {
>  		queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
>  	}
> +#else
> +	queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
> +#endif
>  }
> 
> To be honest, this bit of code has always bothered me: I don't understand how we can guarantee that the thread doesn't change CPU between the call to raw_smp_processor_id() and the trylock (sketched below). I assume that even if this does occur, the fact that we hold queue->send_mutex is supposed to ensure there are no atomicity issues, but I'm concerned about the logic that tries to determine whether there is any more pending work and reschedules io_work if and only if there is something to do. I can't pinpoint an issue from just reading the code, though.
> 
> Does anyone have any thoughts on this? Is my patch proof that there is an issue in this code, or am I changing the program flow so significantly that we can't conclude anything? Are there any open issues wrt command timeouts at the moment?
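
A minimal userspace sketch of the window described in the quoted message, assuming nothing beyond POSIX threads and glibc's sched_getcpu(): io_cpu, send_mutex, and the printouts below are illustrative stand-ins for queue->io_cpu, queue->send_mutex, and the driver's inline-send versus io_work paths; none of this is tcp.c code.

/* cpu_window.c - build with: gcc -pthread cpu_window.c */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

static pthread_mutex_t send_mutex = PTHREAD_MUTEX_INITIALIZER;

int main(void)
{
	int io_cpu = 0;			/* stand-in for queue->io_cpu */
	int cpu = sched_getcpu();	/* snapshot of the current CPU */

	/*
	 * The scheduler is free to migrate this thread right here,
	 * between the CPU snapshot and the trylock, so the comparison
	 * below can succeed for a CPU we no longer run on. The mutex
	 * still serializes senders either way; only the "same CPU"
	 * contention-avoidance heuristic can go stale.
	 */
	if (cpu == io_cpu && pthread_mutex_trylock(&send_mutex) == 0) {
		printf("sending inline on (believed) cpu %d\n", cpu);
		pthread_mutex_unlock(&send_mutex);
	} else {
		printf("deferring to the worker bound to cpu %d\n", io_cpu);
	}
	return 0;
}

As the quoted analysis already suspects, a stale CPU snapshot alone should not break correctness, since the mutex still serializes the send path; it only defeats the heuristic, which is consistent with the reply below.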

I've seen similar reports, but I thought we'd fixed all those issues. Your
proof-of-concept patch indicates to me that there is still at least one
more condition that's not being considered. I don't think the possibility
of CPU rescheduling is the culprit, though.

One thing that looks problematic is if 'bd.last' is true but
nvme_tcp_queue_rq() fails early, after previous commands in the sequence
have already been enqueued. That could cause the io_work to never be
kicked. There should have been something else in dmesg indicating that
error, though, so I'll look for some other possibility.
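
To make that hypothesis concrete, here is a toy userspace model, not blk-mq or driver code: queue_rq(), kick_io_work(), and the fail_early flag are hypothetical stand-ins for nvme_tcp_queue_rq(), the queue_work_on() kick, and an error taken before nvme_tcp_queue_request() runs.

/* last_kick.c - only the request flagged "last" kicks the worker. */
#include <stdbool.h>
#include <stdio.h>

static int pending;	/* requests enqueued but not yet kicked */

static void kick_io_work(void)
{
	printf("io_work kicked, flushing %d request(s)\n", pending);
	pending = 0;
}

static int queue_rq(bool last, bool fail_early)
{
	if (fail_early)
		return -1;	/* error before the enqueue-and-kick step */
	pending++;
	if (last)
		kick_io_work();	/* only the bd.last request kicks io_work */
	return 0;
}

int main(void)
{
	queue_rq(false, false);		/* cmd 1: queued, no kick */
	queue_rq(false, false);		/* cmd 2: queued, no kick */
	if (queue_rq(true, true) < 0)	/* last cmd fails early... */
		printf("%d request(s) stranded, will time out\n", pending);
	return 0;
}

Run as-is, this reports two stranded requests: queued work with no io_work ever scheduled, which matches the symptom of commands timing out on the initiator while the target sees nothing outstanding.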

Thread overview: 19+ messages
2021-08-30 13:30 Command timeouts with NVMe TCP kernel driver Samuel Jones
2021-08-31  2:37 ` Keith Busch [this message]
2021-08-31  2:44   ` Keith Busch
2021-09-01 12:56     ` Sagi Grimberg
2021-08-31 22:46   ` Samuel Jones
2021-09-01 13:21     ` Sagi Grimberg
2021-09-07  9:05       ` Samuel Jones
2021-09-07 16:42         ` Keith Busch
2021-09-07 20:46           ` Keith Busch
2021-09-08 10:29             ` Samuel Jones
2021-09-08 20:55             ` Sagi Grimberg
2021-09-08 21:50               ` Keith Busch
2021-09-09  6:36                 ` Sagi Grimberg
2021-09-09  8:01                   ` Sagi Grimberg
2021-09-09  8:09                     ` Samuel Jones
2021-09-09  8:54                       ` Sagi Grimberg
2021-09-13  7:44                         ` Samuel Jones
2021-09-13  9:15                           ` Sagi Grimberg
2021-09-13 14:10                             ` Keith Busch
