From: Keith Busch <kbusch@kernel.org>
To: Samuel Jones <sjones@kalrayinc.com>
Cc: linux-nvme <linux-nvme@lists.infradead.org>
Subject: Re: Command timeouts with NVMe TCP kernel driver
Date: Mon, 30 Aug 2021 19:37:12 -0700	[thread overview]
Message-ID: <20210831023712.GA4046@dhcp-10-100-145-180.wdc.com> (raw)
In-Reply-To: <1058345072.5039165.1630330229081.JavaMail.zimbra@kalray.eu>

On Mon, Aug 30, 2021 at 03:30:29PM +0200, Samuel Jones wrote:
> Hi all, 
> 
> I'm experiencing command timeouts with recent versions of the nvme-tcp driver. My setup with a VM running 5.8 is fine, but the same test load (100% reads, 16K blocks) on a VM running 5.11 or later, including 5.14-rc7, shows the issue. The initiator complains as follows: 
> 
> Aug 30 14:58:05 centos7 kernel: nvme nvme0: queue 7: timeout request 0x10 type 4 
> Aug 30 14:58:05 centos7 kernel: nvme nvme0: starting error recovery 
> 
> My target is a Kalray target, but I have no trace of any outstanding commands when this situation occurs. Quite the opposite: The Kalray board observes that the initiator stops sending new requests on the queue some time before this command times out. 
> 
> I don't have any similar issues using SPDK as an initiator. I made the following modification to the nvme-tcp code and my problem has disappeared: 
> 
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -287,6 +287,7 @@ static inline void nvme_tcp_queue_request(struct nvme_tcp_request *req,
>  	 * directly, otherwise queue io_work. Also, only do that if we
>  	 * are on the same cpu, so we don't introduce contention.
>  	 */
> +#if 0
>  	if (queue->io_cpu == raw_smp_processor_id() &&
>  	    sync && empty && mutex_trylock(&queue->send_mutex)) {
>  		queue->more_requests = !last;
> @@ -296,6 +297,9 @@ static inline void nvme_tcp_queue_request(struct nvme_tcp_request *req,
>  	} else if (last) {
>  		queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
>  	}
> +#else
> +	queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
> +#endif
>  }
> 
> To be honest, this bit of code has always bothered me: I don't understand how we can guarantee that the thread doesn't change CPU between the call to raw_smp_processor_id() and the trylock. I assume that even if this does occur, the fact that we hold queue->send_mutex is supposed to ensure there are no atomicity issues, but I'm concerned about the logic that tries to determine whether there is any more pending work and reschedules io_work if and only if there is something to do. I can't pinpoint an issue from just reading the code, though. 
> 
> Does anyone have any thoughts on this? Is my patch proof that there is an issue in this code, or am I likely to be changing the program flow so significantly that we can't conclude? Are there any open issues wrt command timeouts at the moment? 
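
For reference, the function under discussion looks roughly like this around v5.14. This is a paraphrased sketch with explanatory comments added, not the verbatim upstream source; see drivers/nvme/host/tcp.c for the authoritative code:

static inline void nvme_tcp_queue_request(struct nvme_tcp_request *req,
		bool sync, bool last)
{
	struct nvme_tcp_queue *queue = req->queue;
	bool empty;

	/* lockless enqueue; 'empty' means no other request is pending */
	empty = llist_add(&req->lentry, &queue->req_list) &&
		list_empty(&queue->send_list) && !queue->request;

	/*
	 * If we are already on the queue's io_cpu and nothing else is
	 * pending, try to send inline; otherwise defer to io_work. The CPU
	 * comparison is only an optimization hint -- serialization of the
	 * actual sending is meant to come from send_mutex.
	 */
	if (queue->io_cpu == raw_smp_processor_id() &&
	    sync && empty && mutex_trylock(&queue->send_mutex)) {
		queue->more_requests = !last;
		nvme_tcp_send_all(queue);
		queue->more_requests = false;
		mutex_unlock(&queue->send_mutex);
	} else if (last) {
		/* only the last request of a batch kicks io_work */
		queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
	}
}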

I've seen similar, but I thought we'd fixed all those issues. Your PoC
patch indicates to me that there is still at least one other condition
that's not being considered. I don't think the possibility of CPU
rescheduling is the culprit, though.

One thing that looks problematic is if 'bd.last' is true but
nvme_tcp_queue_rq() fails early after enqueuing previous commands in the
sequence. That could cause the io_work to never kick. There should have
been something else in dmesg indicating that error, though, so I'll look
for some other possibility.
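
A hypothetical sketch of that scenario, loosely modeled on nvme_tcp_queue_rq() in drivers/nvme/host/tcp.c (simplified, not the exact upstream code; nvme_tcp_setup_cmd_pdu() stands in here for any step that can fail before the request is queued):

static blk_status_t nvme_tcp_queue_rq(struct blk_mq_hw_ctx *hctx,
		const struct blk_mq_queue_data *bd)
{
	struct nvme_ns *ns = hctx->queue->queuedata;
	struct request *rq = bd->rq;
	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
	blk_status_t ret;

	/*
	 * Earlier requests of this blk-mq batch were submitted with
	 * bd->last == false, so nvme_tcp_queue_request() only put them on
	 * queue->req_list without kicking io_work.
	 */
	ret = nvme_tcp_setup_cmd_pdu(ns, rq);
	if (unlikely(ret))
		/*
		 * If the request carrying bd->last == true fails here, we
		 * return before nvme_tcp_queue_request() is ever called, so
		 * nobody runs queue_work_on() for the already-enqueued
		 * requests: io_work never kicks and those commands time out.
		 */
		return ret;

	blk_mq_start_request(rq);

	/* only reached on success; 'last' is what schedules io_work */
	nvme_tcp_queue_request(req, true, bd->last);

	return BLK_STS_OK;
}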


Thread overview: 19+ messages
2021-08-30 13:30 Command timeouts with NVMe TCP kernel driver Samuel Jones
2021-08-31  2:37 ` Keith Busch [this message]
2021-08-31  2:44   ` Keith Busch
2021-09-01 12:56     ` Sagi Grimberg
2021-08-31 22:46   ` Samuel Jones
2021-09-01 13:21     ` Sagi Grimberg
2021-09-07  9:05       ` Samuel Jones
2021-09-07 16:42         ` Keith Busch
2021-09-07 20:46           ` Keith Busch
2021-09-08 10:29             ` Samuel Jones
2021-09-08 20:55             ` Sagi Grimberg
2021-09-08 21:50               ` Keith Busch
2021-09-09  6:36                 ` Sagi Grimberg
2021-09-09  8:01                   ` Sagi Grimberg
2021-09-09  8:09                     ` Samuel Jones
2021-09-09  8:54                       ` Sagi Grimberg
2021-09-13  7:44                         ` Samuel Jones
2021-09-13  9:15                           ` Sagi Grimberg
2021-09-13 14:10                             ` Keith Busch
