Linux-Block Archive on lore.kernel.org
 help / color / Atom feed
From: Mike Christie <mchristi@redhat.com>
To: josef@toxicpanda.com, linux-block@vger.kernel.org
Cc: Mike Christie <mchristi@redhat.com>
Subject: [PATCH 4/4] nbd: fix zero cmd timeout handling
Date: Fri,  9 Aug 2019 16:26:10 -0500
Message-ID: <20190809212610.19412-5-mchristi@redhat.com> (raw)
In-Reply-To: <20190809212610.19412-1-mchristi@redhat.com>

This fixes a regression added in 4.9 with commit:

commit 0eadf37afc2500e1162c9040ec26a705b9af8d47
Author: Josef Bacik <jbacik@fb.com>
Date:   Thu Sep 8 12:33:40 2016 -0700

    nbd: allow block mq to deal with timeouts

where before the patch userspace would set the timeout to 0 to disable
it. With the above patch, a zero timeout tells the block layer to use
the default value of 30 seconds. For setups where commands can take a
long time or experience transient issues like network disruptions this
then results in IO errors being sent to the application.

To fix this, the patch still uses the common block layer timeout
framework, but if zero is set, nbd just logs a message and then resets
the timer when it expires.

Josef,

I did not cc stable, but I think we want to port the patches to some
releases. We originally hit this with users using the longterm kernels
with ceph. The patch does not apply anywhere cleanly with older ones
like 4.9, so I was not sure how we wanted to handle it.

Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/block/nbd.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 93294000ce55..8459adce713a 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -120,6 +120,7 @@ struct nbd_cmd {
 	struct mutex lock;
 	int index;
 	int cookie;
+	int retries;
 	blk_status_t status;
 	unsigned long flags;
 	u32 cmd_cookie;
@@ -405,10 +406,25 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 			nbd_config_put(nbd);
 			return BLK_EH_DONE;
 		}
-	} else {
-		dev_err_ratelimited(nbd_to_dev(nbd),
-				    "Connection timed out\n");
 	}
+
+	if (!nbd->tag_set.timeout) {
+		/*
+		 * Userspace sets timeout=0 to disable socket disconnection,
+		 * so just warn and reset the timer.
+		 */
+		cmd->retries++;
+		dev_info(nbd_to_dev(nbd), "Possible stuck request %p: control (%s@%llu,%uB). Runtime %u seconds\n",
+			req, nbdcmd_to_ascii(req_to_nbd_cmd_type(req)),
+			(unsigned long long)blk_rq_pos(req) << 9,
+			blk_rq_bytes(req), (req->timeout / HZ) * cmd->retries);
+
+		mutex_unlock(&cmd->lock);
+		nbd_config_put(nbd);
+		return BLK_EH_RESET_TIMER;
+	}
+
+	dev_err_ratelimited(nbd_to_dev(nbd), "Connection timed out\n");
 	set_bit(NBD_TIMEDOUT, &config->runtime_flags);
 	cmd->status = BLK_STS_IOERR;
 	mutex_unlock(&cmd->lock);
@@ -527,6 +543,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
 	}
 	cmd->index = index;
 	cmd->cookie = nsock->cookie;
+	cmd->retries = 0;
 	request.type = htonl(type | nbd_cmd_flags);
 	if (type != NBD_CMD_FLUSH) {
 		request.from = cpu_to_be64((u64)blk_rq_pos(req) << 9);
@@ -1253,7 +1270,8 @@ static bool nbd_is_valid_blksize(unsigned long blksize)
 static void nbd_set_cmd_timeout(struct nbd_device *nbd, u64 timeout)
 {
 	nbd->tag_set.timeout = timeout * HZ;
-	blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ);
+	if (timeout)
+		blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ);
 }
 
 /* Must be called with config_lock held */
-- 
2.20.1


  parent reply index

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-09 21:26 [PATCH 0/4] nbd: cmd timeout fixes Mike Christie
2019-08-09 21:26 ` [PATCH 1/4] nbd: add set cmd timeout helper Mike Christie
2019-08-09 21:32   ` Mike Christie
2019-08-13 13:11   ` Josef Bacik
2019-08-09 21:26 ` [PATCH 2/4] nbd: add function to convert blk req op to nbd cmd Mike Christie
2019-08-13 13:11   ` Josef Bacik
2019-08-09 21:26 ` [PATCH 3/4] nbd: add missing config put Mike Christie
2019-08-13 13:11   ` Josef Bacik
2019-08-09 21:26 ` Mike Christie [this message]
2019-08-13 13:13   ` [PATCH 4/4] nbd: fix zero cmd timeout handling Josef Bacik
2019-08-13 15:45     ` Mike Christie
2019-08-13 15:54       ` Mike Christie
2019-08-13 15:59         ` Mike Christie
2019-08-13 16:15       ` Josef Bacik

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190809212610.19412-5-mchristi@redhat.com \
    --to=mchristi@redhat.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-block@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Block Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-block/0 linux-block/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-block linux-block/ https://lore.kernel.org/linux-block \
		linux-block@vger.kernel.org
	public-inbox-index linux-block

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-block


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git