linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mike Christie <mchristi@redhat.com>
To: Josef Bacik <josef@toxicpanda.com>
Cc: linux-block@vger.kernel.org
Subject: Re: [PATCH 4/4] nbd: fix zero cmd timeout handling
Date: Tue, 13 Aug 2019 10:54:15 -0500	[thread overview]
Message-ID: <5D52DD27.6090202@redhat.com> (raw)
In-Reply-To: <5D52DB33.8070307@redhat.com>

On 08/13/2019 10:45 AM, Mike Christie wrote:
> On 08/13/2019 08:13 AM, Josef Bacik wrote:
>> On Fri, Aug 09, 2019 at 04:26:10PM -0500, Mike Christie wrote:
>>> This fixes a regression added in 4.9 with commit:
>>>
>>> commit 0eadf37afc2500e1162c9040ec26a705b9af8d47
>>> Author: Josef Bacik <jbacik@fb.com>
>>> Date:   Thu Sep 8 12:33:40 2016 -0700
>>>
>>>     nbd: allow block mq to deal with timeouts
>>>
>>> where before the patch userspace would set the timeout to 0 to disable
>>> it. With the above patch, a zero timeout tells the block layer to use
>>> the default value of 30 seconds. For setups where commands can take a
>>> long time or experience transient issues like network disruptions this
>>> then results in IO errors being sent to the application.
>>>
>>> To fix this, the patch still uses the common block layer timeout
>>> framework, but if zero is set, nbd just logs a message and then resets
>>> the timer when it expires.
>>>
>>> Josef,
>>>
>>> I did not cc stable, but I think we want to port the patches to some
>>> releases. We originally hit this with users using the longterm kernels
>>> with ceph. The patch does not apply anywhere cleanly with older ones
>>> like 4.9, so I was not sure how we wanted to handle it.
>>>
>>
>> I assume you tested this?  IIRC there was a problem where 0 really meant 0 and
> 
> Yes.
> 
>> commands would insta-timeout.  But my memory is foggy here, so I'm not sure if
>> it was setting the tag_set timeout to 0 that made things go wrong, or what.  Or
>> I could be making it all up, who knows.
> 
> Yes, if you call blk_queue_rq_timeout with 0, then the command will
> timeout almost immediately. I added a check for this in the first patch.
> 
> If blk_mq_tag_set.timeout is 0, blk_mq_init_allocated_queue uses the
> default 30 second value.
> 
> So with the patch if the user sets the timeout to 0, then we will just
> log a message every 30 seconds that the command is stuck.
> 
>>
>> There's a blktest that just runs fio on a normal device with no timeouts or
>> anything, that's where I would see the problem since it was a little racy.
>> Basically have the timeout set to 0 and put load on the disk and eventually
>> you'd start seeing timeouts.  If that all goes fine then you can add

Oh yeah just to be clear that is another issue that you can hit with any
driver.

If a app/user sets the timeout value in sysfs:

/sys/block/$dev/queue/io_timeout

then it bypasses the driver completely because it just does

queue_io_timeout_store -> blk_queue_rq_timeout

and that function/interface lets you set the timeout to anything.

My patches just fix up the nbd interface that existing tools were using
and hitting regressions with.

I was debating about sending a patch for not allowing

blk_queue_rq_timeout(q, 9)

in a separate patchset, but I was not sure if people use that for
testing fast timeouts.

  reply	other threads:[~2019-08-13 15:54 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-09 21:26 [PATCH 0/4] nbd: cmd timeout fixes Mike Christie
2019-08-09 21:26 ` [PATCH 1/4] nbd: add set cmd timeout helper Mike Christie
2019-08-09 21:32   ` Mike Christie
2019-08-13 13:11   ` Josef Bacik
2019-08-09 21:26 ` [PATCH 2/4] nbd: add function to convert blk req op to nbd cmd Mike Christie
2019-08-13 13:11   ` Josef Bacik
2019-08-09 21:26 ` [PATCH 3/4] nbd: add missing config put Mike Christie
2019-08-13 13:11   ` Josef Bacik
2019-08-09 21:26 ` [PATCH 4/4] nbd: fix zero cmd timeout handling Mike Christie
2019-08-13 13:13   ` Josef Bacik
2019-08-13 15:45     ` Mike Christie
2019-08-13 15:54       ` Mike Christie [this message]
2019-08-13 15:59         ` Mike Christie
2019-08-13 16:15       ` Josef Bacik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5D52DD27.6090202@redhat.com \
    --to=mchristi@redhat.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-block@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).