From: "Elliott, Robert (Server Storage)" <Elliott@hp.com>
To: Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@fb.com>
Cc: "linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: blk-mq timeout handling fixes
Date: Wed, 17 Sep 2014 21:53:50 +0000
Message-ID: <94D0CD8314A33A4D9D801C0FE68B402958C86C8E@G9W0745.americas.hpqcorp.net>
In-Reply-To: <1410651613-1993-1-git-send-email-hch@lst.de>



> -----Original Message-----
> From: Christoph Hellwig [mailto:hch@lst.de]
> Sent: Saturday, 13 September, 2014 6:40 PM
> To: Jens Axboe
> Cc: Elliott, Robert (Server Storage); linux-scsi@vger.kernel.org;
> linux-kernel@vger.kernel.org
> Subject: blk-mq timeout handling fixes
> 
> This series fixes various issues with timeout handling that Robert
> ran into when testing scsi-mq heavily.  He tested an earlier version,
> and couldn't reproduce the issues anymore, although the series changed
> quite significantly since and should probably be retested.
> 
> In summary we now only start the blk-mq timer inside the driver's
> ->queue_rq method after the request has been fully set up, and we
> also tell the drivers whether we're timing out a reserved (internal)
> request or a real one.  Many drivers will need to handle
> those internal ones differently, e.g. for scsi-mq we don't even
> have a scsi command structure allocated for the reserved commands.
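
To illustrate why the reserved flag matters, here is a minimal user-space
sketch, not kernel code. All names (fake_request, fake_cmd, fake_timeout,
the FAKE_* return values) are hypothetical stand-ins and not the real
blk-mq or SCSI API. It only models the idea that a timeout handler told
"this request is reserved" can avoid touching a command structure that
was never allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-request command structure. */
struct fake_cmd {
    int id;
    bool aborted;
};

/* Hypothetical stand-in for a request; not the real struct request. */
struct fake_request {
    int tag;
    struct fake_cmd *special;   /* NULL for reserved (internal) requests */
};

enum fake_timeout_action { FAKE_RESET_TIMER, FAKE_ABORT_CMD };

/*
 * Timeout callback that is told whether the timed-out request is a
 * reserved one.  Reserved requests have no command attached, so the
 * handler must return before dereferencing rq->special.
 */
static enum fake_timeout_action
fake_timeout(struct fake_request *rq, bool reserved)
{
    assert(rq != NULL);

    if (reserved)
        return FAKE_RESET_TIMER;    /* nothing to abort, just re-arm */

    /* A real request is expected to carry a command. */
    assert(rq->special != NULL);
    rq->special->aborted = true;
    return FAKE_ABORT_CMD;
}
```

Without the flag, the handler in this model would have no safe way to
tell the two cases apart other than checking the pointer itself.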

I have rerun a variety of tests on:
* Jens' for-next tree that went into 3.17-rc5
* plus this series
* plus two patches for infinite recursion on flushes from 
  Ming and then Christoph

and have not been able to trigger the scsi_times_out req->special
NULL pointer dereference that prompted this series.

Testing includes:
* concurrent heavy workload generators:
  * fio high iodepth direct 512 byte random reads (> 1M IOPS)
  * programs generating large bursts of paged writes
    * mkfs.ext4 (followed by e2fsck)
    * mkfs.xfs (followed by xfs_check)
    * ddpt
  * watch -n 0 sync to generate flushes
* scsi_logging_level MLCOMPLETE set to 0 or 1
  * scsi_lib.c patched to put all the ACTION_FAIL messages
    under level 1 so they can be squelched (massive error 
    prints cause more timeouts themselves)
* 4 hpsa and 16 mpt3sas devices (all made from SAS SSDs)
  * lockless hpsa driver
* injecting errors
  * device removal
  * device generating infinite errors
  * device generating a brief burst of errors
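
For reference, the MLCOMPLETE setting in the list above is one field of
the packed scsi_logging_level word. The sketch below shows how such a
field is set and read back. The shift (12) and width (3 bits) are what I
recall from drivers/scsi/scsi_logging.h and should be treated as
assumptions to verify against your tree. The helper names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout; check drivers/scsi/scsi_logging.h in your tree. */
#define MLCOMPLETE_SHIFT 12u
#define MLCOMPLETE_BITS  3u
#define MLCOMPLETE_MASK  (((1u << MLCOMPLETE_BITS) - 1u) << MLCOMPLETE_SHIFT)

/* Return 'word' with the MLCOMPLETE field replaced by 'level'. */
static uint32_t set_mlcomplete(uint32_t word, uint32_t level)
{
    assert(level < (1u << MLCOMPLETE_BITS));
    return (word & ~MLCOMPLETE_MASK) | (level << MLCOMPLETE_SHIFT);
}

/* Extract the MLCOMPLETE level from a packed logging word. */
static uint32_t get_mlcomplete(uint32_t word)
{
    return (word & MLCOMPLETE_MASK) >> MLCOMPLETE_SHIFT;
}
```

Under these assumptions, MLCOMPLETE level 1 with every other field at 0
corresponds to a logging word of 0x1000.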

The filesystems don't always recover properly, but nothing in 
the block or scsi midlayers crashed.

So, you may add this to the series:
Tested-by: Robert Elliott <elliott@hp.com>

---
Rob Elliott    HP Server Storage




