From: "Martin K. Petersen" <martin.petersen@oracle.com>
To: Damien Le Moal <Damien.LeMoal@wdc.com>
Cc: Tim Walker <tim.t.walker@seagate.com>,
	Ming Lei <ming.lei@redhat.com>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	linux-scsi <linux-scsi@vger.kernel.org>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>
Subject: Re: [LSF/MM/BPF TOPIC] NVMe HDD
Date: Wed, 12 Feb 2020 22:02:08 -0500
Message-ID: <yq1blq3rxzj.fsf@oracle.com>
In-Reply-To: <BYAPR04MB5816AA843E63FFE2EA1D5D23E71B0@BYAPR04MB5816.namprd04.prod.outlook.com> (Damien Le Moal's message of "Wed, 12 Feb 2020 01:47:53 +0000")


Damien,

> Exposing an HDD through multiple queues, each with a high queue depth,
> is simply asking for trouble. Commands will end up spending so much
> time sitting in the queues that they will time out.

Yep!

> This can already be observed with the smartpqi SAS HBA which exposes
> single drives as multiqueue block devices with high queue depth.
> Exercising these drives heavily leads to thousands of commands being
> queued and to timeouts. It is fairly easy to trigger this without a
> manual change to the QD. This has been on my to-do list of fixes for
> some time now (lacking time to do it).

Controllers that queue internally are very susceptible to application or
filesystem timeouts when drives are struggling to keep up.
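
A rough back-of-the-envelope sketch of why deep queues in front of an HDD
lead to timeouts. The numbers are assumptions for illustration only: about
200 random IOPS for the drive, a 30-second command timeout, and every queue
filled to its depth. The function names are hypothetical.

```python
def worst_case_wait_s(num_queues: int, queue_depth: int, iops: float) -> float:
    """Seconds until the last queued command is serviced, assuming every
    queue is full and the drive completes `iops` commands per second."""
    if num_queues < 1 or queue_depth < 1 or iops <= 0:
        raise ValueError("queues, depth and iops must be positive")
    outstanding = num_queues * queue_depth
    return outstanding / iops


def will_time_out(num_queues: int, queue_depth: int, iops: float,
                  timeout_s: float = 30.0) -> bool:
    """True if the worst-case wait exceeds the (assumed) command timeout."""
    return worst_case_wait_s(num_queues, queue_depth, iops) > timeout_s


# 8 queues x 1024 deep = 8192 commands; at 200 IOPS that is about 41 s.
print(will_time_out(8, 1024, 200.0))   # True
# A single queue pair with QD 256 drains in under 2 s.
print(will_time_out(1, 256, 200.0))    # False
```

The model ignores reordering and command merging inside the drive, so it is
only an upper-bound estimate, but it shows how quickly the total number of
outstanding commands pushes the tail latency past the timeout.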

> NVMe HDDs need to have an interface setup that matches their speed, that
> is, something like a SAS interface: *single* queue pair with a max QD
> of 256 or less depending on what the drive can take. There is no
> TASK_SET_FULL notification on NVMe, so throttling has to come from the
> max QD of the SQ, which the drive will advertise to the host.

At the very minimum we'll need low queue depths. But I have my doubts
whether we can make this work well enough without some kind of TASK SET
FULL style AER to throttle the I/O.
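
To make the throttling idea concrete, here is a toy model of a host-side
queue-depth governor that reacts to a "queue full" signal. It is purely
illustrative: as noted above, NVMe has no such notification today, so the
`on_queue_full` event is hypothetical, and the halve-then-creep-up policy is
an assumption rather than a description of any existing driver.

```python
class QueueDepthThrottle:
    """Toy additive-increase / multiplicative-decrease depth governor."""

    def __init__(self, max_depth: int, min_depth: int = 1) -> None:
        if not 1 <= min_depth <= max_depth:
            raise ValueError("require 1 <= min_depth <= max_depth")
        self.max_depth = max_depth
        self.min_depth = min_depth
        self.depth = max_depth  # start at the advertised maximum

    def on_queue_full(self) -> int:
        """Hypothetical 'device is saturated' event: halve the depth."""
        self.depth = max(self.min_depth, self.depth // 2)
        return self.depth

    def on_success(self) -> int:
        """A command completed normally: ramp up by one, capped at max."""
        self.depth = min(self.max_depth, self.depth + 1)
        return self.depth


throttle = QueueDepthThrottle(max_depth=256)
print(throttle.on_queue_full())  # 128
print(throttle.on_queue_full())  # 64
print(throttle.on_success())     # 65
```

Without a signal of this kind the host can only rely on a static maximum
depth, which is the concern raised above.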

> NVMe specs will need an update to have a "NONROT" (non-rotational) bit in
> the identify data for all this to fit well in the current stack.

Absolutely.

-- 
Martin K. Petersen	Oracle Linux Engineering
