From: Sagi Grimberg <sagi@grimberg.me>
To: Mike Snitzer <snitzer@redhat.com>,
	axboe@kernel.dk, hch@lst.de, emilne@redhat.com,
	james.smart@broadcom.com
Cc: Bart.VanAssche@wdc.com, linux-block@vger.kernel.org,
	dm-devel@redhat.com, linux-nvme@lists.infradead.org
Subject: Re: [dm-devel] [for-4.16 PATCH 4/5] dm mpath: use NVMe error handling to know when an error is retryable
Date: Wed, 20 Dec 2017 22:33:54 +0200	[thread overview]
Message-ID: <e00058b2-d94e-3c4e-b2d9-4b00c1e28d80@grimberg.me> (raw)
In-Reply-To: <20171220165812.GB18255@redhat.com>


> But interestingly, with my "mptest" link failure test
> (test_01_nvme_offline) I'm not actually seeing NVMe trigger a failure
> that needs a multipath layer (be it NVMe multipath or DM multipath) to
> fail a path and retry the IO.  The pattern is that the link goes down,
> and nvme waits for it to come back (internalizing any failure) and then
> the IO continues.. so no multipath _really_ needed:
> 
> [55284.011286] nvme nvme0: NVME-FC{0}: controller connectivity lost. Awaiting Reconnect
> [55284.020078] nvme nvme1: NVME-FC{1}: controller connectivity lost. Awaiting Reconnect
> [55284.028872] nvme nvme2: NVME-FC{2}: controller connectivity lost. Awaiting Reconnect
> [55284.037658] nvme nvme3: NVME-FC{3}: controller connectivity lost. Awaiting Reconnect
> [55295.157773] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
> [55295.157775] nvmet: ctrl 4 keep-alive timer (15 seconds) expired!
> [55295.157778] nvmet: ctrl 3 keep-alive timer (15 seconds) expired!
> [55295.157780] nvmet: ctrl 2 keep-alive timer (15 seconds) expired!
> [55295.157781] nvmet: ctrl 4 fatal error occurred!
> [55295.157784] nvmet: ctrl 3 fatal error occurred!
> [55295.157785] nvmet: ctrl 2 fatal error occurred!
> [55295.199816] nvmet: ctrl 1 fatal error occurred!
> [55304.047540] nvme nvme0: NVME-FC{0}: connectivity re-established. Attempting reconnect
> [55304.056533] nvme nvme1: NVME-FC{1}: connectivity re-established. Attempting reconnect
> [55304.066053] nvme nvme2: NVME-FC{2}: connectivity re-established. Attempting reconnect
> [55304.075037] nvme nvme3: NVME-FC{3}: connectivity re-established. Attempting reconnect
> [55304.373776] nvmet: creating controller 1 for subsystem mptestnqn for NQN nqn.2014-08.org.nvmexpress:uuid:00000000-0000-0000-0000-000000000000.
> [55304.373835] nvmet: creating controller 2 for subsystem mptestnqn for NQN nqn.2014-08.org.nvmexpress:uuid:00000000-0000-0000-0000-000000000000.
> [55304.373873] nvmet: creating controller 3 for subsystem mptestnqn for NQN nqn.2014-08.org.nvmexpress:uuid:00000000-0000-0000-0000-000000000000.
> [55304.373879] nvmet: creating controller 4 for subsystem mptestnqn for NQN nqn.2014-08.org.nvmexpress:uuid:00000000-0000-0000-0000-000000000000.
> [55304.430988] nvme nvme0: NVME-FC{0}: controller reconnect complete
> [55304.433124] nvme nvme3: NVME-FC{3}: controller reconnect complete
> [55304.433705] nvme nvme1: NVME-FC{1}: controller reconnect complete
> 
> It seems if we have multipath on top (again: either NVMe native multipath
> _or_ DM multipath) we'd prefer to have the equivalent of SCSI's
> REQ_FAILFAST_TRANSPORT support?
> 
> But nvme_req_needs_retry() calls blk_noretry_request() which returns
> true if REQ_FAILFAST_TRANSPORT is set.  Which results in
> nvme_req_needs_retry() returning false.  Which causes nvme_complete_rq()
> to skip the multipath specific nvme_req_needs_failover(), etc.
> 
> So all said:
> 
> 1) why wait for connection recovery if we have other connections to try?
> I think NVMe needs to be plumbed for respecting REQ_FAILFAST_TRANSPORT.

This is specific to the FC fail-fast logic; nvme-rdma will fail inflight
commands as soon as the transport sees an error (or the keep-alive
timeout expires).
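
For reference, the pattern looks roughly like this (a simplified sketch
of the 4.15-era error recovery flow, not the verbatim code): the
transport quiesces the queues, fails every inflight request via
blk_mq_tagset_busy_iter() + nvme_cancel_request(), and then unquiesces
so that new I/O fails fast instead of blocking:

static void nvme_cancel_request(struct request *req, void *data,
		bool reserved)
{
	/* Complete the command with an abort status so it flows into
	 * nvme_complete_rq() and can be retried or failed over. */
	nvme_req(req)->status = NVME_SC_ABORT_REQ;
	blk_mq_complete_request(req);
}

	/* error recovery, heavily condensed; ctrl is the nvme_ctrl */
	nvme_stop_queues(ctrl);			/* quiesce the I/O queues */
	blk_mq_tagset_busy_iter(ctrl->tagset,
			nvme_cancel_request, ctrl);	/* fail inflight I/O */
	nvme_start_queues(ctrl);		/* unquiesce: new I/O fails fast */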

It seems that FC wants to wait for the request retry counter to be
exceeded, but given that the queue is never unquiesced, the requests
remain quiesced until the host successfully reconnects.
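
For context, the completion path in question looks roughly like this in
the 4.15-era core (a simplified sketch; the real code carries a few more
checks). With REQ_FAILFAST_TRANSPORT set, blk_noretry_request() is true,
so nvme_req_needs_retry() returns false and nvme_complete_rq() ends the
request immediately, without ever consulting nvme_req_needs_failover():

static inline bool nvme_req_needs_retry(struct request *req)
{
	if (blk_noretry_request(req))	/* any REQ_FAILFAST_* flag set */
		return false;
	if (nvme_req(req)->status & NVME_SC_DNR)
		return false;
	if (nvme_req(req)->retries >= nvme_max_retries)
		return false;
	return true;
}

void nvme_complete_rq(struct request *req)
{
	if (unlikely(nvme_req(req)->status && nvme_req_needs_retry(req))) {
		if (nvme_req_needs_failover(req)) {
			nvme_failover_req(req);	/* multipath path switch */
			return;
		}
		if (!blk_queue_dying(req->q)) {
			nvme_req(req)->retries++;
			blk_mq_requeue_request(req, true);
			return;
		}
	}
	blk_mq_end_request(req, nvme_error_status(req));
}

So a failfast request that errors out goes straight to
blk_mq_end_request(), and the upper multipath layer sees the error
without nvme attempting a retry or failover of its own.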


Thread overview:

2017-12-19 21:05 [for-4.16 PATCH 0/5] block, nvme, dm: allow DM multipath to use NVMe's error handler Mike Snitzer
2017-12-19 21:05 ` [for-4.16 PATCH 1/5] block: establish request failover callback infrastructure Mike Snitzer
2017-12-19 21:05 ` [for-4.16 PATCH 2/5] nvme: use request's failover callback for multipath failover Mike Snitzer
2017-12-19 21:05 ` [for-4.16 PATCH 3/5] nvme: move nvme_req_needs_failover() from multipath to core Mike Snitzer
2017-12-19 21:05 ` [for-4.16 PATCH 4/5] dm mpath: use NVMe error handling to know when an error is retryable Mike Snitzer
2017-12-20 16:58   ` Mike Snitzer
2017-12-20 20:33     ` Sagi Grimberg [this message]
2017-12-19 21:05 ` [for-4.16 PATCH 5/5] dm mpath: skip calls to end_io_bio if using NVMe bio-based and round-robin Mike Snitzer
2017-12-22 18:02 ` [for-4.16 PATCH 0/5] block, nvme, dm: allow DM multipath to use NVMe's error handler Mike Snitzer
2017-12-26 20:51 ` Keith Busch
2017-12-27  2:42   ` Mike Snitzer
