From: Keith Busch <kbusch@kernel.org>
To: Nilay Shroff <nilay@linux.ibm.com>
Cc: axboe@fb.com, hch@lst.de, sagi@grimberg.me,
	linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
	gjoyce@linux.ibm.com
Subject: Re: [PATCH RESEND] nvme-pci: Fix EEH failure on ppc after subsystem reset
Date: Tue, 27 Feb 2024 11:29:16 -0700
Message-ID: <Zd4p_E8cPFpr1M--@kbusch-mbp>
In-Reply-To: <20240209050342.406184-1-nilay@linux.ibm.com>

On Fri, Feb 09, 2024 at 10:32:16AM +0530, Nilay Shroff wrote:
> If the nvme subsystem reset causes the loss of communication to the nvme
> adapter then EEH could potentially recover the adapter. The detection of
> communication loss to the adapter only happens when the nvme driver
> attempts to read an MMIO register.
> 
> The nvme subsystem reset command writes 0x4E564D65 to the NSSR register
> and schedules an adapter reset. If the nvme subsystem reset causes the
> loss of communication to the nvme adapter then either the IO timeout
> event or the adapter reset handler could detect it. If the IO timeout
> event detects the loss of communication then the EEH handler is able to
> recover the communication to the adapter. This change was implemented in
> 651438bb0af5 (nvme-pci: Fix EEH failure on ppc). However, if the adapter
> communication loss is detected in the nvme reset work handler then EEH
> is unable to successfully finish the adapter recovery.
> 
> This patch ensures that:
> - the nvme driver reset handler observes that the pci channel was
>   offline after a failed MMIO read and avoids marking the controller
>   state as DEAD, thus giving the EEH handler a fair chance to recover
>   the nvme adapter.
> 
> - if the nvme controller is already in the RESETTING state and a pci
>   channel frozen error is detected, then the nvme driver
>   pci-error-handler code sends the correct error code
>   (PCI_ERS_RESULT_NEED_RESET) back to the EEH handler so that the EEH
>   handler can proceed with the pci slot reset.
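
For readers following along, the two behaviours listed in the quoted
description can be sketched as a small userspace model. This is not the
patch itself: the enum and function names below are hypothetical
stand-ins for the driver's controller states, the PCI channel states and
the pci_ers_result_t values, and every branch other than the two taken
from the description is an assumption added only so the model is
complete.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the kernel's definitions. */
enum ctrl_state { CTRL_LIVE, CTRL_RESETTING, CTRL_DEAD };
enum chan_state { CHAN_NORMAL, CHAN_FROZEN, CHAN_PERM_FAILURE };
enum ers_result { ERS_CAN_RECOVER, ERS_NEED_RESET, ERS_DISCONNECT };

/*
 * Reset-work path after an MMIO read has failed.
 * From the description: if the PCI channel is offline, do not mark the
 * controller DEAD, so that EEH still has a chance to recover it.
 */
static enum ctrl_state reset_work_after_failed_read(enum ctrl_state cur,
						    bool channel_offline)
{
	if (channel_offline)
		return cur;	/* leave the state alone for EEH */
	return CTRL_DEAD;	/* genuine failure: give up on the device */
}

/*
 * Error-detected callback of the PCI error handler.
 * From the description: a frozen channel seen while the controller is
 * already RESETTING must still be answered with "need reset".
 */
static enum ers_result error_detected(enum ctrl_state cur,
				      enum chan_state chan)
{
	if (chan == CHAN_FROZEN && cur == CTRL_RESETTING)
		return ERS_NEED_RESET;

	/* Remaining cases are assumptions, for illustration only. */
	switch (chan) {
	case CHAN_NORMAL:
		return ERS_CAN_RECOVER;
	case CHAN_FROZEN:
		return ERS_NEED_RESET;
	default:
		return ERS_DISCONNECT;
	}
}
```

In this model, reset_work_after_failed_read(CTRL_RESETTING, true) leaves
the controller in CTRL_RESETTING, and error_detected(CTRL_RESETTING,
CHAN_FROZEN) returns ERS_NEED_RESET, which is the combination the
description says lets the slot reset go ahead.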

A subsystem reset takes the link down. I'm pretty sure the proper way to
recover from it requires pcie hotplug support.

