linux-kernel.vger.kernel.org archive mirror
From: Gavin Shan <gwshan@linux.vnet.ibm.com>
To: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Cc: linasvepstas@gmail.com, Cao jin <caoj.fnst@cn.fujitsu.com>,
	Jonathan Corbet <corbet@lwn.net>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	linux-doc@vger.kernel.org,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>
Subject: Re: [PATCH] pci-error-recover: doc cleanup
Date: Wed, 14 Dec 2016 13:39:50 +1100
Message-ID: <20161214023949.GA9896@gwshan>
In-Reply-To: <3ed3151c-eeef-940c-8a9c-49cf53a51d49@au1.ibm.com>

On Fri, Dec 09, 2016 at 05:50:17PM +1100, Andrew Donnellan wrote:
>On 09/12/16 17:24, Linas Vepstas wrote:
>>I suppose I'm confused, but I recall that link resets are non-fatal.
>>Fatal errors typically require that the pci adapter be completely
>>reset, any adapter firmware to be reloaded from scratch, the device
>>driver has to kill all device state and start from scratch. It's huge.
>
>Is there a difference in terminology between an AER fatal error and what
>EEH/IBM people think of as a fatal error?
>

They are different things. Depending on the PHB configuration, an AER
fatal error can lead to a frozen PE error rather than a fenced PHB error.

>>If the fatal error is on pci device that is under a block device
>>holding a file system, then (usually) there is no way to recover,
>>because the block layer (and file system) cannot deal with a block
>>device that disappeared and then reappeared some few seconds later.
>>(maybe some future zfs or lvm or btrfs might be able to deal with
>>this, but not today)
>
>Is this still true? I'm not at all familiar with the block device side of it,
>but the cxlflash driver has reasonably full EEH support, including surviving
>a full PHB fence and complete reset.
>

It's still true, especially when the recovery is going to affect the
rootfs. On completion of error recovery, the driver (if necessary)
and the filesystem need to be reloaded. That depends on a script or
daemon, and neither is available in this scenario.

Thanks,
Gavin
