From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
To: Linux Kernel mailing list <linux-kernel@vger.kernel.org>
Subject: [RFC] How drivers notice a HW error?
Date: Thu, 27 Nov 2003 17:28:02 +0900
Message-ID: <023401c3b4c0$5fb40660$a8647c0a@seto>

Hi all,

This is a request for comments, especially from driver developers.

On some platforms, for example IA64, the chipset detects an error caused by
a driver's operation, such as an I/O read, and reports it to the kernel. The
Linux kernel analyzes the error and decides whether to kill the driver or,
at worst, reboot.

I want to convey the error information to the offending driver, so that the
driver can recover from the failed operation.

So, as a rough plan, I am thinking about a readb_check function that returns
an error value if an error occurred during the read. Drivers could use
readb_check instead of the usual readb, and could decide from its return
value whether a retry is required.
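
To illustrate the intended usage from the driver side, here is a minimal
user-space sketch. Everything in it is an assumption made for illustration:
the signature of readb_check, the fake_reg type that stands in for a device
register, the choice of -EIO as the error value, and the bounded retry
policy in read_status_with_retry. It is not an existing kernel interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated device register (hypothetical; stands in for real MMIO). */
struct fake_reg {
    uint8_t value;       /* byte returned by a successful read */
    int faults_pending;  /* number of upcoming reads that will fail */
};

/*
 * Hypothetical readb_check(): stores the byte in *out and returns 0 on
 * success, or returns -EIO if the simulated platform reported an error.
 * On error, *out is left untouched.
 */
static int readb_check(struct fake_reg *reg, uint8_t *out)
{
    assert(reg != NULL && out != NULL);
    if (reg->faults_pending > 0) {
        reg->faults_pending--;
        return -EIO;
    }
    *out = reg->value;
    return 0;
}

/* Driver-side policy: try up to max_tries times, then give up. */
static int read_status_with_retry(struct fake_reg *reg, uint8_t *out,
                                  int max_tries)
{
    int err = -EIO;

    for (int i = 0; i < max_tries; i++) {
        err = readb_check(reg, out);
        if (err == 0)
            break;
    }
    return err;
}
```

With a register that fails twice and then succeeds,
read_status_with_retry(&reg, &val, 3) returns 0 on the third attempt. If the
register keeps failing, the driver gets -EIO back and can take its own
recovery action instead of being killed.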

To realize this, I am considering the following two designs:

+ readb_check in the driver (with a Notifier)
  [Outline]:
  - The hardware error handler (for example, the MCA handler on IA64) has
    a Notifier as a hook point.
  - A driver may register a hook function with the Notifier.
  - The Notifier calls the registered functions when an error occurs.
  - Each hook function checks the address of the error and, if the error
    appears to concern its own driver, raises an internal error flag and
    stops the Notifier by returning OK.
  - The hardware error handler looks at the state of the Notifier and
    decides whether the system can resume.
  - Once resumed, the driver may check the error flag after a read, and
    may retry the read if the flag is set.
  [Issue]:
  - New interfaces, such as one for registering hooks, would be required.
  - Writing a hook function would be a burden on driver developers.
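
The Notifier flow outlined above could look roughly like the following
user-space simulation. All names here (err_hook, register_err_hook,
hw_error_notify, my_driver, HOOK_CLAIMED and so on) are hypothetical and
chosen only for this sketch; they are not an existing kernel API, and the
address range check is just one possible way for a hook to decide that an
error belongs to its driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum hook_result { HOOK_NOT_MINE, HOOK_CLAIMED };

/* One entry in the (hypothetical) Notifier chain. */
struct err_hook {
    enum hook_result (*fn)(struct err_hook *self, uintptr_t err_addr);
    struct err_hook *next;
};

static struct err_hook *hook_chain;

/* A driver registers its hook function with the Notifier. */
static void register_err_hook(struct err_hook *h)
{
    assert(h != NULL && h->fn != NULL);
    h->next = hook_chain;
    hook_chain = h;
}

/*
 * Called by the simulated hardware error handler. Walks the chain until a
 * hook claims the error. Returns true if some driver claimed it, meaning
 * the handler may let the system resume.
 */
static bool hw_error_notify(uintptr_t err_addr)
{
    for (struct err_hook *h = hook_chain; h != NULL; h = h->next) {
        if (h->fn(h, err_addr) == HOOK_CLAIMED)
            return true;
    }
    return false;
}

/* Example driver state. The hook is the first member so that the hook
 * function can cast its argument back to the driver structure. */
struct my_driver {
    struct err_hook hook;
    uintptr_t mmio_start;
    uintptr_t mmio_len;
    bool error_flag;     /* raised by the hook, checked after each read */
};

/* Hook function: claim the error only if it falls in our MMIO range. */
static enum hook_result my_driver_hook(struct err_hook *self,
                                       uintptr_t err_addr)
{
    struct my_driver *drv = (struct my_driver *)self;

    if (err_addr < drv->mmio_start ||
        err_addr - drv->mmio_start >= drv->mmio_len)
        return HOOK_NOT_MINE;

    drv->error_flag = true;
    return HOOK_CLAIMED;
}
```

After each read, the resumed driver would test error_flag, clear it, and
retry the read if it was set. The sketch also shows the cost listed under
[Issue]: every driver has to carry its own hook function and range check.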

+ readb_check in the kernel
  [Outline]:
  - The kernel provides a readb_check function.
  - Drivers may use readb_check instead of the usual readb.
  - The hardware error handler checks the address of the error and, if the
    error occurred in readb_check, changes the return value of readb_check
    and resumes the interrupted context.
  - The driver may check the return value to notice an error in the last
    read.
  [Issue]:
  - Some overhead would be involved. (Possibly it is negligible, since
    I/O reads are already horribly slow.)
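
The kernel-side variant can be sketched as a user-space simulation as well.
This is only a model under stated assumptions: the check_ctx structure, the
in_checked_read flag, the fault_injector callback, and the -EIO error value
are all hypothetical. In particular, the flag stands in for whatever test a
real handler would use to tell that the error hit inside readb_check, and
the callback stands in for the hardware raising an error during the load.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* State shared between readb_check() and the simulated error handler. */
struct check_ctx {
    bool in_checked_read; /* true while readb_check() performs its load */
    int result;           /* return value; the handler overwrites it */
};

static struct check_ctx ctx;

/*
 * Simulated hardware error handler. If the error hit inside readb_check(),
 * change the pending return value and report that the interrupted context
 * may resume. Otherwise report that this path cannot recover.
 */
static bool hw_error_handler(void)
{
    if (!ctx.in_checked_read)
        return false;
    ctx.result = -EIO;
    return true;
}

/* Simulation only: called at the point where an error would arrive. */
typedef void (*fault_injector)(void);

/*
 * Hypothetical readb_check(): returns 0 and stores the byte in *out, or
 * returns the error value set by the handler and leaves *out untouched.
 */
static int readb_check(const volatile uint8_t *addr, uint8_t *out,
                       fault_injector inject)
{
    assert(addr != NULL && out != NULL);

    ctx.result = 0;
    ctx.in_checked_read = true;
    uint8_t v = *addr;          /* the actual I/O read */
    if (inject != NULL)
        inject();               /* the error "arrives" during the read */
    ctx.in_checked_read = false;

    if (ctx.result == 0)
        *out = v;
    return ctx.result;
}

/* Injector that behaves like the platform reporting an error. */
static void inject_error(void)
{
    (void)hw_error_handler();
}
```

Compared with the Notifier design, the driver needs no hook here. It only
looks at the return value, as in readb_check(&reg, &val, NULL) == 0 for a
clean read, while the extra bookkeeping around the load is the overhead
mentioned under [Issue].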

IMO, this is a general-purpose function that should be available on many
platforms. I have also heard that Solaris has a similar implementation.

If you have any comments on this feature, or a different idea,
please tell me.

Best regards,
------
H.Seto <seto.hidetoshi@jp.fujitsu.com>