From: "Maciej W. Rozycki" <macro@linux-mips.org>
To: Russ Anderson <rja@sgi.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [RCF] Linux memory error handling
Date: Wed, 15 Jun 2005 16:26:13 +0100 (BST)
Message-ID: <Pine.LNX.4.61L.0506151545410.13835@blysk.ds.pg.gda.pl>
In-Reply-To: <200506151430.j5FEUD7J1393603@clink.americas.sgi.com>

On Wed, 15 Jun 2005, Russ Anderson wrote:

> Handling memory errors:
> 
> 	Some memory error handling functionality is common to
> 	most architectures.
> 
> 	Corrected error handling:
> 
> 	    Logging:  When ECC hardware corrects a Single Bit Error (SBE),
> 		an interrupt is generated to inform linux that there is 
> 		a corrected error record available for logging.
> 
> 	    Polling Threshold:  A solid single bit error can cause a burst
> 		of correctable errors that can cause a significant logging
> 		overhead.  SBE thresholding counts the number of SBEs for
> 		a given page and if too many SBEs are detected in a given
> 		period of time, the interrupt is disabled and instead 
> 		linux periodically polls for corrected errors.

 This is highly undesirable if the same interrupt is used for MBEs, as 
disabling it would mask those too.  A page that causes an excessive number 
of SBEs should be removed from the available pool instead.  Logging should 
probably take recent events into account anyway and avoid overloading the 
system, e.g. by keeping only statistical data rather than detailed 
information about each event when under load.
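
 A minimal sketch of such a policy, in plain C rather than kernel code: 
count SBEs per page within a time window, log full details only for the 
first few events, fall back to counting once the window gets busy, and 
ask for the page to be retired when a limit is reached.  All names and 
limits here (sbe_record(), SBE_WINDOW_SECS and so on) are hypothetical 
and chosen for illustration only; nothing like this exists in the tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tunables; real values would depend on the memory type. */
#define SBE_WINDOW_SECS   60u  /* length of one counting window */
#define SBE_DETAIL_LIMIT  3u   /* detailed log records allowed per window */
#define SBE_RETIRE_LIMIT  8u   /* SBEs per window before retiring the page */

/* Per-page bookkeeping (hypothetical structure). */
struct sbe_page_stats {
	uint64_t window_start;  /* start of the current window, in seconds */
	uint32_t window_count;  /* SBEs seen in the current window */
	uint64_t total_count;   /* SBEs seen overall, statistics only */
	bool     retired;       /* page already marked for removal */
};

enum sbe_action {
	SBE_LOG_DETAIL,   /* log the full error record */
	SBE_COUNT_ONLY,   /* under load: update statistics only */
	SBE_RETIRE_PAGE,  /* remove the page from the available pool */
};

/* Record one corrected error seen at time 'now' and decide what to do. */
enum sbe_action sbe_record(struct sbe_page_stats *st, uint64_t now)
{
	/* Start a fresh window once the old one has expired. */
	if (now - st->window_start >= SBE_WINDOW_SECS) {
		st->window_start = now;
		st->window_count = 0;
	}

	st->window_count++;
	st->total_count++;

	if (st->retired || st->window_count >= SBE_RETIRE_LIMIT) {
		st->retired = true;
		return SBE_RETIRE_PAGE;
	}
	if (st->window_count <= SBE_DETAIL_LIMIT)
		return SBE_LOG_DETAIL;
	return SBE_COUNT_ONLY;
}
```

 Note the interrupt stays enabled throughout, so an MBE sharing it is 
still reported promptly; only the amount of logging work is reduced.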

> 	    Data Migration:  If a page of memory has too many single bit
> 		errors, it may be prudent to move the data off that
> 		physical page before the correctable SBE turns into an
> 		uncorrectable MBE. 
> 
> 	    Memory handling parameters:
> 
> 		Since memory failure modes are due to specific DIMM
> 		failure characteristics, there will be no way to 
> 		reach agreement on one set of thresholds that will
> 		be appropriate for all configurations.  Therefore there
> 		needs to be a way to modify the thresholds.  One alternative
> 		is a /proc/sys/kernel/ interface to control settings, such
> 		as polling thresholds.  That provides an easy standard
> 		way of modifying thresholds to match the characteristics
> 		of the specific DIMM type.

 Note that scrubbing may also be required depending on hardware 
capabilities as data could have been corrected on the fly for the purpose 
of providing a correct value for the bus transaction, but memory may still 
hold corrupted data.
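
 In its simplest form scrubbing is just a read followed by a write-back 
of each word, so that the value corrected on the read path replaces the 
one stored in memory.  The sketch below is plain C with a hypothetical 
scrub_region() helper and assumes hardware that corrects data on reads 
only; a real implementation would also have to make each read/write 
pair atomic with respect to other CPUs and DMA, which this one does not.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Scrub 'nwords' 64-bit words starting at 'base' (hypothetical helper).
 * The volatile qualifier keeps the compiler from dropping the
 * apparently redundant load/store pair.  Returns the number of words
 * rewritten.
 */
size_t scrub_region(volatile uint64_t *base, size_t nwords)
{
	size_t i;

	for (i = 0; i < nwords; i++) {
		uint64_t v = base[i];  /* read: ECC supplies corrected data */
		base[i] = v;           /* write back: memory is refreshed */
	}
	return i;
}
```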

 And of course not all memory is DIMM!

> 	Uncorrected error handling:
> 
> 	    Kill the application:  One recovery technique to avoid a kernel
> 		panic when an application process hits an uncorrectable 
> 		memory error is to SIGKILL the application.  The page is 
> 		marked PG_reserved to avoid re-use.  A (new) PG_hard_error
> 		flag would be useful to indicate that the physical page has
> 		a hard memory error.

 Note we have some infrastructure for that in the MIPS port -- we kill the 
triggering process, but we don't mark the problematic memory page as 
unusable (which is an area for improvement).  This of course only covers 
faults occurring synchronously in user mode.  When a fault occurs in 
kernel mode or asynchronously (e.g. because it was triggered by a DMA 
transaction rather than one involving a CPU), you often cannot determine 
whether killing a process is good enough for system safety, even if you 
are able to narrow the fault down to a potential victim.
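
 The distinction can be sketched as a small decision function.  This is 
an illustration in plain C, not the MIPS code; struct mem_fault and 
mem_fault_policy() are hypothetical names, and treating every case 
other than a synchronous CPU fault in user mode as fatal is an assumed 
conservative default rather than anything agreed upon here.

```c
#include <stdbool.h>

enum mem_fault_action {
	MEM_FAULT_KILL_TASK,  /* SIGKILL the triggering process */
	MEM_FAULT_PANIC,      /* cannot contain the damage safely */
};

/* What is known about an uncorrectable error (hypothetical). */
struct mem_fault {
	bool user_mode;      /* CPU was in user mode when the fault hit */
	bool synchronous;    /* reported precisely on the faulting access */
	bool cpu_initiated;  /* caused by a CPU access, not e.g. DMA */
};

/* Choose a recovery action for an uncorrectable memory error. */
enum mem_fault_action mem_fault_policy(const struct mem_fault *f)
{
	/* Only here is the victim known and the damage confined to it. */
	if (f->user_mode && f->synchronous && f->cpu_initiated)
		return MEM_FAULT_KILL_TASK;

	/* Kernel-mode or asynchronous: assume the system is unsafe. */
	return MEM_FAULT_PANIC;
}
```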

> 	    Disable memory for next reboot:  When a hard error is detected,
> 		notify SAL/BIOS of the bad physical memory.  SAL/BIOS can
> 		save the bad addresses and, when building the EFI map after
> 		reset/reboot, mark the bad pages as EFI_UNUSABLE_MEMORY,
> 		and type = 0, so Linux will ignore granules containing these 
> 		pages.
> 
> 	    Dumping:  Dump programs should not try to dump pages with bad
> 		memory.  A PG_hard_error flag would indicate to dump
> 		programs which pages have bad memory.
> 
> 	Memory DIMM information & settings:
> 
> 	    Use a /proc/dimm_info interface to pass DIMM information to Linux.
> 	    Hardware vendors could add their hardware specific settings.

 I'd recommend a more generic name rather than "dimm_info" if that is to 
be reused universally.

  Maciej
