* [RFC PATCH 0/3] support for broken memory modules (BadRAM)
@ 2011-04-27 16:16 Stefan Assmann
  2011-04-27 16:16 ` [RFC PATCH 1/3] Add string parsing function get_next_ulong Stefan Assmann
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Stefan Assmann @ 2011-04-27 16:16 UTC (permalink / raw)
  To: linux-mm
  Cc: tony.luck, andi, mingo, hpa, rick, akpm, lwoodman, riel, sassmann

This is an RFC for the BadRAM feature originally developed by Rick van Rein.
Patches are against vanilla 2.6.38.

The idea is to allow the user to specify RAM addresses that shouldn't be
touched by the OS, because they are broken in some way. Not all machines have
hardware support for hwpoison, ECC RAM, etc., so here's a solution that uses
bitmasks to describe address patterns, passed in via the new "badram" kernel
command line parameter.
Memtest86 has been able to generate these patterns since v2.3, so all the
user should have to do is:
- run Memtest86
- note down the pattern
- add badram=<pattern> to the kernel command line

The affected pages are then marked with the hwpoison flag and thus won't be
used by the memory management system.
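
To illustrate how an address/mask pair selects addresses, here is a minimal
userspace sketch of the match test described in Documentation/BadRAM.txt
(patch 3/3). The helper name badram_matches() is made up for this example and
is not part of the patches; 32-bit addresses are assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * An address is covered by the pattern f/m if it agrees with f on every
 * bit that is set in m; bits that are 0 in m may take any value.
 * Hypothetical helper for illustration only, not part of the patch set.
 */
static bool badram_matches(uint32_t addr, uint32_t f, uint32_t m)
{
	return ((addr ^ f) & m) == 0;
}
```

With the running example from the documentation, F=0x008042f4 and
M=0xff805fff, this accepts 0x008062f4 and 0x00f0e2f4 but rejects 0x008052f4
and 0x007042f4.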

Link to Rick's original patches and docs:
http://rick.vanrein.org/linux/badram/

  Stefan

Stefan Assmann (3):
  Add string parsing function get_next_ulong
  support for broken memory modules (BadRAM)
  Add documentation and credits for BadRAM

 CREDITS                             |    9 +
 Documentation/BadRAM.txt            |  369 +++++++++++++++++++++++++++++++++++
 Documentation/kernel-parameters.txt |    5 +
 include/linux/kernel.h              |    1 +
 lib/cmdline.c                       |   35 ++++
 mm/memory-failure.c                 |   95 +++++++++
 6 files changed, 514 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/BadRAM.txt

-- 
1.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* [RFC PATCH 1/3] Add string parsing function get_next_ulong
  2011-04-27 16:16 [RFC PATCH 0/3] support for broken memory modules (BadRAM) Stefan Assmann
@ 2011-04-27 16:16 ` Stefan Assmann
  2011-04-27 16:28   ` Randy Dunlap
  2011-04-27 16:16 ` [RFC PATCH 2/3] support for broken memory modules (BadRAM) Stefan Assmann
  2011-04-27 16:16 ` [RFC PATCH 3/3] Add documentation and credits for BadRAM Stefan Assmann
  2 siblings, 1 reply; 13+ messages in thread
From: Stefan Assmann @ 2011-04-27 16:16 UTC (permalink / raw)
  To: linux-mm
  Cc: tony.luck, andi, mingo, hpa, rick, akpm, lwoodman, riel, sassmann

Add this function to allow easy parsing of unsigned long values from the
beginning of strings. It is a convenience function for parsing addresses from
the kernel command line.
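
For readers who want to try the parsing logic outside the kernel, the
following is a rough userspace sketch of the same idea. It is an assumption-
laden stand-in, not the patch itself: the name next_ulong() is invented here,
and the C library's strtoul() replaces simple_strtoul(), so leading whitespace
and a sign are accepted where the kernel helper would not accept them. The
sketch writes its outputs only on success, which is the behaviour the
kerneldoc comment describes.

```c
#include <stdlib.h>

/*
 * Userspace stand-in for the proposed get_next_ulong() (hypothetical name).
 * Skips one optional separator, then parses an unsigned long in the given
 * base. Returns 1 on success and 0 if no digits were found; on failure
 * *val and *strp are left untouched.
 */
static int next_ulong(const char **strp, unsigned long *val, char sep, int base)
{
	const char *tmp;
	char *end;
	unsigned long parsed;

	if (!strp || !*strp)
		return 0;

	tmp = *strp;
	if (*tmp == sep)
		tmp++;

	parsed = strtoul(tmp, &end, base);
	if (end == tmp)
		return 0;	/* no new value parsed */

	*val = parsed;
	*strp = end;
	return 1;
}
```

Calling it repeatedly on "0x008042f4,0xff805fff" with sep ',' and base 16
yields the address, then the mask, and then returns 0 at the end of the
string.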

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
---
 include/linux/kernel.h |    1 +
 lib/cmdline.c          |   35 +++++++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 2fe6e84..b6ded39 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -218,6 +218,7 @@ extern int vsscanf(const char *, const char *, va_list)
 
 extern int get_option(char **str, int *pint);
 extern char *get_options(const char *str, int nints, int *ints);
+extern int get_next_ulong(char **str, unsigned long *val, char sep, int base);
 extern unsigned long long memparse(const char *ptr, char **retptr);
 
 extern int core_kernel_text(unsigned long addr);
diff --git a/lib/cmdline.c b/lib/cmdline.c
index f5f3ad8..82a6616 100644
--- a/lib/cmdline.c
+++ b/lib/cmdline.c
@@ -114,6 +114,41 @@ char *get_options(const char *str, int nints, int *ints)
 }
 
 /**
+ *	get_next_ulong - Parse unsigned long at the beginning of a string
+ *	@strp: (output) String to be parsed
+ *	@val: (output) unsigned long carrying the result
+ *	@sep: character specifying the separator
+ *	@base: number system of the parsed value
+ *
+ *	This function parses an unsigned long value at the beginning of a
+ *	string. The string may begin with a separator or an unsigned long
+ *	value.
+ *	After the function is run val will contain the parsed value and strp
+ *	will point to the character *after* the parsed unsigned long.
+ *
+ *	In the error case 0 is returned, val and *strp stay unaltered.
+ *	Otherwise return 1.
+ */
+int get_next_ulong(char **strp, unsigned long *val, char sep, int base)
+{
+	char *tmp;
+
+	if (!strp || !(*strp))
+		return 0;
+
+	tmp = *strp;
+	if (*tmp == sep)
+		tmp++;
+
+	*val = simple_strtoul(tmp, strp, base);
+
+	if (tmp == *strp)
+		return 0; /* no new value parsed */
+	else
+		return 1;
+}
+
+/**
  *	memparse - parse a string with mem suffixes into a number
  *	@ptr: Where parse begins
  *	@retptr: (output) Optional pointer to next char after parse completes
-- 
1.7.4


* [RFC PATCH 2/3] support for broken memory modules (BadRAM)
  2011-04-27 16:16 [RFC PATCH 0/3] support for broken memory modules (BadRAM) Stefan Assmann
  2011-04-27 16:16 ` [RFC PATCH 1/3] Add string parsing function get_next_ulong Stefan Assmann
@ 2011-04-27 16:16 ` Stefan Assmann
  2011-04-27 21:12   ` Andi Kleen
  2011-04-27 16:16 ` [RFC PATCH 3/3] Add documentation and credits for BadRAM Stefan Assmann
  2 siblings, 1 reply; 13+ messages in thread
From: Stefan Assmann @ 2011-04-27 16:16 UTC (permalink / raw)
  To: linux-mm
  Cc: tony.luck, andi, mingo, hpa, rick, akpm, lwoodman, riel, sassmann

BadRAM is a mechanism to exclude memory addresses (pages) from being used by
the system. The addresses are given to the kernel via the kernel command line.
This is useful for systems with defective RAM modules, especially if the RAM
modules cannot be replaced.

command line parameter: badram=<addr>,<mask>[,...]
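
As a rough illustration of how many pages one <addr>,<mask> pair excludes,
the sketch below counts the pages a pattern selects. It is not the algorithm
used by next_masked_address() in this patch; it uses the common "next subset
of a bitmask" trick instead. The function name, the SKETCH_PAGE_SHIFT
constant, the 32-bit address width and the 4 kB page size are all assumptions
made for the example.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 kB pages, as on x86 */

/*
 * Count the pages selected by the pattern addr/mask that start below
 * limit. Bits that are 0 in mask are wildcards; wildcards inside the
 * page offset do not select additional pages.
 * Hypothetical helper for illustration only, not part of the patch set.
 */
static unsigned long badram_count_pages(uint32_t addr, uint32_t mask,
					uint32_t limit)
{
	uint32_t page_bits = (uint32_t)(UINT32_MAX << SKETCH_PAGE_SHIFT);
	uint32_t free_bits = ~mask & page_bits;	 /* wildcards above the offset */
	uint32_t base = addr & mask & page_bits; /* fixed part of the page address */
	uint32_t sub = 0;
	unsigned long count = 0;

	do {
		if ((base | sub) < limit)
			count++;
		/* advance to the next subset of free_bits, in increasing order */
		sub = (sub - free_bits) & free_bits;
	} while (sub != 0);

	return count;
}
```

For the documentation's running example, badram=0x008042f4,0xff805fff on a
32 MB module (limit 0x02000000), this counts 512 pages, i.e. 2048 kB with
4 kB pages, which agrees with the HardwareCorrupted figure quoted in
Documentation/BadRAM.txt.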

Patterns for the command line parameter can be obtained by running Memtest86.
In Memtest86 press "c" for configuration, select "Error Report Mode" and
finally "BadRAM Patterns".

This was already done by Rick van Rein a long time ago, but it never found
its way into the kernel.

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
---
 mm/memory-failure.c |   95 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 95 insertions(+), 0 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 0207c2f..dac506c 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -52,6 +52,7 @@
 #include <linux/swapops.h>
 #include <linux/hugetlb.h>
 #include <linux/memory_hotplug.h>
+#include <linux/memblock.h>
 #include "internal.h"
 
 int sysctl_memory_failure_early_kill __read_mostly = 0;
@@ -1519,3 +1520,97 @@ int is_hwpoison_address(unsigned long addr)
 	return is_hwpoison_entry(entry);
 }
 EXPORT_SYMBOL_GPL(is_hwpoison_address);
+
+/*
+ * Return 0 if no address found else return 1, new address is stored in addrp.
+ **/
+static int __init next_masked_address(unsigned long *addrp, unsigned long mask)
+{
+	unsigned long total_mem = (max_pfn + 1) << PAGE_SHIFT;
+	unsigned long tmp_addr = *addrp;
+	unsigned long inc = 1;
+
+	while (inc & mask)
+		inc = inc << 1;
+
+	while (inc != 0) {
+		tmp_addr += inc;
+		tmp_addr &= ~mask;
+		tmp_addr |= ((*addrp) & mask);
+
+		/* address is bigger than phys memory */
+		if (tmp_addr >= total_mem)
+			return 0;
+
+		/* address found */
+		if (tmp_addr > *addrp) {
+			*addrp = tmp_addr;
+			return 1;
+		}
+
+		while (inc & ~mask)
+			inc = inc << 1;
+		inc = inc << 1;
+	}
+
+	return 0;
+}
+
+/*
+ * Set hwpoison pageflag on all pages specified by addr/mask.
+ */
+static int __init badram_mark_pages(unsigned long addr, unsigned long mask)
+{
+	unsigned long pagecount = 0;
+
+	mask |= ~PAGE_MASK; /* smallest chunk is a page */
+	addr &= mask;
+
+	printk(KERN_INFO "BadRAM: mark 0x%lx with mask 0x%0lx\n", addr, mask);
+
+	do {
+		unsigned long pfn = addr >> PAGE_SHIFT;
+		struct page *page = pfn_to_page(pfn);
+
+		if (!pfn_valid(pfn))
+			continue;
+		if (memblock_is_reserved(addr)) {
+			printk(KERN_DEBUG
+			       "BadRAM: page %lu reserved by kernel\n", pfn);
+			continue;
+		}
+
+		SetPageHWPoison(page);
+		atomic_long_add(1, &mce_bad_pages);
+		pagecount++;
+		pr_debug("BadRAM: page %lu (addr 0x%0lx) marked bad "
+			 "[total %lu]\n", pfn, addr, pagecount);
+	} while (next_masked_address(&addr, mask));
+
+	return pagecount;
+}
+
+static int __init badram_setup(char *str)
+{
+	printk(KERN_DEBUG "BadRAM: cmdline option is %s\n", str);
+
+	if (*str++ != '=')
+		return 0;
+
+	while (*str) {
+		unsigned long addr = 0, mask = 0, pagecount = 0;
+
+		if (!get_next_ulong(&str, &addr, ',', 16)) {
+			printk(KERN_WARNING "BadRAM: parsing error\n");
+			return 0;
+		}
+		if (!get_next_ulong(&str, &mask, ',', 16))
+			mask = ~(0UL);
+
+		pagecount = badram_mark_pages(addr, mask);
+		printk(KERN_INFO "BadRAM: %lu page(s) bad\n", pagecount);
+	}
+
+	return 0;
+}
+__setup("badram", badram_setup);
-- 
1.7.4


* [RFC PATCH 3/3] Add documentation and credits for BadRAM
  2011-04-27 16:16 [RFC PATCH 0/3] support for broken memory modules (BadRAM) Stefan Assmann
  2011-04-27 16:16 ` [RFC PATCH 1/3] Add string parsing function get_next_ulong Stefan Assmann
  2011-04-27 16:16 ` [RFC PATCH 2/3] support for broken memory modules (BadRAM) Stefan Assmann
@ 2011-04-27 16:16 ` Stefan Assmann
  2011-04-27 16:49   ` Randy Dunlap
  2 siblings, 1 reply; 13+ messages in thread
From: Stefan Assmann @ 2011-04-27 16:16 UTC (permalink / raw)
  To: linux-mm
  Cc: tony.luck, andi, mingo, hpa, rick, akpm, lwoodman, riel, sassmann

Add Documentation/BadRAM.txt for in-depth information and update
Documentation/kernel-parameters.txt.

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
---
 CREDITS                             |    9 +
 Documentation/BadRAM.txt            |  369 +++++++++++++++++++++++++++++++++++
 Documentation/kernel-parameters.txt |    5 +
 3 files changed, 383 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/BadRAM.txt

diff --git a/CREDITS b/CREDITS
index 1d39a6d..22ee8ae 100644
--- a/CREDITS
+++ b/CREDITS
@@ -2899,6 +2899,15 @@ S: 6 Karen Drive
 S: Malvern, Pennsylvania 19355
 S: USA
 
+N: Rick van Rein
+E: rick@vanrein.org
+W: http://rick.vanrein.org/
+D: Memory, the BadRAM subsystem dealing with defective RAM modules.
+S: Haarlebrink 5
+S: 7544 WP  Enschede
+S: The Netherlands
+P: 1024D/89754606  CD46 B5F2 E876 A5EE 9A85  1735 1411 A9C2 8975 4606
+
 N: Stefan Reinauer
 E: stepan@linux.de
 W: http://www.freiburg.linux.de/~stepan/
diff --git a/Documentation/BadRAM.txt b/Documentation/BadRAM.txt
new file mode 100644
index 0000000..67a7ccc
--- /dev/null
+++ b/Documentation/BadRAM.txt
@@ -0,0 +1,369 @@
+INFORMATION ON USING BAD RAM MODULES
+====================================
+
+The BadRAM feature enables Linux to run on broken memory.  The
+resulting system will be stable and healthy, because the kernel
+simply never allocates the faulty pages for use.  This is how
+to setup BadRAM if your memory is failing.
+
+
+Introduction
+------------
+
+As RAM memory grows smaller, it also becomes harder to manufacture
+chips that are perfect.  Each single cell that is failing could cause
+an entire memory module to fail.  Even though manufacturers put in
+extra cells to replace failed ones, it is still possible that the
+sensitive small structures get damaged by an electric discharge on
+their pins.  Such damage leads to problems in fixed locations of
+the address space of a memory module, which is what theory predicts
+and has been confirmed by years of experience with bad memory.
+
+It is not necessary for such a memory module to be discarded.  All
+pages of memory behave the same, and if only we skip the failing
+pages we can continue to use the module for many more years.  The
+operating system kernel simply has to avoid using the blocks that
+are damaged.  This is easy to do in the part of the kernel where
+memory pages are allocated.
+
+
+Reasons for using BadRAM
+------------------------
+
+Chip manufacturing process use lots of harsh chemicals, and the less
+of these used, the better.  Being able to make good use of partially
+failed memory chips means that far less of those chemicals are needed
+to provide storage.  This reduces expenses and it is lighter on the
+environment in which we live.
+
+This kernel feature clearly shows that Linux is "the flexible OS".
+If something does not work, fix it.  Also, share it with all the
+others that could use it.  After more than a decennium of BadRAM,
+the response has been purely positive, because it has helped real
+people to solve real problems.
+
+One important use for this feature is with laptops that have their
+memory soldered in.  Such laptops would have to be discarded as a
+whole, but with BadRAM in place they can continue to be used
+without further restrictions.
+
+Finally, running a system on broken memory is just plain cool ;-)
+
+
+Running example
+---------------
+
+To run this project, I was given two DIMMs, 32 MB each. One, that we
+shall use as a running example in this text, contained 512 faulty bits,
+spread over 1/4 of the address range in a regular pattern.  This looks
+a lot like the fault pattern that many others have reported; the only
+common other pattern is a single faulty spot.  With such memory, a few
+tricks with a thorough RAM tester and some binary calculations suffice
+to write these fault patterns down in 2 longword numbers.  The format
+of these is hexadecimal, which is a condensed way of writing down the
+binary patterns that make the hardware patterns recognisable.
+
+After being patched and invoked with the properly formatted description,
+the kernel held back only the memory pages with faults, and never haded
+them out for allocation. The allocation routines could therefore
+progress as normally, without any adaption.  This is important, since
+all the work is done at booting time.  After booting, the kernel does
+not have to do spend any time to implement BadRAM.
+
+As a result of this initial exercise, I gained 30 MB out of the 32 MB
+DIMM that would otherwise have been thrown away.  Of course, these
+numbers scale up with larger memory modules, but the principle is
+the same.
+
+
+The structure of memory failures
+--------------------------------
+
+Memory chips are usually laid out in a roughly equal number of rows
+and columns, making it a square of cells that each store one bit.
+When addressing a bit, the processor sends the row and column in
+separate phases, and then reads or writes its value.  The rows and
+columns are therefore visible on the outside of a chip.
+
+The connections of row and column lines to the outside world are
+usually protected by a buffer.  It can happen that a static
+discharge damages such a buffer, causing an entire row or an
+entire column to fail.  This means that a series of bits become
+unusable in a single page or in a regular pattern of pages,
+depending on whether it was a row or column that got damaged.
+
+For this reason, BadRAM was designed to describe memory faults
+in a pattern of address/mask pairs.  An address locates an
+error and a zero on the corresponding position in the mask
+defines which bits in the address may be replaced with any
+other value.  This has proven to work as a tight description
+of error patterns: it is very compact, but does not waste pages
+that are good.
+
+
+BadRAM's notation for memory faults
+-----------------------------------
+
+Instead of manually providing all 512 errors in the running example
+to the kernel, it's easier to use a pattern notation. Since the
+regularity is based on address decoding software, which generally
+takes certain bits into account and ignores others, we shall
+provide a faulty address F, together with a bit mask M that
+specifies which bits must be equal to F. In C code, an address A
+is faulty if and only if
+
+	(F & M) == (A & M)
+
+or alternately (closer to a hardware implementation):
+
+	!((F ^ A) & M)
+
+In the example 32 MB chip, I had the faulty addresses in 8MB-16MB:
+
+	xxx42f4         ....0100....
+	xxx62f4         ....0110....
+	xxxc2f4         ....1100....
+	xxxe2f4         ....1110....
+
+The second column represents the alternating hex digit in binary form.
+Apparently, the first and next to last binary digit can be anything,
+so the binary mask for that part is 0101. The mask for the part after
+this is 0xfff, and the part before should select anything in the range
+8MB-16MB, or 0x00800000-0x01000000; this is done with a bitmask
+0xff80xxxx. Combining these partial masks, we get:
+
+	F=0x008042f4    M=0xff805fff
+
+That covers every fault in this DIMM; for more complicated failing
+DIMMs, or for a combination of multiple failing DIMMs, it can be
+necessary to set up a number of such F/M pairs.
+
+
+Getting started
+---------------
+
+If you experience RAM trouble, first read Documentation/memory.txt
+and try out the mem=4M trick to see if at least some initial parts
+of your RAM work well.  Note that 4 MB will not be able to hold a
+modern desktop, so if you rely on that you would have to set the
+limit higher (and accept that your sanity check is not as tight as
+possible).
+
+The BadRAM routines halt the kernel in panic if the reserved area
+of memory (containing kernel stuff) contains a faulty address.  It
+will only do that when supplied with the patterns below; this
+initial check is merely to see if this is likely to happen.
+
+
+Running a memory checker
+------------------------
+
+There is no memory checker built into the kernel, to avoid delays
+at runtime or while booting. If you experience problems that may
+be caused by RAM, run a good outside RAM checker.  The Memtest86
+checker is a popular, free, high-quality checker.  Many Linux
+distributions include it as an alternate boot option, so you may
+simply find it in your GRUB boot menu.
+
+The memory checker lists all addresses that have a fault.  It will
+do this for a given configuration of the DIMMs in your motherboard;
+if you replace or move memory modules you may find other addresses.
+In the running example's 32 MB chip, with the DIMM in slot #0 on
+the motherboard, the errors were found in the 8MB-16MB range:
+
+	xxx42f4
+	xxx62f4
+	xxxc2f4
+	xxxe2f4
+
+The error reported was a "sticky 1 bit", a memory bit that always
+reads as "1" even if a "0" was just written to it.  This is
+probably caused by a damaged buffer on one of the rows or columns
+in one of the memory chips.
+
+It would be a lot of work to collect the individual errors and
+condense them into a pattern.  That is why I patched the
+Memtest86 (v2.3+) checker to directly print out the address/mask
+pairs that are used by this kernel feature. All you would do is
+select the BadRAM printout option at the start of the scan, and
+then leave it running for hours and hours, until it has made at
+least one pass.  The patterns are printed each time a bit is
+added, but each line contains all faults found up to that point,
+so you would write down the last set of patterns printed, and
+supply that as a boot option in your next run of a
+BadRAM-capable Linux kernel.
+
+If you use this patch on an x86_64 architecture, your addresses are
+twice as long.  Fill up with zeroes in the address and with f's in
+the mask.  The latter example would thus become:
+
+	mem=24M badram=0x0000000000f00000,0xfffffffffff00000
+
+The patch applies the changes to both x86 and x86_64 code bases
+at the same time.  Patching but not compiling maps the entire
+source tree at once, which makes more sense than splitting the
+patch into an x86 and x86_64 branch, because those two branches
+could not be applied at the same time because they would overlap.
+
+
+Rebooting Linux
+---------------
+
+Once the fault patterns are known we simply restart Linux with
+these F/M pairs as a parameter.  If your normal boot options look
+like
+
+       root=/dev/sda1 ro
+
+you should now boot with options
+
+       root=/dev/sda1 ro badram=0x008042f4,0xff805fff
+
+or perhaps by mentioning more F/M pairs in an order F0,M0,F1,M1,...
+When you provide an odd number of arguments to badram, the default
+mask 0xffffffff (meaning that only one address is matched) is
+applied to the last address.
+
+If your bootloader is GRUB, you can supply this additional
+parameter interactively during boot.  This way, you can try them
+before you edit /boot/grub/menu.lst to put them in forever.
+
+When the kernel now boots, it should not give any trouble with RAM.
+Mind you, this is under the assumption that the kernel and its data
+storage do not overlap an erroneous part. If they do, and the
+kernel does not choke on it right away, BadRAM itself will stop the
+system with a kernel panic.  When the error is that low in memory,
+you will need additional bootloader magic, to load the kernel at an
+alternative address.
+
+Now look up your memory status with
+
+	cat /proc/meminfo |grep HardwareCorrupted
+
+which prints a single line with information like
+
+HardwareCorrupted:  2048 kB
+
+The entry HardwareCorrupted: 2048k represents the loss of 2MB
+of general purpose RAM due to the errors. Or, positively rephrased,
+instead of throwing out 32MB as useless, you only throw out 2MB.
+Note that 2048 kB equals 512 pages of 4kB.  The size of a page is
+defined by the processor architecture.
+
+If the system is stable (which you can test by compiling a few
+kernels, and a few file finds in / or so) you can decide to add
+the boot parameter to /boot/grub/menu.lst, in addition to any
+other boot parameters that may already be there.  For example,
+
+	kernel /boot/vmlinuz root=/dev/sda1 ro
+
+would become
+
+	kernel /boot/vmlinuz root=/dev/sda1 ro badram=0x008042f4,0xff805fff
+
+Depending on how helpful your Linux distribution is, you may
+have to add this feature again after upgrading your kernel.  If
+your boot loader is GRUB, you can always do this manually if you
+rebooted before you remembered to make that adaption.
+
+
+BadRAM classification
+---------------------
+
+This technique might start a lively market for "dead" RAM. It is
+important to realise that some RAMs are more dead than others. So,
+instead of just providing a RAM size, it is also important to know
+the BadRAM class, which is defined as follows:
+
+	A BadRAM class N means that at most 2^N bytes have a problem,
+	and that all problems with the RAMs are persistent: They
+	are predictable and always show up.
+
+The DIMM that serves as an example here was of class 9, since 512=2^9
+errors were found. Higher classes are worse, "correct" RAM is of class
+-1 (or even less, at your choice).
+Class N also means that the bitmask for your chip (if there's just one,
+that is) counts N bits "0" and it means that (if no faults fall in the
+same page) an amount of 2^N*PAGESIZE memory is lost, in the example on
+an x86 architecture that would be 2^9*4k=2MB, which accounts for the
+initial claim of 30MB RAM gained with this DIMM.
+
+Note that this scheme has deliberately been defined to be independent
+of memory technology and of computer architecture.
+
+
+Further Possibilities
+---------------------
+
+**Slab allocation support**
+
+It would be possible to use even more of the faulty RAMs by employing
+them for slabs. The smaller allocation granularity of slabs makes it
+possible to throw out just, say, 32 bytes surrounding an error. This
+would mean that the example DIMM only caused a loss of 16kB instead
+of 2MB, or scaled-up similar values for larger memory sizes.  One
+specific area that could benefit from this is the growing market
+for embedded devices, which usually wants to meet tight budgets.
+
+It should be possible to make the slab allocator prefer pages with
+broken memory, and allocate the faulty places in memory before the
+other slabs are made available to the kernel.  In the best possible
+situation, this could reduce the loss of good RAM cells to zero!
+
+**Support for low-memory errors**
+
+To the best of my knowledge, boot loaders like GRUB cannot load
+the Linux kernel in non-standard locations.  This means that any
+errors at low memory locations cannot be overcome with BadRAM.
+
+Anything that physically alters the memory layout can be used
+to overcome such problems; this may be achieved through BIOS
+settings, or by adding or swapping memory modules.
+
+A general solution could be to use a boot loader that can load
+the Linux kernel (and its initial memory allocation) at other
+memory addresses than are standard.
+
+
+**Boot-time memory checking**
+
+Many suggestions have been made to insert a RAM checker at boot time;
+since this would leave the time to do only very meager checking, it
+is not a reasonable option; we already have a half-done BIOS check
+doing that!
+
+**ECC RAM integration**
+
+It would be interesting to integrate this functionality with the
+self-verifying nature of ECC RAM. These memories can even distinguish
+between recoverable and unrecoverable errors! Such memory has been
+handled in older operating systems by `testing' once-failed memory
+blocks for a while, by placing only (reloadable) program code in it.
+
+I possess no faulty ECC modules to work this out, and there is no
+general use for it either.
+
+
+Names and Places
+----------------
+
+The home page of this project is on
+	http://rick.vanrein.org/linux/badram
+This page also links to Nico Schmoigl's experimental extensions to
+this patch (with debugging and a few other fancy things).
+
+In case you have experiences with the BadRAM software which differ from
+the test reportings on that site, I hope you will mail me with that
+new information.
+
+The BadRAM project is an idea and implementation by
+	Rick van Rein
+	Haarlebrink 5
+	7544 WP  Enschede
+	The Netherlands
+	rick@vanrein.org
+If you like it, a postcard would be much appreciated ;-)
+
+
+							Enjoy,
+							 -Rick.
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index f4a04c0..84f9ef5 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -373,6 +373,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 
 	autotest	[IA64]
 
+	badram=		When CONFIG_MEMORY_FAILURE is set, this parameter
+			allows memory areas to be flagged as hwpoison.
+			Format: <addr>,<mask>[,...]
+			See Documentation/BadRAM.txt
+
 	baycom_epp=	[HW,AX25]
 			Format: <io>,<mode>
 
-- 
1.7.4


* Re: [RFC PATCH 1/3] Add string parsing function get_next_ulong
  2011-04-27 16:16 ` [RFC PATCH 1/3] Add string parsing function get_next_ulong Stefan Assmann
@ 2011-04-27 16:28   ` Randy Dunlap
  0 siblings, 0 replies; 13+ messages in thread
From: Randy Dunlap @ 2011-04-27 16:28 UTC (permalink / raw)
  To: Stefan Assmann
  Cc: linux-mm, tony.luck, andi, mingo, hpa, rick, akpm, lwoodman, riel

On Wed, 27 Apr 2011 18:16:45 +0200 Stefan Assmann wrote:

> Adding this function to allow easy parsing of unsigned long values from the
> beginning of strings. Convenience function to parse pointers from the kernel
> command line.
> 
> Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
> ---
>  include/linux/kernel.h |    1 +
>  lib/cmdline.c          |   35 +++++++++++++++++++++++++++++++++++
>  2 files changed, 36 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index 2fe6e84..b6ded39 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -218,6 +218,7 @@ extern int vsscanf(const char *, const char *, va_list)
>  
>  extern int get_option(char **str, int *pint);
>  extern char *get_options(const char *str, int nints, int *ints);
> +extern int get_next_ulong(char **str, unsigned long *val, char sep, int base);
>  extern unsigned long long memparse(const char *ptr, char **retptr);
>  
>  extern int core_kernel_text(unsigned long addr);
> diff --git a/lib/cmdline.c b/lib/cmdline.c
> index f5f3ad8..82a6616 100644
> --- a/lib/cmdline.c
> +++ b/lib/cmdline.c
> @@ -114,6 +114,41 @@ char *get_options(const char *str, int nints, int *ints)
>  }
>  
>  /**
> + *	get_next_ulong - Parse unsigned long at the beginning of a string
> + *	@strp: (output) String to be parsed

                ^ input/output

> + *	@val: (output) unsigned long carrying the result
> + *	@sep: character specifying the separator
> + *	@base: number system of the parsed value
> + *
> + *	This function parses an unsigned long value at the beginning of a
> + *	string. The string may begin with a separator or an unsigned long
> + *	value.
> + *	After the function is run val will contain the parsed value and strp

                                  @val                                  @strp

> + *	will point to the character *after* the parsed unsigned long.
> + *
> + *	In the error case 0 is returned, val and *strp stay unaltered.

                                         @val and @strp

> + *	Otherwise return 1.
> + */
> +int get_next_ulong(char **strp, unsigned long *val, char sep, int base)
> +{
> +	char *tmp;
> +
> +	if (!strp || !(*strp))
> +		return 0;
> +
> +	tmp = *strp;
> +	if (*tmp == sep)
> +		tmp++;
> +
> +	*val = simple_strtoul(tmp, strp, base);
> +
> +	if (tmp == *strp)
> +		return 0; /* no new value parsed */
> +	else
> +		return 1;
> +}
> +
> +/**
>   *	memparse - parse a string with mem suffixes into a number
>   *	@ptr: Where parse begins
>   *	@retptr: (output) Optional pointer to next char after parse completes
> -- 


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***


* Re: [RFC PATCH 3/3] Add documentation and credits for BadRAM
  2011-04-27 16:16 ` [RFC PATCH 3/3] Add documentation and credits for BadRAM Stefan Assmann
@ 2011-04-27 16:49   ` Randy Dunlap
  2011-04-27 20:05     ` Stefan Assmann
  0 siblings, 1 reply; 13+ messages in thread
From: Randy Dunlap @ 2011-04-27 16:49 UTC (permalink / raw)
  To: Stefan Assmann
  Cc: linux-mm, tony.luck, andi, mingo, hpa, rick, akpm, lwoodman, riel

On Wed, 27 Apr 2011 18:16:47 +0200 Stefan Assmann wrote:

> Add Documentation/BadRAM.txt for in-depth information and update
> Documentation/kernel-parameters.txt.
> 
> Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
> ---
>  CREDITS                             |    9 +
>  Documentation/BadRAM.txt            |  369 +++++++++++++++++++++++++++++++++++
>  Documentation/kernel-parameters.txt |    5 +
>  3 files changed, 383 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/BadRAM.txt

> diff --git a/Documentation/BadRAM.txt b/Documentation/BadRAM.txt
> new file mode 100644
> index 0000000..67a7ccc
> --- /dev/null
> +++ b/Documentation/BadRAM.txt
> @@ -0,0 +1,369 @@

> +Reasons for using BadRAM
> +------------------------
> +
> +Chip manufacturing process use lots of harsh chemicals, and the less

                      processes

> +of these used, the better.  Being able to make good use of partially
> +failed memory chips means that far less of those chemicals are needed
> +to provide storage.  This reduces expenses and it is lighter on the
> +environment in which we live.
> +
...

> +
> +
> +Running example
> +---------------
> +
...
> +
> +After being patched and invoked with the properly formatted description,
> +the kernel held back only the memory pages with faults, and never haded

                                                                     handed

> +them out for allocation. The allocation routines could therefore
> +progress as normally, without any adaption.  This is important, since
> +all the work is done at booting time.  After booting, the kernel does
> +not have to spend any time to implement BadRAM.
> +
> +As a result of this initial exercise, I gained 30 MB out of the 32 MB
> +DIMM that would otherwise have been thrown away.  Of course, these
> +numbers scale up with larger memory modules, but the principle is
> +the same.
> +
> +

> +BadRAM's notation for memory faults
> +-----------------------------------
> +
> +Instead of manually providing all 512 errors in the running example
> +to the kernel, it's easier to use a pattern notation. Since the
> +regularity is based on address decoding software, which generally
> +takes certain bits into account and ignores others, we shall
> +provide a faulty address F, together with a bit mask M that
> +specifies which bits must be equal to F. In C code, an address A
> +is faulty if and only if
> +
> +	(F & M) == (A & M)
> +
> +or alternately (closer to a hardware implementation):
> +
> +	~((F ^ A) & M)
> +
> +In the example 32 MB chip, I had the faulty addresses in 8MB-16MB:
> +
> +	xxx42f4         ....0100....
> +	xxx62f4         ....0110....
> +	xxxc2f4         ....1100....
> +	xxxe2f4         ....1110....
> +
> +The second column represents the alternating hex digit in binary form.
> +Apperantly, the first and next to last binary digit can be anything,

   Apparently,

> +so the binary mask for that part is 0101. The mask for the part after
> +this is 0xfff, and the part before should select anything in the range
> +8MB-16MB, or 0x00800000-0x01000000; this is done with a bitmask
> +0xff80xxxx. Combining these partial masks, we get:
> +
> +	F=0x008042f4    M=0xff805fff
> +
> +That covers every fault in this DIMM; for more complicated failing
> +DIMMs, or for a combination of multiple failing DIMMs, it can be
> +necessary to set up a number of such F/M pairs.
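
The matching rule quoted above can be sketched in userspace C as follows. This is only an illustration, not code from the patch; the helper name badram_match and the 32-bit address type are assumptions. Note that as a C predicate the second form needs logical negation (`!`, or a comparison with 0) rather than bitwise `~`, because `~x` is nonzero unless every bit of x is set.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if address a is covered by the pair (f, m).
 * Only the bits selected by the mask m have to agree with f.
 */
static bool badram_match(uint32_t a, uint32_t f, uint32_t m)
{
	/* Bits where f and a differ, restricted to the bits m cares about. */
	return ((f ^ a) & m) == 0;
}
```

With F=0x008042f4 and M=0xff805fff, all four example addresses in the 8MB-16MB range match, while an address such as 0x008052f4 (which differs in a bit the mask selects) does not.
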
> +
> +
> +Running a memory checker
> +------------------------
> +
> +There is no memory checker built into the kernel, to avoid delays
> +at runtime or while booting. If you experience problems that may
> +be caused by RAM, run a good outside RAM checker.  The Memtest86
> +checker is a popular, free, high-quality checker.  Many Linux
> +distributions include it as an alternate boot option, so you may
> +simply find it in your GRUB boot menu.

                          boot loader's boot menu.

> +
> +The memory checker lists all addresses that have a fault.  It will
> +do this for a given configuration of the DIMMs in your motherboard;
> +if you replace or move memory modules you may find other addresses.
> +In the running example's 32 MB chip, with the DIMM in slot #0 on
> +the motherboard, the errors were found in the 8MB-16MB range:
> +
> +	xxx42f4
> +	xxx62f4
> +	xxxc2f4
> +	xxxe2f4
> +
> +The error reported was a "sticky 1 bit", a memory bit that always
> +reads as "1" even if a "0" was just written to it.  This is
> +probably caused by a damaged buffer on one of the rows or columns
> +in one of the memory chips.
> +
...
> +
> +Rebooting Linux
> +---------------
> +
> +Once the fault patterns are known we simply restart Linux with
> +these F/M pairs as a parameter.  If your normal boot options look
> +like
> +
> +       root=/dev/sda1 ro
> +
> +you should now boot with options
> +
> +       root=/dev/sda1 ro badram=0x008042f4,0xff805fff
> +
> +or perhaps by mentioning more F/M pairs in an order F0,M0,F1,M1,...
> +When you provide an odd number of arguments to badram, the default
> +mask 0xffffffff (meaning that only one address is matched) is
> +applied to the last address.
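
The F0,M0,F1,M1,... format and the default mask for an odd argument count can be illustrated with a small userspace parser. This is a sketch under assumptions, not the patch's parsing code: the names struct badram_pair and parse_badram are hypothetical, values are assumed to be hexadecimal, and strtoul() stands in for the kernel's string helpers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container for one fault address / mask pair. */
struct badram_pair {
	unsigned long addr;
	unsigned long mask;
};

/*
 * Parse "F0,M0,F1,M1,..." into out[]. Returns the number of pairs stored
 * (at most max), or -1 if the string is malformed. An address without a
 * mask gets the default mask 0xffffffff.
 */
static int parse_badram(const char *s, struct badram_pair *out, int max)
{
	int n = 0;

	assert(s != NULL && out != NULL);

	while (*s && n < max) {
		char *end;

		out[n].addr = strtoul(s, &end, 16);
		if (end == s)
			return -1;	/* no digits where an address was expected */
		out[n].mask = 0xffffffffUL;	/* default: match one address only */
		s = end;

		if (*s == ',') {
			s++;
			out[n].mask = strtoul(s, &end, 16);
			if (end == s)
				return -1;	/* comma not followed by a mask */
			s = end;
		}
		n++;

		if (*s == ',')
			s++;		/* another pair follows */
		else if (*s)
			return -1;	/* unexpected character */
	}
	return n;
}
```

Given "0x1000,0xfffff000,0x2000", this stores two pairs, the second being 0x2000 with the default mask 0xffffffff.
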
> +
> +If your bootloader is GRUB, you can supply this additional
> +parameter interactively during boot.  This way, you can try them
> +before you edit /boot/grub/menu.lst to put them in forever.

I thought that /boot/grub/grub.conf was the current file name. (?)

> +
> +When the kernel now boots, it should not give any trouble with RAM.
> +Mind you, this is under the assumption that the kernel and its data
> +storage do not overlap an erroneous part. If they do, and the
> +kernel does not choke on it right away, BadRAM itself will stop the
> +system with a kernel panic.  When the error is that low in memory,
> +you will need additional bootloader magic, to load the kernel at an
> +alternative address.
> +
> +Now look up your memory status with
> +
> +	cat /proc/meminfo |grep HardwareCorrupted
> +
> +which prints a single line with information like
> +
> +HardwareCorrupted:  2048 kB
> +
> +The entry HardwareCorrupted: 2048k represents the loss of 2MB
> +of general purpose RAM due to the errors. Or, positively rephrased,
> +instead of throwing out 32MB as useless, you only throw out 2MB.
> +Note that 2048 kB equals 512 pages of 4kB.  The size of a page is
> +defined by the processor architecture.
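
The figure of 512 pages follows from the pattern itself: the mask 0xff805fff leaves nine address bits free, all of them above the 4kB page offset. A rough userspace sketch that enumerates the matching addresses and counts distinct pages is shown below. The helper names badram_next_addr and badram_count_pages are hypothetical, 32-bit addresses are assumed, and this is not the next_masked_address() implementation from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Advance *addr to the next address matching the same pattern.
 * Returns false once the free bits have wrapped around.
 */
static bool badram_next_addr(uint32_t *addr, uint32_t mask)
{
	uint32_t wild = ~mask;	/* bits the pattern leaves free */
	/* Setting all fixed bits makes the +1 carry through them. */
	uint32_t v = (uint32_t)((*addr | mask) + 1u) & wild;

	if (v == 0)
		return false;	/* enumeration done */
	*addr = (*addr & mask) | v;
	return true;
}

/* Count the distinct pages covered by the pair (fault, mask). */
static unsigned long badram_count_pages(uint32_t fault, uint32_t mask,
					unsigned int page_shift)
{
	uint32_t addr = fault & mask;	/* lowest matching address */
	uint32_t last_pfn;
	unsigned long pages = 1;

	assert(page_shift < 32);
	last_pfn = addr >> page_shift;

	/* Addresses come out in increasing order, so pages never repeat. */
	while (badram_next_addr(&addr, mask)) {
		if ((addr >> page_shift) != last_pfn) {
			last_pfn = addr >> page_shift;
			pages++;
		}
	}
	return pages;
}
```

For F=0x008042f4, M=0xff805fff and a page shift of 12, the count is 512 pages, which is the 2048 kB reported in HardwareCorrupted.
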
> +
> +If the system is stable (which you can test by compiling a few
> +kernels, and a few file finds in / or so) you can decide to add
> +the boot parameter to /boot/grub/menu.lst, in addition to any

file name?

> +other boot parameters that may already be there.  For example,
> +
> +	kernel /boot/vmlinuz root=/dev/sda1 ro
> +
> +would become
> +
> +	kernel /boot/vmlinuz root=/dev/sda1 ro badram=0x008042f4,0xff805fff
> +
> +Depending on how helpful your Linux distribution is, you may
> +have to add this feature again after upgrading your kernel.  If
> +your boot loader is GRUB, you can always do this manually if you
> +rebooted before you remembered to make that adaption.
> +
> +
...


> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index f4a04c0..84f9ef5 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -373,6 +373,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  
>  	autotest	[IA64]
>  
> +	badram=		When CONFIG_MEMORY_FAILURE is set, this parameter
> +			allows memory areas to be flagged as hwpoison.

hwpoison??  undefined.

> +			Format: <addr>,<mask>[,...]
> +			See Documentation/BadRAM.txt
> +


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***


* Re: [RFC PATCH 3/3] Add documentation and credits for BadRAM
  2011-04-27 16:49   ` Randy Dunlap
@ 2011-04-27 20:05     ` Stefan Assmann
  0 siblings, 0 replies; 13+ messages in thread
From: Stefan Assmann @ 2011-04-27 20:05 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: linux-mm, tony.luck, andi, mingo, hpa, rick, akpm, lwoodman, riel

On 27.04.2011 18:49, Randy Dunlap wrote:
> On Wed, 27 Apr 2011 18:16:47 +0200 Stefan Assmann wrote:
> 
>> Add Documentation/BadRAM.txt for in-depth information and update
>> Documentation/kernel-parameters.txt.
>>
>> Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
>> ---
>>  CREDITS                             |    9 +
>>  Documentation/BadRAM.txt            |  369 +++++++++++++++++++++++++++++++++++
>>  Documentation/kernel-parameters.txt |    5 +
>>  3 files changed, 383 insertions(+), 0 deletions(-)
>>  create mode 100644 Documentation/BadRAM.txt
> 
>> diff --git a/Documentation/BadRAM.txt b/Documentation/BadRAM.txt
>> new file mode 100644
>> index 0000000..67a7ccc
>> --- /dev/null
>> +++ b/Documentation/BadRAM.txt
>> @@ -0,0 +1,369 @@
> 

[snip]

Spelling errors will be fixed in next version. Thanks!

> I thought that /boot/grub/grub.conf was the current file name. (?)

Not sure about that: some distros use menu.lst, others grub.conf for
GRUB. GRUB 2 also uses /boot/grub/grub.cfg. Any of these would be
fine with me; /boot/grub/menu.lst is sometimes a symlink to
/boot/grub/grub.conf, and I felt it was the most convenient one, but I
have no strong preference here.

> 
>> +
>> +When the kernel now boots, it should not give any trouble with RAM.
>> +Mind you, this is under the assumption that the kernel and its data
>> +storage do not overlap an erroneous part. If they do, and the
>> +kernel does not choke on it right away, BadRAM itself will stop the
>> +system with a kernel panic.  When the error is that low in memory,
>> +you will need additional bootloader magic, to load the kernel at an
>> +alternative address.
>> +
>> +Now look up your memory status with
>> +
>> +	cat /proc/meminfo |grep HardwareCorrupted
>> +
>> +which prints a single line with information like
>> +
>> +HardwareCorrupted:  2048 kB
>> +
>> +The entry HardwareCorrupted: 2048k represents the loss of 2MB
>> +of general purpose RAM due to the errors. Or, positively rephrased,
>> +instead of throwing out 32MB as useless, you only throw out 2MB.
>> +Note that 2048 kB equals 512 pages of 4kB.  The size of a page is
>> +defined by the processor architecture.
>> +
>> +If the system is stable (which you can test by compiling a few
>> +kernels, and a few file finds in / or so) you can decide to add
>> +the boot parameter to /boot/grub/menu.lst, in addition to any
> 
> file name?

See above comment.

> 
>> +other boot parameters that may already be there.  For example,
>> +
>> +	kernel /boot/vmlinuz root=/dev/sda1 ro
>> +
>> +would become
>> +
>> +	kernel /boot/vmlinuz root=/dev/sda1 ro badram=0x008042f4,0xff805fff
>> +
>> +Depending on how helpful your Linux distribution is, you may
>> +have to add this feature again after upgrading your kernel.  If
>> +your boot loader is GRUB, you can always do this manually if you
>> +rebooted before you remembered to make that adaption.
>> +
>> +
> ...
> 
> 
>> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
>> index f4a04c0..84f9ef5 100644
>> --- a/Documentation/kernel-parameters.txt
>> +++ b/Documentation/kernel-parameters.txt
>> @@ -373,6 +373,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>>  
>>  	autotest	[IA64]
>>  
>> +	badram=		When CONFIG_MEMORY_FAILURE is set, this parameter
>> +			allows memory areas to be flagged as hwpoison.
> 
> hwpoison??  undefined.

BadRAM depends on hwpoison being available. The code is located in
mm/memory-failure.c, and that file is only compiled if
CONFIG_MEMORY_FAILURE is defined:

	$ grep CONFIG_MEMORY_FAILURE mm/Makefile
	obj-$(CONFIG_MEMORY_FAILURE) += memory-failure.o

So, to answer your question: if hwpoison is not available, BadRAM won't
be available either.

> 
>> +			Format: <addr>,<mask>[,...]
>> +			See Documentation/BadRAM.txt
>> +
> 
> 
> ---
> ~Randy
> *** Remember to use Documentation/SubmitChecklist when testing your code ***

Thanks for the review Randy!

  Stefan


* Re: [RFC PATCH 2/3] support for broken memory modules (BadRAM)
  2011-04-27 16:16 ` [RFC PATCH 2/3] support for broken memory modules (BadRAM) Stefan Assmann
@ 2011-04-27 21:12   ` Andi Kleen
  2011-04-28  6:34     ` Stefan Assmann
  0 siblings, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2011-04-27 21:12 UTC (permalink / raw)
  To: Stefan Assmann
  Cc: linux-mm, tony.luck, andi, mingo, hpa, rick, akpm, lwoodman, riel

On Wed, Apr 27, 2011 at 06:16:46PM +0200, Stefan Assmann wrote:
> BadRAM is a mechanism to exclude memory addresses (pages) from being used by
> the system. The addresses are given to the kernel via kernel command line.
> This is useful for systems with defective RAM modules, especially if the RAM
> modules cannot be replaced.
> 
> command line parameter: badram=<addr>,<mask>[,...]
> 
> Patterns for the command line parameter can be obtained by running Memtest86.
> In Memtest86 press "c" for configuration, select "Error Report Mode" and
> finally "BadRAM Patterns"
> 
> This has already been done by Rick van Rein a long time ago but it never found
> its way into the kernel.

Looks good to me, except for the too verbose printks. Logging
every page this way will be very noisy for larger areas.

The mask will also only work for very simple memory interleaving
setups, so I suspect it won't work for a lot of modern systems
unless you go more fancy.

Longer term, there should likely also be a better way to specify
these pages than the kernel command line, e.g. the new persistent
store on some systems.

-Andi


* Re: [RFC PATCH 2/3] support for broken memory modules (BadRAM)
  2011-04-27 21:12   ` Andi Kleen
@ 2011-04-28  6:34     ` Stefan Assmann
  2011-04-28 15:08       ` Andi Kleen
  0 siblings, 1 reply; 13+ messages in thread
From: Stefan Assmann @ 2011-04-28  6:34 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-mm, tony.luck, mingo, hpa, rick, akpm, lwoodman, riel

On 27.04.2011 23:12, Andi Kleen wrote:
> On Wed, Apr 27, 2011 at 06:16:46PM +0200, Stefan Assmann wrote:
>> BadRAM is a mechanism to exclude memory addresses (pages) from being used by
>> the system. The addresses are given to the kernel via kernel command line.
>> This is useful for systems with defective RAM modules, especially if the RAM
>> modules cannot be replaced.
>>
>> command line parameter: badram=<addr>,<mask>[,...]
>>
>> Patterns for the command line parameter can be obtained by running Memtest86.
>> In Memtest86 press "c" for configuration, select "Error Report Mode" and
>> finally "BadRAM Patterns"
>>
>> This has already been done by Rick van Rein a long time ago but it never found
>> its way into the kernel.
> 
> Looks good to me, except for the too verbose printks. Logging
> every page this way will be very noisy for larger areas.

You're right, logging every page marked would be too verbose. That's why
I wrapped that logging into pr_debug.
http://www.kernel.org/doc/local/pr_debug.txt
This way it shouldn't bother anybody but it still could be useful in the
case of debugging.
However I kept the printk in the case of early allocated pages. The user
should be notified of the attempt to mark a page that's already been
allocated by the kernel itself.

> 
> The mask will also only work for very simple memory interleaving
> setups, so I suspect it won't work for a lot of modern systems
> unless you go more fancy.
> 
> Longer term, there should likely also be a better way to specify
> these pages than the kernel command line, e.g. the new persistent
> store on some systems.

I'd be happy to help improve and refine things for more fancy
scenarios after this is done.

Thanks for the feedback Andi.

  Stefan


* Re: [RFC PATCH 2/3] support for broken memory modules (BadRAM)
  2011-04-28  6:34     ` Stefan Assmann
@ 2011-04-28 15:08       ` Andi Kleen
  2011-04-28 15:51         ` Stefan Assmann
  0 siblings, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2011-04-28 15:08 UTC (permalink / raw)
  To: Stefan Assmann
  Cc: Andi Kleen, linux-mm, tony.luck, mingo, hpa, rick, akpm, lwoodman, riel

> You're right, logging every page marked would be too verbose. That's why
> I wrapped that logging into pr_debug.

pr_debug still floods the kernel log buffer. On large systems
it often already overflows.

> However I kept the printk in the case of early allocated pages. The user
> should be notified of the attempt to mark a page that's already been
> allocated by the kernel itself.

That's ok, although if you're unlucky (e.g. hit a large mem_map area)
it can also be very noisy.

It would be better if you fixed the printks to output ranges.

-Andi


* Re: [RFC PATCH 2/3] support for broken memory modules (BadRAM)
  2011-04-28 15:08       ` Andi Kleen
@ 2011-04-28 15:51         ` Stefan Assmann
  2011-04-28 17:44           ` Luck, Tony
  2011-04-29  9:14           ` Stefan Assmann
  0 siblings, 2 replies; 13+ messages in thread
From: Stefan Assmann @ 2011-04-28 15:51 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-mm, tony.luck, mingo, hpa, rick, akpm, lwoodman, riel

On 04/28/2011 05:08 PM, Andi Kleen wrote:
>> You're right, logging every page marked would be too verbose. That's why
>> I wrapped that logging into pr_debug.
>
> pr_debug still floods the kernel log buffer. On large systems
> it often already overflows.

That's a pain then, I understand.

>
>> However I kept the printk in the case of early allocated pages. The user
>> should be notified of the attempt to mark a page that's already been
>> allocated by the kernel itself.
>
> That's ok, although if you're unlucky (e.g. hit a large mem_map area)
> it can also be very noisy.
>
> It would be better if you fixed the printks to output ranges.

BadRAM patterns might often mark non-consecutive pages so outputting
ranges could be more verbose than what we have now. I'll try to think
of something to minimize log output.

   Stefan


* RE: [RFC PATCH 2/3] support for broken memory modules (BadRAM)
  2011-04-28 15:51         ` Stefan Assmann
@ 2011-04-28 17:44           ` Luck, Tony
  2011-04-29  9:14           ` Stefan Assmann
  1 sibling, 0 replies; 13+ messages in thread
From: Luck, Tony @ 2011-04-28 17:44 UTC (permalink / raw)
  To: Stefan Assmann, Andi Kleen
  Cc: linux-mm, mingo, hpa, rick, akpm, lwoodman, riel

> BadRAM patterns might often mark non-consecutive pages so outputting
> ranges could be more verbose than what we have now. I'll try to think
> of something to minimize log output.

How about printing the pattern together with a count of pages affected:

badram: addr=foo mask=bar (1024 pages = 4MB marked unusable)
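
A one-line summary of this kind could be produced along the following lines. This is a userspace sketch only: the function name badram_format_summary and its parameters are hypothetical, and snprintf() stands in for the kernel's printk().

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format one summary line per pattern instead of one line per page.
 * page_kb is the page size in kB. Returns the length snprintf() reports.
 */
static int badram_format_summary(char *buf, size_t len, unsigned long addr,
				 unsigned long mask, unsigned long pages,
				 unsigned long page_kb)
{
	unsigned long kb = pages * page_kb;

	assert(buf != NULL);

	/* Report whole megabytes where possible, kilobytes otherwise. */
	if (kb % 1024 == 0)
		return snprintf(buf, len,
				"badram: addr=0x%08lx mask=0x%08lx "
				"(%lu pages = %luMB marked unusable)",
				addr, mask, pages, kb / 1024);

	return snprintf(buf, len,
			"badram: addr=0x%08lx mask=0x%08lx "
			"(%lu pages = %lukB marked unusable)",
			addr, mask, pages, kb);
}
```

Called with 1024 pages of 4kB, the text in parentheses reads "1024 pages = 4MB marked unusable", so the log volume stays constant per pattern regardless of how many pages it covers.
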

-Tony


* Re: [RFC PATCH 2/3] support for broken memory modules (BadRAM)
  2011-04-28 15:51         ` Stefan Assmann
  2011-04-28 17:44           ` Luck, Tony
@ 2011-04-29  9:14           ` Stefan Assmann
  1 sibling, 0 replies; 13+ messages in thread
From: Stefan Assmann @ 2011-04-29  9:14 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-mm, tony.luck, mingo, hpa, rick, akpm, lwoodman, riel

On 28.04.2011 17:51, Stefan Assmann wrote:
> On 04/28/2011 05:08 PM, Andi Kleen wrote:
>>> You're right, logging every page marked would be too verbose. That's why
>>> I wrapped that logging into pr_debug.
>>
>> pr_debug still floods the kernel log buffer. On large systems
>> it often already overflows.
> 
> That's a pain then, I understand.

I took a closer look at pr_debug and it seems that pr_debug gets
evaluated to a conditional branch and thus does not flood the log buffer
if not explicitly enabled. I confirmed that by dumping the log buffer.
So in the normal use-case things should be fine and if pr_debug really
is enabled it dumps a lot of data, which I hope is acceptable for
debugging purposes.

> 
>>
>>> However I kept the printk in the case of early allocated pages. The user
>>> should be notified of the attempt to mark a page that's already been
>>> allocated by the kernel itself.
>>
>> That's ok, although if you're unlucky (e.g. hit a large mem_map area)
>> it can also be very noisy.
>>
>> It would be better if you fixed the printks to output ranges.
> 
> BadRAM patterns might often mark non-consecutive pages so outputting
> ranges could be more verbose than what we have now. I'll try to think
> of something to minimize log output.

How about the following:
static int __init badram_mark_pages(unsigned long addr, unsigned long mask)
{
	unsigned long pagecount = 0, is_reserved = 0;
[...]
	printk(KERN_INFO "BadRAM: mark 0x%lx with mask 0x%0lx\n", addr, mask);

	do {
[...]
		if (memblock_is_reserved(addr)) {
			pr_debug("BadRAM: page %lu reserved by kernel\n", pfn);
			is_reserved++;
			continue;
		}
[...]
		pr_debug("BadRAM: page %lu (addr 0x%0lx) marked bad "
			 "[total %lu]\n", pfn, addr, pagecount);
	} while (next_masked_address(&addr, mask));

	if (is_reserved)
		printk(KERN_WARNING "BadRAM: %lu page(s) already reserved and "
		       "could not be marked bad\n", is_reserved);

	return pagecount;
}

This way everything with possibly high volume log output is guarded by
pr_debug and only the summary gets printed by default. No log_buf
cluttering but also a bit harder to debug for somebody who's interested
in finding out which pages are already reserved.

  Stefan


Thread overview: 13+ messages
2011-04-27 16:16 [RFC PATCH 0/3] support for broken memory modules (BadRAM) Stefan Assmann
2011-04-27 16:16 ` [RFC PATCH 1/3] Add string parsing function get_next_ulong Stefan Assmann
2011-04-27 16:28   ` Randy Dunlap
2011-04-27 16:16 ` [RFC PATCH 2/3] support for broken memory modules (BadRAM) Stefan Assmann
2011-04-27 21:12   ` Andi Kleen
2011-04-28  6:34     ` Stefan Assmann
2011-04-28 15:08       ` Andi Kleen
2011-04-28 15:51         ` Stefan Assmann
2011-04-28 17:44           ` Luck, Tony
2011-04-29  9:14           ` Stefan Assmann
2011-04-27 16:16 ` [RFC PATCH 3/3] Add documentation and credits for BadRAM Stefan Assmann
2011-04-27 16:49   ` Randy Dunlap
2011-04-27 20:05     ` Stefan Assmann
