All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-13  5:36 ` Avery Pennarun
  0 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  5:36 UTC (permalink / raw)
  To: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Avery Pennarun, Johannes Weiner, Olaf Hering, Paul Gortmaker,
	Tejun Heo, H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
that, when enabled, puts the printk buffer in a well-defined memory location
so that we can keep appending to it after a reboot.  The upshot is that,
even after a kernel panic or non-panic hard lockup, on the next boot
userspace will be able to grab the kernel messages leading up to it.  It
could then upload the messages to a server (for example) to keep crash
statistics.

The preceding patches in the series are mostly just things I fixed up while
working on that patch.

Some notes:

- I'm not totally sure of the locking or portability issues when calling
  memblock or bootmem.  This all happens really early, and I *think*
  interrupts are still disabled at that time, so it's probably okay.

- Tested this version on x86 (kvm) and it works with soft reboot (ie. reboot
  -f).  Since some BIOSes wipe the memory during boot, you might not have
  any luck.  It should be great on many embedded systems, though, including
  the MIPS system I've tested a variant of this patch on.  (Our MIPS build
  is based on a slightly older kernel so it's not 100% the same, but I think
  this should behave identically.)
  
- The way we choose a well-defined memory location is slightly suspicious
  (we just count down from the top of the address space) but I've tested it
  pretty carefully, and it seems to be okay.

- In printk.c with CONFIG_PRINTK_PERSIST set, we're #defining words like
  log_end.  It might be cleaner to replace all instances of log_end with
  LOG_END to make this more clear.  This is also the reason the struct
  logbits members start with _: because otherwise they conflict with the
  macro.  Suggestions welcome.

Patch generated against v3.3-rc7.

Avery Pennarun (5):
  mm: bootmem: BUG() if you try to allocate bootmem too late.
  mm: bootmem: it's okay to reserve_bootmem an invalid address.
  mm: nobootmem: implement reserve_bootmem() in terms of memblock.
  printk: use alloc_bootmem() instead of memblock_alloc().
  printk: CONFIG_PRINTK_PERSIST: persist printk buffer across reboots.

 init/Kconfig    |   12 ++++++
 kernel/printk.c |  117 +++++++++++++++++++++++++++++++++++++++++++++---------
 mm/bootmem.c    |   10 ++++-
 mm/nobootmem.c  |   23 +++++++++++
 4 files changed, 140 insertions(+), 22 deletions(-)

-- 
1.7.7.3


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-13  5:36 ` Avery Pennarun
  0 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  5:36 UTC (permalink / raw)
  To: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Avery Pennarun, Johannes Weiner, Olaf Hering, Paul Gortmaker,
	Tejun Heo, H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
that, when enabled, puts the printk buffer in a well-defined memory location
so that we can keep appending to it after a reboot.  The upshot is that,
even after a kernel panic or non-panic hard lockup, on the next boot
userspace will be able to grab the kernel messages leading up to it.  It
could then upload the messages to a server (for example) to keep crash
statistics.

The preceding patches in the series are mostly just things I fixed up while
working on that patch.

Some notes:

- I'm not totally sure of the locking or portability issues when calling
  memblock or bootmem.  This all happens really early, and I *think*
  interrupts are still disabled at that time, so it's probably okay.

- Tested this version on x86 (kvm) and it works with soft reboot (ie. reboot
  -f).  Since some BIOSes wipe the memory during boot, you might not have
  any luck.  It should be great on many embedded systems, though, including
  the MIPS system I've tested a variant of this patch on.  (Our MIPS build
  is based on a slightly older kernel so it's not 100% the same, but I think
  this should behave identically.)
  
- The way we choose a well-defined memory location is slightly suspicious
  (we just count down from the top of the address space) but I've tested it
  pretty carefully, and it seems to be okay.

- In printk.c with CONFIG_PRINTK_PERSIST set, we're #defining words like
  log_end.  It might be cleaner to replace all instances of log_end with
  LOG_END to make this more clear.  This is also the reason the struct
  logbits members start with _: because otherwise they conflict with the
  macro.  Suggestions welcome.

Patch generated against v3.3-rc7.

Avery Pennarun (5):
  mm: bootmem: BUG() if you try to allocate bootmem too late.
  mm: bootmem: it's okay to reserve_bootmem an invalid address.
  mm: nobootmem: implement reserve_bootmem() in terms of memblock.
  printk: use alloc_bootmem() instead of memblock_alloc().
  printk: CONFIG_PRINTK_PERSIST: persist printk buffer across reboots.

 init/Kconfig    |   12 ++++++
 kernel/printk.c |  117 +++++++++++++++++++++++++++++++++++++++++++++---------
 mm/bootmem.c    |   10 ++++-
 mm/nobootmem.c  |   23 +++++++++++
 4 files changed, 140 insertions(+), 22 deletions(-)

-- 
1.7.7.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 1/5] mm: bootmem: BUG() if you try to allocate bootmem too late.
  2012-03-13  5:36 ` Avery Pennarun
@ 2012-03-13  5:36   ` Avery Pennarun
  -1 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  5:36 UTC (permalink / raw)
  To: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Avery Pennarun, Johannes Weiner, Olaf Hering, Paul Gortmaker,
	Tejun Heo, H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

If you try to allocate or reserve bootmem after the bootmem subsystem has
already been shut down and replaced with the real VM system, it would
succeed, but then your memory would silently get freed and could be
reallocated by the normal VM.

Add a few lines of code to catch this condition immediately rather than
manifesting as memory corruption.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
---
 mm/bootmem.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/mm/bootmem.c b/mm/bootmem.c
index 668e94d..7a9f505 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -39,6 +39,7 @@ bootmem_data_t bootmem_node_data[MAX_NUMNODES] __initdata;
 static struct list_head bdata_list __initdata = LIST_HEAD_INIT(bdata_list);
 
 static int bootmem_debug;
+static int bootmem_stopped;
 
 static int __init bootmem_debug_setup(char *buf)
 {
@@ -182,6 +183,7 @@ static unsigned long __init free_all_bootmem_core(bootmem_data_t *bdata)
 
 	bdebug("nid=%td start=%lx end=%lx\n",
 		bdata - bootmem_node_data, start, end);
+	bootmem_stopped = 1;
 
 	while (start < end) {
 		unsigned long *map, idx, vec;
@@ -279,6 +281,7 @@ static int __init __reserve(bootmem_data_t *bdata, unsigned long sidx,
 	unsigned long idx;
 	int exclusive = flags & BOOTMEM_EXCLUSIVE;
 
+	BUG_ON(bootmem_stopped);
 	bdebug("nid=%td start=%lx end=%lx flags=%x\n",
 		bdata - bootmem_node_data,
 		sidx + bdata->node_min_pfn,
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 1/5] mm: bootmem: BUG() if you try to allocate bootmem too late.
@ 2012-03-13  5:36   ` Avery Pennarun
  0 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  5:36 UTC (permalink / raw)
  To: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Avery Pennarun, Johannes Weiner, Olaf Hering, Paul Gortmaker,
	Tejun Heo, H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

If you try to allocate or reserve bootmem after the bootmem subsystem has
already been shut down and replaced with the real VM system, it would
succeed, but then your memory would silently get freed and could be
reallocated by the normal VM.

Add a few lines of code to catch this condition immediately rather than
manifesting as memory corruption.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
---
 mm/bootmem.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/mm/bootmem.c b/mm/bootmem.c
index 668e94d..7a9f505 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -39,6 +39,7 @@ bootmem_data_t bootmem_node_data[MAX_NUMNODES] __initdata;
 static struct list_head bdata_list __initdata = LIST_HEAD_INIT(bdata_list);
 
 static int bootmem_debug;
+static int bootmem_stopped;
 
 static int __init bootmem_debug_setup(char *buf)
 {
@@ -182,6 +183,7 @@ static unsigned long __init free_all_bootmem_core(bootmem_data_t *bdata)
 
 	bdebug("nid=%td start=%lx end=%lx\n",
 		bdata - bootmem_node_data, start, end);
+	bootmem_stopped = 1;
 
 	while (start < end) {
 		unsigned long *map, idx, vec;
@@ -279,6 +281,7 @@ static int __init __reserve(bootmem_data_t *bdata, unsigned long sidx,
 	unsigned long idx;
 	int exclusive = flags & BOOTMEM_EXCLUSIVE;
 
+	BUG_ON(bootmem_stopped);
 	bdebug("nid=%td start=%lx end=%lx flags=%x\n",
 		bdata - bootmem_node_data,
 		sidx + bdata->node_min_pfn,
-- 
1.7.7.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 2/5] mm: bootmem: it's okay to reserve_bootmem an invalid address.
  2012-03-13  5:36 ` Avery Pennarun
@ 2012-03-13  5:36   ` Avery Pennarun
  -1 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  5:36 UTC (permalink / raw)
  To: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Avery Pennarun, Johannes Weiner, Olaf Hering, Paul Gortmaker,
	Tejun Heo, H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

...but only if you provide BOOTMEM_EXCLUSIVE, which should guarantee that
you're actually checking the return value.  In that case, just return an
error code if the memory you tried to get is invalid.  This lets callers
safely probe around for a valid memory range.

If you don't use BOOTMEM_EXCLUSIVE and the memory address is invalid, just
crash as before, since the caller is probably not bothering to check the
return value.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
---
 mm/bootmem.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/bootmem.c b/mm/bootmem.c
index 7a9f505..d397dae 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -351,7 +351,10 @@ static int __init mark_bootmem(unsigned long start, unsigned long end,
 			return 0;
 		pos = bdata->node_low_pfn;
 	}
-	BUG();
+	/* people who don't use BOOTMEM_EXCLUSIVE don't check the return
+	 * value, so BUG() if it goes wrong. */
+	BUG_ON(!(reserve && (flags & BOOTMEM_EXCLUSIVE));
+	return -ENOENT;
 }
 
 /**
@@ -421,7 +424,7 @@ int __init reserve_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,
 }
 
 /**
- * reserve_bootmem - mark a page range as usable
+ * reserve_bootmem - mark a page range as reserved
  * @addr: starting address of the range
  * @size: size of the range in bytes
  * @flags: reservation flags (see linux/bootmem.h)
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 2/5] mm: bootmem: it's okay to reserve_bootmem an invalid address.
@ 2012-03-13  5:36   ` Avery Pennarun
  0 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  5:36 UTC (permalink / raw)
  To: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Avery Pennarun, Johannes Weiner, Olaf Hering, Paul Gortmaker,
	Tejun Heo, H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

...but only if you provide BOOTMEM_EXCLUSIVE, which should guarantee that
you're actually checking the return value.  In that case, just return an
error code if the memory you tried to get is invalid.  This lets callers
safely probe around for a valid memory range.

If you don't use BOOTMEM_EXCLUSIVE and the memory address is invalid, just
crash as before, since the caller is probably not bothering to check the
return value.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
---
 mm/bootmem.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/bootmem.c b/mm/bootmem.c
index 7a9f505..d397dae 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -351,7 +351,10 @@ static int __init mark_bootmem(unsigned long start, unsigned long end,
 			return 0;
 		pos = bdata->node_low_pfn;
 	}
-	BUG();
+	/* people who don't use BOOTMEM_EXCLUSIVE don't check the return
+	 * value, so BUG() if it goes wrong. */
+	BUG_ON(!(reserve && (flags & BOOTMEM_EXCLUSIVE));
+	return -ENOENT;
 }
 
 /**
@@ -421,7 +424,7 @@ int __init reserve_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,
 }
 
 /**
- * reserve_bootmem - mark a page range as usable
+ * reserve_bootmem - mark a page range as reserved
  * @addr: starting address of the range
  * @size: size of the range in bytes
  * @flags: reservation flags (see linux/bootmem.h)
-- 
1.7.7.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 3/5] mm: nobootmem: implement reserve_bootmem() in terms of memblock.
  2012-03-13  5:36 ` Avery Pennarun
@ 2012-03-13  5:36   ` Avery Pennarun
  -1 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  5:36 UTC (permalink / raw)
  To: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Avery Pennarun, Johannes Weiner, Olaf Hering, Paul Gortmaker,
	Tejun Heo, H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

There was an implementation for it in mm/bootmem.c, but it was left out of
nobootmem.c, and we can easily write a memblock implementation.  That way
code (eg.  printk) that wants to reserve memory early on can just always
call bootmem on all platforms.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
---
 mm/nobootmem.c |   23 +++++++++++++++++++++++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/mm/nobootmem.c b/mm/nobootmem.c
index 24f0fc1..2c269da 100644
--- a/mm/nobootmem.c
+++ b/mm/nobootmem.c
@@ -193,6 +193,29 @@ void __init free_bootmem(unsigned long addr, unsigned long size)
 	memblock_free(addr, size);
 }
 
+/**
+ * reserve_bootmem - mark a page range as reserved
+ * @addr: starting address of the range
+ * @size: size of the range in bytes
+ * @flags: reservation flags (see linux/bootmem.h)
+ *
+ * Partial pages will be reserved.
+ *
+ * The range must be contiguous but may span node boundaries.
+ */
+int __init reserve_bootmem(unsigned long addr, unsigned long size,
+			    int flags)
+{
+	if (flags & BOOTMEM_EXCLUSIVE) {
+		phys_addr_t m = memblock_find_in_range(addr, addr + size,
+					size, PAGE_SIZE);
+		if (m != addr)
+			return -EBUSY;
+	}
+	memblock_reserve(addr, size);
+	return 0;
+}
+
 static void * __init ___alloc_bootmem_nopanic(unsigned long size,
 					unsigned long align,
 					unsigned long goal,
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 3/5] mm: nobootmem: implement reserve_bootmem() in terms of memblock.
@ 2012-03-13  5:36   ` Avery Pennarun
  0 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  5:36 UTC (permalink / raw)
  To: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Avery Pennarun, Johannes Weiner, Olaf Hering, Paul Gortmaker,
	Tejun Heo, H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

There was an implementation for it in mm/bootmem.c, but it was left out of
nobootmem.c, and we can easily write a memblock implementation.  That way
code (eg.  printk) that wants to reserve memory early on can just always
call bootmem on all platforms.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
---
 mm/nobootmem.c |   23 +++++++++++++++++++++++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/mm/nobootmem.c b/mm/nobootmem.c
index 24f0fc1..2c269da 100644
--- a/mm/nobootmem.c
+++ b/mm/nobootmem.c
@@ -193,6 +193,29 @@ void __init free_bootmem(unsigned long addr, unsigned long size)
 	memblock_free(addr, size);
 }
 
+/**
+ * reserve_bootmem - mark a page range as reserved
+ * @addr: starting address of the range
+ * @size: size of the range in bytes
+ * @flags: reservation flags (see linux/bootmem.h)
+ *
+ * Partial pages will be reserved.
+ *
+ * The range must be contiguous but may span node boundaries.
+ */
+int __init reserve_bootmem(unsigned long addr, unsigned long size,
+			    int flags)
+{
+	if (flags & BOOTMEM_EXCLUSIVE) {
+		phys_addr_t m = memblock_find_in_range(addr, addr + size,
+					size, PAGE_SIZE);
+		if (m != addr)
+			return -EBUSY;
+	}
+	memblock_reserve(addr, size);
+	return 0;
+}
+
 static void * __init ___alloc_bootmem_nopanic(unsigned long size,
 					unsigned long align,
 					unsigned long goal,
-- 
1.7.7.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 4/5] printk: use alloc_bootmem() instead of memblock_alloc().
  2012-03-13  5:36 ` Avery Pennarun
@ 2012-03-13  5:36   ` Avery Pennarun
  -1 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  5:36 UTC (permalink / raw)
  To: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Avery Pennarun, Johannes Weiner, Olaf Hering, Paul Gortmaker,
	Tejun Heo, H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

The code in setup_log_buf() had two memory allocation branches, depending
on the value of 'early'.  If early==1, it would use memblock_alloc(); if
early==0, it would use alloc_bootmem_nopanic().

bootmem should already configured by the time setup_log_buf(early=1) is
called, so there's no reason to have the separation.  Furthermore, on
arches with nobootmem, memblock_alloc is essentially the same as
alloc_bootmem anyway.  x86 is one such arch, and also the only one
that uses early=1.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
---
 kernel/printk.c |   13 +------------
 1 files changed, 1 insertions(+), 12 deletions(-)

diff --git a/kernel/printk.c b/kernel/printk.c
index 32690a0..bf96a7d 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -31,7 +31,6 @@
 #include <linux/smp.h>
 #include <linux/security.h>
 #include <linux/bootmem.h>
-#include <linux/memblock.h>
 #include <linux/syscalls.h>
 #include <linux/kexec.h>
 #include <linux/kdb.h>
@@ -195,17 +194,7 @@ void __init setup_log_buf(int early)
 	if (!new_log_buf_len)
 		return;
 
-	if (early) {
-		unsigned long mem;
-
-		mem = memblock_alloc(new_log_buf_len, PAGE_SIZE);
-		if (!mem)
-			return;
-		new_log_buf = __va(mem);
-	} else {
-		new_log_buf = alloc_bootmem_nopanic(new_log_buf_len);
-	}
-
+	new_log_buf = alloc_bootmem_nopanic(new_log_buf_len);
 	if (unlikely(!new_log_buf)) {
 		pr_err("log_buf_len: %ld bytes not available\n",
 			new_log_buf_len);
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 4/5] printk: use alloc_bootmem() instead of memblock_alloc().
@ 2012-03-13  5:36   ` Avery Pennarun
  0 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  5:36 UTC (permalink / raw)
  To: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Avery Pennarun, Johannes Weiner, Olaf Hering, Paul Gortmaker,
	Tejun Heo, H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

The code in setup_log_buf() had two memory allocation branches, depending
on the value of 'early'.  If early==1, it would use memblock_alloc(); if
early==0, it would use alloc_bootmem_nopanic().

bootmem should already configured by the time setup_log_buf(early=1) is
called, so there's no reason to have the separation.  Furthermore, on
arches with nobootmem, memblock_alloc is essentially the same as
alloc_bootmem anyway.  x86 is one such arch, and also the only one
that uses early=1.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
---
 kernel/printk.c |   13 +------------
 1 files changed, 1 insertions(+), 12 deletions(-)

diff --git a/kernel/printk.c b/kernel/printk.c
index 32690a0..bf96a7d 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -31,7 +31,6 @@
 #include <linux/smp.h>
 #include <linux/security.h>
 #include <linux/bootmem.h>
-#include <linux/memblock.h>
 #include <linux/syscalls.h>
 #include <linux/kexec.h>
 #include <linux/kdb.h>
@@ -195,17 +194,7 @@ void __init setup_log_buf(int early)
 	if (!new_log_buf_len)
 		return;
 
-	if (early) {
-		unsigned long mem;
-
-		mem = memblock_alloc(new_log_buf_len, PAGE_SIZE);
-		if (!mem)
-			return;
-		new_log_buf = __va(mem);
-	} else {
-		new_log_buf = alloc_bootmem_nopanic(new_log_buf_len);
-	}
-
+	new_log_buf = alloc_bootmem_nopanic(new_log_buf_len);
 	if (unlikely(!new_log_buf)) {
 		pr_err("log_buf_len: %ld bytes not available\n",
 			new_log_buf_len);
-- 
1.7.7.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 5/5] printk: CONFIG_PRINTK_PERSIST: persist printk buffer across reboots.
  2012-03-13  5:36 ` Avery Pennarun
@ 2012-03-13  5:36   ` Avery Pennarun
  -1 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  5:36 UTC (permalink / raw)
  To: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Avery Pennarun, Johannes Weiner, Olaf Hering, Paul Gortmaker,
	Tejun Heo, H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

Instead of using alloc_bootmem() when log_buf_len= is provided on the
command line, use reserve_bootmem() instead to try to reserve the memory at
a predictable physical address.  If we manage to get such an address, check
whether it has a valid header from last time, and if so, keep the data in
the existing buffer as if it had been printk'd as part of the current
session.  You can then retrieve or clear it with dmesg.  Note: you must
supply log_buf_len= on the kernel command line to activate this feature.

If reserve_bootmem() doesn't work out, we fall back to the old
alloc_bootmem() method.

The nice thing about this feature is it allows us to capture and upload
printk results after a crash and reboot, even if the system had hard crashed
so there was no chance to do something like panic or kexec.  The last few
messages before the crash might give a clue as to the crash.

Note: None of this is any use if your bootloader or BIOS wipes memory
between reboots.  On embedded systems, you have somewhat more control over
this.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
---
 init/Kconfig    |   12 ++++++
 kernel/printk.c |  106 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 109 insertions(+), 9 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 3f42cd6..d182c07 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1040,6 +1040,18 @@ config PRINTK
 	  very difficult to diagnose system problems, saying N here is
 	  strongly discouraged.
 
+config PRINTK_PERSIST
+	default n
+	bool "printk log persists across reboots" if PRINTK
+	help
+	  This option tries to keep the printk memory buffer in a well-known
+	  location in physical memory. It isn't cleared on reboot (unless RAM
+	  is wiped by your boot loader or BIOS) so if your system crashes
+	  or panics, you might get to examine all the log messages next time you
+	  boot. The persisted log messages show up in your 'dmesg' output.
+	  Note: you must supply the log_buf_len= kernel parameter to
+	  activate this feature.
+
 config BUG
 	bool "BUG() support" if EXPERT
 	default y
diff --git a/kernel/printk.c b/kernel/printk.c
index bf96a7d..f9045d9 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -110,7 +110,12 @@ static DEFINE_RAW_SPINLOCK(logbuf_lock);
  */
 static unsigned log_start;	/* Index into log_buf: next char to be read by syslog() */
 static unsigned con_start;	/* Index into log_buf: next char to be sent to consoles */
+
+#ifdef CONFIG_PRINTK_PERSIST
+#define log_end (logbits->_log_end)
+#else
 static unsigned log_end;	/* Index into log_buf: most-recently-written-char + 1 */
+#endif
 
 /*
  * If exclusive_console is non-NULL then only this console is to be printed to.
@@ -143,11 +148,93 @@ static int console_may_schedule;
 
 #ifdef CONFIG_PRINTK
 
+static int saved_console_loglevel = -1;
 static char __log_buf[__LOG_BUF_LEN];
 static char *log_buf = __log_buf;
+
+#ifndef CONFIG_PRINTK_PERSIST
+
 static int log_buf_len = __LOG_BUF_LEN;
 static unsigned logged_chars; /* Number of chars produced since last read+clear operation */
-static int saved_console_loglevel = -1;
+
+#else  /* CONFIG_PRINTK_PERSIST */
+
+struct logbits {
+	int magic; /* needed to verify the memory across reboots */
+	int _log_buf_len; /* leading _ so they aren't replaced by #define */
+	unsigned _logged_chars;
+	unsigned _log_end;
+};
+static struct logbits __logbits = {
+	._log_buf_len = __LOG_BUF_LEN,
+};
+static struct logbits *logbits = &__logbits;
+#define log_buf_len (logbits->_log_buf_len)
+#define logged_chars (logbits->_logged_chars)
+
+#define PERSIST_SEARCH_END 0xfe000000
+#define PERSIST_SEARCH_JUMP (16*1024*1024)
+#define PERSIST_MAGIC 0xbabb1e
+
+#endif /* CONFIG_PRINTK_PERSIST */
+
+/*
+ * size is a power of 2 so that the printk offset mask will work.  We'll add
+ * a bit more space to the end of the buffer for our extra data, but that
+ * won't change the alignment of the buffer itself.
+ */
+static __init char *log_buf_alloc(int early, unsigned long size,
+	unsigned *dest_offset)
+{
+#ifdef CONFIG_PRINTK_PERSIST
+	unsigned long where;
+	char *buf;
+	unsigned long full_size = size + sizeof(struct logbits);
+	struct logbits *new_logbits;
+
+	for (where = PERSIST_SEARCH_END - size;
+			where >= PERSIST_SEARCH_JUMP;
+			where -= PERSIST_SEARCH_JUMP) {
+		where &= ~(roundup_pow_of_two(size) - 1);
+		if (reserve_bootmem(where, full_size, BOOTMEM_EXCLUSIVE))
+			continue;
+
+		printk(KERN_INFO "printk_persist: memory reserved @ 0x%08lx\n",
+			where);
+		buf = phys_to_virt(where);
+		new_logbits = phys_to_virt(where + size);
+		if (new_logbits->magic != PERSIST_MAGIC ||
+				new_logbits->_log_buf_len != size ||
+				new_logbits->_logged_chars > size ||
+				new_logbits->_log_end > size * 2) {
+			printk(KERN_INFO "printk_persist: header invalid, "
+				"cleared.\n");
+			memset(buf, 0, full_size);
+			new_logbits->magic = PERSIST_MAGIC;
+			new_logbits->_log_buf_len = size;
+			new_logbits->_logged_chars = 0;
+			new_logbits->_log_end = 0;
+		} else {
+			printk(KERN_INFO "printk_persist: header valid; "
+				"logged=%d next=%d\n",
+				new_logbits->_logged_chars,
+				new_logbits->_log_end);
+		}
+		*dest_offset = new_logbits->_log_end;
+		new_logbits->_log_end = log_end;
+		new_logbits->_logged_chars += logged_chars;
+		logbits = new_logbits;
+		return buf;
+	}
+	goto error;
+
+error:
+	/* replace the buffer, but don't bother to swap struct logbits */
+	printk(KERN_ERR "printk_persist: failed to reserve bootmem "
+		"area. disabled.\n");
+#endif  /* CONFIG_PRINTK_PERSIST */
+	return alloc_bootmem_nopanic(size);
+}
 
 #ifdef CONFIG_KEXEC
 /*
@@ -187,14 +274,14 @@ early_param("log_buf_len", log_buf_len_setup);
 void __init setup_log_buf(int early)
 {
 	unsigned long flags;
-	unsigned start, dest_idx, offset;
+	unsigned start, dest_idx, dest_offset = 0, offset;
 	char *new_log_buf;
 	int free;
 
 	if (!new_log_buf_len)
 		return;
 
-	new_log_buf = alloc_bootmem_nopanic(new_log_buf_len);
+	new_log_buf = log_buf_alloc(early, new_log_buf_len, &dest_offset);
 	if (unlikely(!new_log_buf)) {
 		pr_err("log_buf_len: %ld bytes not available\n",
 			new_log_buf_len);
@@ -204,21 +291,22 @@ void __init setup_log_buf(int early)
 	raw_spin_lock_irqsave(&logbuf_lock, flags);
 	log_buf_len = new_log_buf_len;
 	log_buf = new_log_buf;
-	new_log_buf_len = 0;
 	free = __LOG_BUF_LEN - log_end;
 
 	offset = start = min(con_start, log_start);
-	dest_idx = 0;
+	dest_idx = dest_offset;
 	while (start != log_end) {
 		unsigned log_idx_mask = start & (__LOG_BUF_LEN - 1);
 
-		log_buf[dest_idx] = __log_buf[log_idx_mask];
+		log_buf[dest_idx & (new_log_buf_len - 1)] =
+			__log_buf[log_idx_mask];
 		start++;
 		dest_idx++;
 	}
-	log_start -= offset;
-	con_start -= offset;
-	log_end -= offset;
+	log_start += dest_offset - offset;
+	con_start += dest_offset - offset;
+	log_end += dest_offset - offset;
+	new_log_buf_len = 0;
 	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 
 	pr_info("log_buf_len: %d\n", log_buf_len);
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 5/5] printk: CONFIG_PRINTK_PERSIST: persist printk buffer across reboots.
@ 2012-03-13  5:36   ` Avery Pennarun
  0 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  5:36 UTC (permalink / raw)
  To: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Avery Pennarun, Johannes Weiner, Olaf Hering, Paul Gortmaker,
	Tejun Heo, H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

Instead of using alloc_bootmem() when log_buf_len= is provided on the
command line, use reserve_bootmem() instead to try to reserve the memory at
a predictable physical address.  If we manage to get such an address, check
whether it has a valid header from last time, and if so, keep the data in
the existing buffer as if it had been printk'd as part of the current
session.  You can then retrieve or clear it with dmesg.  Note: you must
supply log_buf_len= on the kernel command line to activate this feature.

If reserve_bootmem() doesn't work out, we fall back to the old
alloc_bootmem() method.

The nice thing about this feature is it allows us to capture and upload
printk results after a crash and reboot, even if the system had hard crashed
so there was no chance to do something like panic or kexec.  The last few
messages before the crash might give a clue as to the crash.

Note: None of this is any use if your bootloader or BIOS wipes memory
between reboots.  On embedded systems, you have somewhat more control over
this.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
---
 init/Kconfig    |   12 ++++++
 kernel/printk.c |  106 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 109 insertions(+), 9 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 3f42cd6..d182c07 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1040,6 +1040,18 @@ config PRINTK
 	  very difficult to diagnose system problems, saying N here is
 	  strongly discouraged.
 
+config PRINTK_PERSIST
+	default n
+	bool "printk log persists across reboots" if PRINTK
+	help
+	  This option tries to keep the printk memory buffer in a well-known
+	  location in physical memory. It isn't cleared on reboot (unless RAM
+	  is wiped by your boot loader or BIOS) so if your system crashes
+	  or panics, you might get to examine all the log messages next time you
+	  boot. The persisted log messages show up in your 'dmesg' output.
+	  Note: you must supply the log_buf_len= kernel parameter to
+	  activate this feature.
+
 config BUG
 	bool "BUG() support" if EXPERT
 	default y
diff --git a/kernel/printk.c b/kernel/printk.c
index bf96a7d..f9045d9 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -110,7 +110,12 @@ static DEFINE_RAW_SPINLOCK(logbuf_lock);
  */
 static unsigned log_start;	/* Index into log_buf: next char to be read by syslog() */
 static unsigned con_start;	/* Index into log_buf: next char to be sent to consoles */
+
+#ifdef CONFIG_PRINTK_PERSIST
+#define log_end (logbits->_log_end)
+#else
 static unsigned log_end;	/* Index into log_buf: most-recently-written-char + 1 */
+#endif
 
 /*
  * If exclusive_console is non-NULL then only this console is to be printed to.
@@ -143,11 +148,93 @@ static int console_may_schedule;
 
 #ifdef CONFIG_PRINTK
 
+static int saved_console_loglevel = -1;
 static char __log_buf[__LOG_BUF_LEN];
 static char *log_buf = __log_buf;
+
+#ifndef CONFIG_PRINTK_PERSIST
+
 static int log_buf_len = __LOG_BUF_LEN;
 static unsigned logged_chars; /* Number of chars produced since last read+clear operation */
-static int saved_console_loglevel = -1;
+
+#else  /* CONFIG_PRINTK_PERSIST */
+
+struct logbits {
+	int magic; /* needed to verify the memory across reboots */
+	int _log_buf_len; /* leading _ so they aren't replaced by #define */
+	unsigned _logged_chars;
+	unsigned _log_end;
+};
+static struct logbits __logbits = {
+	._log_buf_len = __LOG_BUF_LEN,
+};
+static struct logbits *logbits = &__logbits;
+#define log_buf_len (logbits->_log_buf_len)
+#define logged_chars (logbits->_logged_chars)
+
+#define PERSIST_SEARCH_END 0xfe000000
+#define PERSIST_SEARCH_JUMP (16*1024*1024)
+#define PERSIST_MAGIC 0xbabb1e
+
+#endif /* CONFIG_PRINTK_PERSIST */
+
+/*
+ * size is a power of 2 so that the printk offset mask will work.  We'll add
+ * a bit more space to the end of the buffer for our extra data, but that
+ * won't change the alignment of the buffer itself.
+ */
+static __init char *log_buf_alloc(int early, unsigned long size,
+	unsigned *dest_offset)
+{
+#ifdef CONFIG_PRINTK_PERSIST
+	unsigned long where;
+	char *buf;
+	unsigned long full_size = size + sizeof(struct logbits);
+	struct logbits *new_logbits;
+
+	for (where = PERSIST_SEARCH_END - size;
+			where >= PERSIST_SEARCH_JUMP;
+			where -= PERSIST_SEARCH_JUMP) {
+		where &= ~(roundup_pow_of_two(size) - 1);
+		if (reserve_bootmem(where, full_size, BOOTMEM_EXCLUSIVE))
+			continue;
+
+		printk(KERN_INFO "printk_persist: memory reserved @ 0x%08lx\n",
+			where);
+		buf = phys_to_virt(where);
+		new_logbits = phys_to_virt(where + size);
+		if (new_logbits->magic != PERSIST_MAGIC ||
+				new_logbits->_log_buf_len != size ||
+				new_logbits->_logged_chars > size ||
+				new_logbits->_log_end > size * 2) {
+			printk(KERN_INFO "printk_persist: header invalid, "
+				"cleared.\n");
+			memset(buf, 0, full_size);
+			new_logbits->magic = PERSIST_MAGIC;
+			new_logbits->_log_buf_len = size;
+			new_logbits->_logged_chars = 0;
+			new_logbits->_log_end = 0;
+		} else {
+			printk(KERN_INFO "printk_persist: header valid; "
+				"logged=%d next=%d\n",
+				new_logbits->_logged_chars,
+				new_logbits->_log_end);
+		}
+		*dest_offset = new_logbits->_log_end;
+		new_logbits->_log_end = log_end;
+		new_logbits->_logged_chars += logged_chars;
+		logbits = new_logbits;
+		return buf;
+	}
+	goto error;
+
+error:
+	/* replace the buffer, but don't bother to swap struct logbits */
+	printk(KERN_ERR "printk_persist: failed to reserve bootmem "
+		"area. disabled.\n");
+#endif  /* CONFIG_PRINTK_PERSIST */
+	return alloc_bootmem_nopanic(size);
+}
 
 #ifdef CONFIG_KEXEC
 /*
@@ -187,14 +274,14 @@ early_param("log_buf_len", log_buf_len_setup);
 void __init setup_log_buf(int early)
 {
 	unsigned long flags;
-	unsigned start, dest_idx, offset;
+	unsigned start, dest_idx, dest_offset = 0, offset;
 	char *new_log_buf;
 	int free;
 
 	if (!new_log_buf_len)
 		return;
 
-	new_log_buf = alloc_bootmem_nopanic(new_log_buf_len);
+	new_log_buf = log_buf_alloc(early, new_log_buf_len, &dest_offset);
 	if (unlikely(!new_log_buf)) {
 		pr_err("log_buf_len: %ld bytes not available\n",
 			new_log_buf_len);
@@ -204,21 +291,22 @@ void __init setup_log_buf(int early)
 	raw_spin_lock_irqsave(&logbuf_lock, flags);
 	log_buf_len = new_log_buf_len;
 	log_buf = new_log_buf;
-	new_log_buf_len = 0;
 	free = __LOG_BUF_LEN - log_end;
 
 	offset = start = min(con_start, log_start);
-	dest_idx = 0;
+	dest_idx = dest_offset;
 	while (start != log_end) {
 		unsigned log_idx_mask = start & (__LOG_BUF_LEN - 1);
 
-		log_buf[dest_idx] = __log_buf[log_idx_mask];
+		log_buf[dest_idx & (new_log_buf_len - 1)] =
+			__log_buf[log_idx_mask];
 		start++;
 		dest_idx++;
 	}
-	log_start -= offset;
-	con_start -= offset;
-	log_end -= offset;
+	log_start += dest_offset - offset;
+	con_start += dest_offset - offset;
+	log_end += dest_offset - offset;
+	new_log_buf_len = 0;
 	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 
 	pr_info("log_buf_len: %d\n", log_buf_len);
-- 
1.7.7.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
  2012-03-13  5:36 ` Avery Pennarun
@ 2012-03-13  5:53   ` David Miller
  -1 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2012-03-13  5:53 UTC (permalink / raw)
  To: apenwarr
  Cc: akpm, josh, paulmck, mingo, a.p.zijlstra, fdinitto, hannes, olaf,
	paul.gortmaker, tj, hpa, yinghai, linux-kernel, linux-mm

From: Avery Pennarun <apenwarr@gmail.com>
Date: Tue, 13 Mar 2012 01:36:36 -0400

> The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
> that, when enabled, puts the printk buffer in a well-defined memory location
> so that we can keep appending to it after a reboot.  The upshot is that,
> even after a kernel panic or non-panic hard lockup, on the next boot
> userspace will be able to grab the kernel messages leading up to it.  It
> could then upload the messages to a server (for example) to keep crash
> statistics.

On some platforms there are formal ways to reserve areas of memory
such that the bootup firmware will know to not touch it on soft resets
no matter what.  For example, on Sparc there are OpenFirmware calls to
set aside such an area of soft-reset preserved memory.

I think some formal agreement with the system firmware is a lot better
when available, and should be explicitly accomodated in these changes
so that those of us with such facilities can very easily hook it up.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-13  5:53   ` David Miller
  0 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2012-03-13  5:53 UTC (permalink / raw)
  To: apenwarr
  Cc: akpm, josh, paulmck, mingo, a.p.zijlstra, fdinitto, hannes, olaf,
	paul.gortmaker, tj, hpa, yinghai, linux-kernel, linux-mm

From: Avery Pennarun <apenwarr@gmail.com>
Date: Tue, 13 Mar 2012 01:36:36 -0400

> The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
> that, when enabled, puts the printk buffer in a well-defined memory location
> so that we can keep appending to it after a reboot.  The upshot is that,
> even after a kernel panic or non-panic hard lockup, on the next boot
> userspace will be able to grab the kernel messages leading up to it.  It
> could then upload the messages to a server (for example) to keep crash
> statistics.

On some platforms there are formal ways to reserve areas of memory
such that the bootup firmware will know to not touch it on soft resets
no matter what.  For example, on Sparc there are OpenFirmware calls to
set aside such an area of soft-reset preserved memory.

I think some formal agreement with the system firmware is a lot better
when available, and should be explicitly accomodated in these changes
so that those of us with such facilities can very easily hook it up.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
  2012-03-13  5:53   ` David Miller
@ 2012-03-13  6:00     ` Avery Pennarun
  -1 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  6:00 UTC (permalink / raw)
  To: David Miller
  Cc: akpm, josh, paulmck, mingo, a.p.zijlstra, fdinitto, hannes, olaf,
	paul.gortmaker, tj, hpa, yinghai, linux-kernel, linux-mm

On Tue, Mar 13, 2012 at 1:53 AM, David Miller <davem@davemloft.net> wrote:
> From: Avery Pennarun <apenwarr@gmail.com>
> Date: Tue, 13 Mar 2012 01:36:36 -0400
>
>> The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
>> that, when enabled, puts the printk buffer in a well-defined memory location
>> so that we can keep appending to it after a reboot.  The upshot is that,
>> even after a kernel panic or non-panic hard lockup, on the next boot
>> userspace will be able to grab the kernel messages leading up to it.  It
>> could then upload the messages to a server (for example) to keep crash
>> statistics.
>
> On some platforms there are formal ways to reserve areas of memory
> such that the bootup firmware will know to not touch it on soft resets
> no matter what.  For example, on Sparc there are OpenFirmware calls to
> set aside such an area of soft-reset preserved memory.
>
> I think some formal agreement with the system firmware is a lot better
> when available, and should be explicitly accomodated in these changes
> so that those of us with such facilities can very easily hook it up.

Sounds good to me.  Do you have any pointers?  Just use an
early_param?  If we see the early_param but we can't reserve the
requested address, should we fall back to probing or disable the
PRINTK_PERSIST mode entirely?

Thanks,

Avery

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-13  6:00     ` Avery Pennarun
  0 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  6:00 UTC (permalink / raw)
  To: David Miller
  Cc: akpm, josh, paulmck, mingo, a.p.zijlstra, fdinitto, hannes, olaf,
	paul.gortmaker, tj, hpa, yinghai, linux-kernel, linux-mm

On Tue, Mar 13, 2012 at 1:53 AM, David Miller <davem@davemloft.net> wrote:
> From: Avery Pennarun <apenwarr@gmail.com>
> Date: Tue, 13 Mar 2012 01:36:36 -0400
>
>> The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
>> that, when enabled, puts the printk buffer in a well-defined memory location
>> so that we can keep appending to it after a reboot.  The upshot is that,
>> even after a kernel panic or non-panic hard lockup, on the next boot
>> userspace will be able to grab the kernel messages leading up to it.  It
>> could then upload the messages to a server (for example) to keep crash
>> statistics.
>
> On some platforms there are formal ways to reserve areas of memory
> such that the bootup firmware will know to not touch it on soft resets
> no matter what.  For example, on Sparc there are OpenFirmware calls to
> set aside such an area of soft-reset preserved memory.
>
> I think some formal agreement with the system firmware is a lot better
> when available, and should be explicitly accomodated in these changes
> so that those of us with such facilities can very easily hook it up.

Sounds good to me.  Do you have any pointers?  Just use an
early_param?  If we see the early_param but we can't reserve the
requested address, should we fall back to probing or disable the
PRINTK_PERSIST mode entirely?

Thanks,

Avery

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 4/5] printk: use alloc_bootmem() instead of memblock_alloc().
  2012-03-13  5:36   ` Avery Pennarun
@ 2012-03-13  6:13     ` Yinghai Lu
  -1 siblings, 0 replies; 52+ messages in thread
From: Yinghai Lu @ 2012-03-13  6:13 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, linux-kernel, linux-mm

On Mon, Mar 12, 2012 at 10:36 PM, Avery Pennarun <apenwarr@gmail.com> wrote:
> The code in setup_log_buf() had two memory allocation branches, depending
> on the value of 'early'.  If early==1, it would use memblock_alloc(); if
> early==0, it would use alloc_bootmem_nopanic().
>
> bootmem should already configured by the time setup_log_buf(early=1) is
> called, so there's no reason to have the separation.  Furthermore, on
> arches with nobootmem, memblock_alloc is essentially the same as
> alloc_bootmem anyway.  x86 is one such arch, and also the only one
> that uses early=1.
>
> Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
> ---
>  kernel/printk.c |   13 +------------
>  1 files changed, 1 insertions(+), 12 deletions(-)
>
> diff --git a/kernel/printk.c b/kernel/printk.c
> index 32690a0..bf96a7d 100644
> --- a/kernel/printk.c
> +++ b/kernel/printk.c
> @@ -31,7 +31,6 @@
>  #include <linux/smp.h>
>  #include <linux/security.h>
>  #include <linux/bootmem.h>
> -#include <linux/memblock.h>
>  #include <linux/syscalls.h>
>  #include <linux/kexec.h>
>  #include <linux/kdb.h>
> @@ -195,17 +194,7 @@ void __init setup_log_buf(int early)
>        if (!new_log_buf_len)
>                return;
>
> -       if (early) {
> -               unsigned long mem;
> -
> -               mem = memblock_alloc(new_log_buf_len, PAGE_SIZE);
> -               if (!mem)
> -                       return;
> -               new_log_buf = __va(mem);
> -       } else {
> -               new_log_buf = alloc_bootmem_nopanic(new_log_buf_len);
> -       }
> -
> +       new_log_buf = alloc_bootmem_nopanic(new_log_buf_len);
>        if (unlikely(!new_log_buf)) {
>                pr_err("log_buf_len: %ld bytes not available\n",
>                        new_log_buf_len);
> --

that seems not right.

for x86, setup_log_buf(1) is quite early called in setup_arch() before
bootmem is there.

bootmem should be killed after memblock is supported for arch that
current support bootmem.

Yinghai

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 4/5] printk: use alloc_bootmem() instead of memblock_alloc().
@ 2012-03-13  6:13     ` Yinghai Lu
  0 siblings, 0 replies; 52+ messages in thread
From: Yinghai Lu @ 2012-03-13  6:13 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, linux-kernel, linux-mm

On Mon, Mar 12, 2012 at 10:36 PM, Avery Pennarun <apenwarr@gmail.com> wrote:
> The code in setup_log_buf() had two memory allocation branches, depending
> on the value of 'early'.  If early==1, it would use memblock_alloc(); if
> early==0, it would use alloc_bootmem_nopanic().
>
> bootmem should already configured by the time setup_log_buf(early=1) is
> called, so there's no reason to have the separation.  Furthermore, on
> arches with nobootmem, memblock_alloc is essentially the same as
> alloc_bootmem anyway.  x86 is one such arch, and also the only one
> that uses early=1.
>
> Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
> ---
>  kernel/printk.c |   13 +------------
>  1 files changed, 1 insertions(+), 12 deletions(-)
>
> diff --git a/kernel/printk.c b/kernel/printk.c
> index 32690a0..bf96a7d 100644
> --- a/kernel/printk.c
> +++ b/kernel/printk.c
> @@ -31,7 +31,6 @@
>  #include <linux/smp.h>
>  #include <linux/security.h>
>  #include <linux/bootmem.h>
> -#include <linux/memblock.h>
>  #include <linux/syscalls.h>
>  #include <linux/kexec.h>
>  #include <linux/kdb.h>
> @@ -195,17 +194,7 @@ void __init setup_log_buf(int early)
>        if (!new_log_buf_len)
>                return;
>
> -       if (early) {
> -               unsigned long mem;
> -
> -               mem = memblock_alloc(new_log_buf_len, PAGE_SIZE);
> -               if (!mem)
> -                       return;
> -               new_log_buf = __va(mem);
> -       } else {
> -               new_log_buf = alloc_bootmem_nopanic(new_log_buf_len);
> -       }
> -
> +       new_log_buf = alloc_bootmem_nopanic(new_log_buf_len);
>        if (unlikely(!new_log_buf)) {
>                pr_err("log_buf_len: %ld bytes not available\n",
>                        new_log_buf_len);
> --

that seems not right.

for x86, setup_log_buf(1) is quite early called in setup_arch() before
bootmem is there.

bootmem should be killed after memblock is supported for arch that
current support bootmem.

Yinghai

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 4/5] printk: use alloc_bootmem() instead of memblock_alloc().
  2012-03-13  6:13     ` Yinghai Lu
@ 2012-03-13  6:40       ` Avery Pennarun
  -1 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  6:40 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, linux-kernel, linux-mm

> that seems not right.
>
> for x86, setup_log_buf(1) is quite early called in setup_arch() before
> bootmem is there.
>
> bootmem should be killed after memblock is supported for arch that
> current support bootmem.

Hmm.  x86 uses nobootmem.c, which implements bootmem in terms of
memblock anyway.  It is definitely working at setup_log_buf() time (or
else it wouldn't be able to select a sensible buffer location).

I suppose you're saying that it wouldn't work for a hypothetical
architecture that *does* support bootmem and *also* supports
setup_log_buf(1).  Will there ever be such an architecture, or will
bootmem be retired first?

Thanks,

Avery

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 4/5] printk: use alloc_bootmem() instead of memblock_alloc().
@ 2012-03-13  6:40       ` Avery Pennarun
  0 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  6:40 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, linux-kernel, linux-mm

> that seems not right.
>
> for x86, setup_log_buf(1) is quite early called in setup_arch() before
> bootmem is there.
>
> bootmem should be killed after memblock is supported for arch that
> current support bootmem.

Hmm.  x86 uses nobootmem.c, which implements bootmem in terms of
memblock anyway.  It is definitely working at setup_log_buf() time (or
else it wouldn't be able to select a sensible buffer location).

I suppose you're saying that it wouldn't work for a hypothetical
architecture that *does* support bootmem and *also* supports
setup_log_buf(1).  Will there ever be such an architecture, or will
bootmem be retired first?

Thanks,

Avery

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
  2012-03-13  6:00     ` Avery Pennarun
@ 2012-03-13  6:50       ` David Miller
  -1 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2012-03-13  6:50 UTC (permalink / raw)
  To: apenwarr
  Cc: akpm, josh, paulmck, mingo, a.p.zijlstra, fdinitto, hannes, olaf,
	paul.gortmaker, tj, hpa, yinghai, linux-kernel, linux-mm

From: Avery Pennarun <apenwarr@gmail.com>
Date: Tue, 13 Mar 2012 02:00:30 -0400

> Sounds good to me.  Do you have any pointers?  Just use an
> early_param?  If we see the early_param but we can't reserve the
> requested address, should we fall back to probing or disable the
> PRINTK_PERSIST mode entirely?

The interface is prom_retain() in f.e. arch/sparc/prom/misc_64.c

You give it a string name, a size in bytes, and an alignment.  And you
are given a physical address on success.  I'm pretty sure the string
name you give is one of the keys it uses to look up the same piece of
memory for you next time.  So you can have retained memory across soft
resets not just for log buffers, but for other things too.

The idea is that you call prom_retain() before you take a look at what
physical memory is available in the kernel, and the firmware takes
this physical chunk out of those available memory lists upon
prom_retain() success.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-13  6:50       ` David Miller
  0 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2012-03-13  6:50 UTC (permalink / raw)
  To: apenwarr
  Cc: akpm, josh, paulmck, mingo, a.p.zijlstra, fdinitto, hannes, olaf,
	paul.gortmaker, tj, hpa, yinghai, linux-kernel, linux-mm

From: Avery Pennarun <apenwarr@gmail.com>
Date: Tue, 13 Mar 2012 02:00:30 -0400

> Sounds good to me.  Do you have any pointers?  Just use an
> early_param?  If we see the early_param but we can't reserve the
> requested address, should we fall back to probing or disable the
> PRINTK_PERSIST mode entirely?

The interface is prom_retain() in f.e. arch/sparc/prom/misc_64.c

You give it a string name, a size in bytes, and an alignment.  And you
are given a physical address on success.  I'm pretty sure the string
name you give is one of the keys it uses to look up the same piece of
memory for you next time.  So you can have retained memory across soft
resets not just for log buffers, but for other things too.

The idea is that you call prom_retain() before you take a look at what
physical memory is available in the kernel, and the firmware takes
this physical chunk out of those available memory lists upon
prom_retain() success.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
  2012-03-13  6:50       ` David Miller
@ 2012-03-13  7:14         ` Avery Pennarun
  -1 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  7:14 UTC (permalink / raw)
  To: David Miller
  Cc: akpm, josh, paulmck, mingo, a.p.zijlstra, fdinitto, hannes, olaf,
	paul.gortmaker, tj, hpa, yinghai, linux-kernel, linux-mm

On Tue, Mar 13, 2012 at 2:50 AM, David Miller <davem@davemloft.net> wrote:
> The idea is that you call prom_retain() before you take a look at what
> physical memory is available in the kernel, and the firmware takes
> this physical chunk out of those available memory lists upon
> prom_retain() success.

This sounds like exactly the API I would have wanted, however:

1) It's only available in arch/sparc so I can't test my patch if I try
to use it;
2) There's nobody that calls it so it might not work;
3) I don't understand the API so I'm not really confident that
reserving memory this way will actually prevent it from being seen by
the kernel.

In short, I think I would screw it up.

On the other hand, as written it seems like my code would also work on
sparc, and would work with more than one kind of memory area if more
than one module chose to use this technique.  (ie. Since the prober
actually reserves memory, the next prober would necessarily reserve a
different bit of memory, and as long as you're using the same kernel
as before and you do all reservations before enabling interrupts, you
should get consistent results.)

I suppose I could move the actual probe-and-allocate code somewhere
(bootmem.c?  memblock.c?) and add a 'name' parameter to it which is
ignored in the generic implementation.  Then someone could write an
arch-specific implementation.  Would that work?  If so, please
recommend which file to put the generic implementation in :)

Thanks,

Avery

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-13  7:14         ` Avery Pennarun
  0 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  7:14 UTC (permalink / raw)
  To: David Miller
  Cc: akpm, josh, paulmck, mingo, a.p.zijlstra, fdinitto, hannes, olaf,
	paul.gortmaker, tj, hpa, yinghai, linux-kernel, linux-mm

On Tue, Mar 13, 2012 at 2:50 AM, David Miller <davem@davemloft.net> wrote:
> The idea is that you call prom_retain() before you take a look at what
> physical memory is available in the kernel, and the firmware takes
> this physical chunk out of those available memory lists upon
> prom_retain() success.

This sounds like exactly the API I would have wanted, however:

1) It's only available in arch/sparc so I can't test my patch if I try
to use it;
2) There's nobody that calls it so it might not work;
3) I don't understand the API so I'm not really confident that
reserving memory this way will actually prevent it from being seen by
the kernel.

In short, I think I would screw it up.

On the other hand, as written it seems like my code would also work on
sparc, and would work with more than one kind of memory area if more
than one module chose to use this technique.  (ie. Since the prober
actually reserves memory, the next prober would necessarily reserve a
different bit of memory, and as long as you're using the same kernel
as before and you do all reservations before enabling interrupts, you
should get consistent results.)

I suppose I could move the actual probe-and-allocate code somewhere
(bootmem.c?  memblock.c?) and add a 'name' parameter to it which is
ignored in the generic implementation.  Then someone could write an
arch-specific implementation.  Would that work?  If so, please
recommend which file to put the generic implementation in :)

Thanks,

Avery

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
  2012-03-13  7:14         ` Avery Pennarun
@ 2012-03-13  7:18           ` David Miller
  -1 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2012-03-13  7:18 UTC (permalink / raw)
  To: apenwarr
  Cc: akpm, josh, paulmck, mingo, a.p.zijlstra, fdinitto, hannes, olaf,
	paul.gortmaker, tj, hpa, yinghai, linux-kernel, linux-mm

From: Avery Pennarun <apenwarr@gmail.com>
Date: Tue, 13 Mar 2012 03:14:21 -0400

> On Tue, Mar 13, 2012 at 2:50 AM, David Miller <davem@davemloft.net> wrote:
>> The idea is that you call prom_retain() before you take a look at what
>> physical memory is available in the kernel, and the firmware takes
>> this physical chunk out of those available memory lists upon
>> prom_retain() success.
> 
> This sounds like exactly the API I would have wanted, however:
> 
> 1) It's only available in arch/sparc so I can't test my patch if I try
> to use it;
> 2) There's nobody that calls it so it might not work;
> 3) I don't understand the API so I'm not really confident that
> reserving memory this way will actually prevent it from being seen by
> the kernel.
> 
> In short, I think I would screw it up.

I'm only saying that you should design your stuff such that an
architecture with such features could easily hook into it using this
kind facility.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-13  7:18           ` David Miller
  0 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2012-03-13  7:18 UTC (permalink / raw)
  To: apenwarr
  Cc: akpm, josh, paulmck, mingo, a.p.zijlstra, fdinitto, hannes, olaf,
	paul.gortmaker, tj, hpa, yinghai, linux-kernel, linux-mm

From: Avery Pennarun <apenwarr@gmail.com>
Date: Tue, 13 Mar 2012 03:14:21 -0400

> On Tue, Mar 13, 2012 at 2:50 AM, David Miller <davem@davemloft.net> wrote:
>> The idea is that you call prom_retain() before you take a look at what
>> physical memory is available in the kernel, and the firmware takes
>> this physical chunk out of those available memory lists upon
>> prom_retain() success.
> 
> This sounds like exactly the API I would have wanted, however:
> 
> 1) It's only available in arch/sparc so I can't test my patch if I try
> to use it;
> 2) There's nobody that calls it so it might not work;
> 3) I don't understand the API so I'm not really confident that
> reserving memory this way will actually prevent it from being seen by
> the kernel.
> 
> In short, I think I would screw it up.

I'm only saying that you should design your stuff such that an
architecture with such features could easily hook into it using this
kind facility.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
  2012-03-13  7:18           ` David Miller
@ 2012-03-13  8:10             ` Avery Pennarun
  -1 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  8:10 UTC (permalink / raw)
  To: David Miller
  Cc: akpm, josh, paulmck, mingo, a.p.zijlstra, fdinitto, hannes, olaf,
	paul.gortmaker, tj, hpa, yinghai, linux-kernel, linux-mm

On Tue, Mar 13, 2012 at 3:18 AM, David Miller <davem@davemloft.net> wrote:
> I'm only saying that you should design your stuff such that an
> architecture with such features could easily hook into it using this
> kind facility.

How about this?


diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index a6bb102..7335cf7 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -59,6 +59,8 @@ int memblock_add(phys_addr_t base, phys_addr_t size);
 int memblock_remove(phys_addr_t base, phys_addr_t size);
 int memblock_free(phys_addr_t base, phys_addr_t size);
 int memblock_reserve(phys_addr_t base, phys_addr_t size);
+phys_addr_t memblock_reserve_by_name(const char *name,
+	phys_addr_t size, phys_addr_t align);

 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
@@ -246,6 +248,11 @@ static inline phys_addr_t
memblock_alloc(phys_addr_t size, phys_addr_t align)
 	return 0;
 }

+phys_addr_t memblock_reserve_by_name(const char *name,
+	phys_addr_t size, phys_addr_t align)
+{
+	return 0;
+}
 #endif /* CONFIG_HAVE_MEMBLOCK */

 #endif /* __KERNEL__ */
diff --git a/mm/memblock.c b/mm/memblock.c
index 99f2855..2ab4559 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -519,6 +519,38 @@ int __init_memblock memblock_reserve(phys_addr_t
base, phys_addr_t size)
 	return memblock_add_region(_rgn, base, size, MAX_NUMNODES);
 }

+#define RESERVE_SEARCH_END 0xfe000000
+#define RESERVE_SEARCH_JUMP (16*1024*1024)
+
+/**
+ * Find a well-defined location for the given memory area and reserve it.
+ * The generic version just scans through memory looking for an available
+ * area, and ignores the name.  An arch-specific version could request a
+ * named area from the bootloader (eg.  prom_retain()) in the hopes of
+ * getting a region guaranteed not to be messed up by the bootloader.
+ */
+phys_addr_t __init_memblock memblock_reserve_by_name(const char *name,
+	phys_addr_t size, phys_addr_t align)
+{
+	unsigned long where;
+
+	for (where = RESERVE_SEARCH_END - align;
+			where >= RESERVE_SEARCH_JUMP;
+			where -= RESERVE_SEARCH_JUMP) {
+		where &= ~(roundup_pow_of_two(align) - 1);
+		if (memblock_find_in_range(where, where + size,
+					size, align) != 0) {
+			memblock_reserve(where, size);
+			printk(KERN_INFO "memblock(%s): "
+				"reserved %lu @ 0x%08lx\n",
+				name, (unsigned long)size,
+				(unsigned long)where);
+			return where;
+		}
+	}
+	return 0;
+}
+
 /**
  * __next_free_mem_range - next function for for_each_free_mem_range()
  * @idx: pointer to u64 loop variable
-- 
1.7.7.3

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-13  8:10             ` Avery Pennarun
  0 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-13  8:10 UTC (permalink / raw)
  To: David Miller
  Cc: akpm, josh, paulmck, mingo, a.p.zijlstra, fdinitto, hannes, olaf,
	paul.gortmaker, tj, hpa, yinghai, linux-kernel, linux-mm

On Tue, Mar 13, 2012 at 3:18 AM, David Miller <davem@davemloft.net> wrote:
> I'm only saying that you should design your stuff such that an
> architecture with such features could easily hook into it using this
> kind facility.

How about this?


diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index a6bb102..7335cf7 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -59,6 +59,8 @@ int memblock_add(phys_addr_t base, phys_addr_t size);
 int memblock_remove(phys_addr_t base, phys_addr_t size);
 int memblock_free(phys_addr_t base, phys_addr_t size);
 int memblock_reserve(phys_addr_t base, phys_addr_t size);
+phys_addr_t memblock_reserve_by_name(const char *name,
+	phys_addr_t size, phys_addr_t align);

 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
@@ -246,6 +248,11 @@ static inline phys_addr_t
memblock_alloc(phys_addr_t size, phys_addr_t align)
 	return 0;
 }

+phys_addr_t memblock_reserve_by_name(const char *name,
+	phys_addr_t size, phys_addr_t align)
+{
+	return 0;
+}
 #endif /* CONFIG_HAVE_MEMBLOCK */

 #endif /* __KERNEL__ */
diff --git a/mm/memblock.c b/mm/memblock.c
index 99f2855..2ab4559 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -519,6 +519,38 @@ int __init_memblock memblock_reserve(phys_addr_t
base, phys_addr_t size)
 	return memblock_add_region(_rgn, base, size, MAX_NUMNODES);
 }

+#define RESERVE_SEARCH_END 0xfe000000
+#define RESERVE_SEARCH_JUMP (16*1024*1024)
+
+/**
+ * Find a well-defined location for the given memory area and reserve it.
+ * The generic version just scans through memory looking for an available
+ * area, and ignores the name.  An arch-specific version could request a
+ * named area from the bootloader (eg.  prom_retain()) in the hopes of
+ * getting a region guaranteed not to be messed up by the bootloader.
+ */
+phys_addr_t __init_memblock memblock_reserve_by_name(const char *name,
+	phys_addr_t size, phys_addr_t align)
+{
+	unsigned long where;
+
+	for (where = RESERVE_SEARCH_END - align;
+			where >= RESERVE_SEARCH_JUMP;
+			where -= RESERVE_SEARCH_JUMP) {
+		where &= ~(roundup_pow_of_two(align) - 1);
+		if (memblock_find_in_range(where, where + size,
+					size, align) != 0) {
+			memblock_reserve(where, size);
+			printk(KERN_INFO "memblock(%s): "
+				"reserved %lu @ 0x%08lx\n",
+				name, (unsigned long)size,
+				(unsigned long)where);
+			return where;
+		}
+	}
+	return 0;
+}
+
 /**
  * __next_free_mem_range - next function for for_each_free_mem_range()
  * @idx: pointer to u64 loop variable
-- 
1.7.7.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
  2012-03-13  8:10             ` Avery Pennarun
@ 2012-03-13  8:16               ` David Miller
  -1 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2012-03-13  8:16 UTC (permalink / raw)
  To: apenwarr
  Cc: akpm, josh, paulmck, mingo, a.p.zijlstra, fdinitto, hannes, olaf,
	paul.gortmaker, tj, hpa, yinghai, linux-kernel, linux-mm

From: Avery Pennarun <apenwarr@gmail.com>
Date: Tue, 13 Mar 2012 04:10:42 -0400

> On Tue, Mar 13, 2012 at 3:18 AM, David Miller <davem@davemloft.net> wrote:
>> I'm only saying that you should design your stuff such that an
>> architecture with such features could easily hook into it using this
>> kind facility.
> 
> How about this?

Function signature looks good.

But on a platform with firmware based memory retaining facilitites
we're going to need to invoke this way before bootmem or memblocks are
even setup, because the retain call influences the free memory lists
that the firmware gives to us and those free memory lists are what we
use to populate the memblock tables.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-13  8:16               ` David Miller
  0 siblings, 0 replies; 52+ messages in thread
From: David Miller @ 2012-03-13  8:16 UTC (permalink / raw)
  To: apenwarr
  Cc: akpm, josh, paulmck, mingo, a.p.zijlstra, fdinitto, hannes, olaf,
	paul.gortmaker, tj, hpa, yinghai, linux-kernel, linux-mm

From: Avery Pennarun <apenwarr@gmail.com>
Date: Tue, 13 Mar 2012 04:10:42 -0400

> On Tue, Mar 13, 2012 at 3:18 AM, David Miller <davem@davemloft.net> wrote:
>> I'm only saying that you should design your stuff such that an
>> architecture with such features could easily hook into it using this
>> kind facility.
> 
> How about this?

Function signature looks good.

But on a platform with firmware based memory retaining facilitites
we're going to need to invoke this way before bootmem or memblocks are
even setup, because the retain call influences the free memory lists
that the firmware gives to us and those free memory lists are what we
use to populate the memblock tables.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 4/5] printk: use alloc_bootmem() instead of memblock_alloc().
  2012-03-13  6:40       ` Avery Pennarun
@ 2012-03-13  8:26         ` Ingo Molnar
  -1 siblings, 0 replies; 52+ messages in thread
From: Ingo Molnar @ 2012-03-13  8:26 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Yinghai Lu, Andrew Morton, Josh Triplett, Paul E. McKenney,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, linux-kernel, linux-mm


* Avery Pennarun <apenwarr@gmail.com> wrote:

> Hmm.  x86 uses nobootmem.c, [...]

For new code, we use memblock_reserve(), memblock_alloc(), et al.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 4/5] printk: use alloc_bootmem() instead of memblock_alloc().
@ 2012-03-13  8:26         ` Ingo Molnar
  0 siblings, 0 replies; 52+ messages in thread
From: Ingo Molnar @ 2012-03-13  8:26 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Yinghai Lu, Andrew Morton, Josh Triplett, Paul E. McKenney,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, linux-kernel, linux-mm


* Avery Pennarun <apenwarr@gmail.com> wrote:

> Hmm.  x86 uses nobootmem.c, [...]

For new code, we use memblock_reserve(), memblock_alloc(), et al.

Thanks,

	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
  2012-03-13  5:36 ` Avery Pennarun
@ 2012-03-13  8:32   ` Stephen Boyd
  -1 siblings, 0 replies; 52+ messages in thread
From: Stephen Boyd @ 2012-03-13  8:32 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

On 3/12/2012 10:36 PM, Avery Pennarun wrote:
> The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
> that, when enabled, puts the printk buffer in a well-defined memory location
> so that we can keep appending to it after a reboot.  The upshot is that,
> even after a kernel panic or non-panic hard lockup, on the next boot
> userspace will be able to grab the kernel messages leading up to it.  It
> could then upload the messages to a server (for example) to keep crash
> statistics.
>
> The preceding patches in the series are mostly just things I fixed up while
> working on that patch.
>
> Some notes:
>
> - I'm not totally sure of the locking or portability issues when calling
>    memblock or bootmem.  This all happens really early, and I *think*
>    interrupts are still disabled at that time, so it's probably okay.
>
> - Tested this version on x86 (kvm) and it works with soft reboot (ie. reboot
>    -f).  Since some BIOSes wipe the memory during boot, you might not have
>    any luck.  It should be great on many embedded systems, though, including
>    the MIPS system I've tested a variant of this patch on.  (Our MIPS build
>    is based on a slightly older kernel so it's not 100% the same, but I think
>    this should behave identically.)
>
> - The way we choose a well-defined memory location is slightly suspicious
>    (we just count down from the top of the address space) but I've tested it
>    pretty carefully, and it seems to be okay.
>
> - In printk.c with CONFIG_PRINTK_PERSIST set, we're #defining words like
>    log_end.  It might be cleaner to replace all instances of log_end with
>    LOG_END to make this more clear.  This is also the reason the struct
>    logbits members start with _: because otherwise they conflict with the
>    macro.  Suggestions welcome.

Android has something similar called ram_console (see 
staging/android/ram_console.c). The console is dumped to a ram buffer 
that is reserved very early in platform setup code. Then when the phone 
reboots you can cat /proc/last_kmsg to get the previous kernel message 
for debugging. Can you use that code?

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-13  8:32   ` Stephen Boyd
  0 siblings, 0 replies; 52+ messages in thread
From: Stephen Boyd @ 2012-03-13  8:32 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

On 3/12/2012 10:36 PM, Avery Pennarun wrote:
> The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
> that, when enabled, puts the printk buffer in a well-defined memory location
> so that we can keep appending to it after a reboot.  The upshot is that,
> even after a kernel panic or non-panic hard lockup, on the next boot
> userspace will be able to grab the kernel messages leading up to it.  It
> could then upload the messages to a server (for example) to keep crash
> statistics.
>
> The preceding patches in the series are mostly just things I fixed up while
> working on that patch.
>
> Some notes:
>
> - I'm not totally sure of the locking or portability issues when calling
>    memblock or bootmem.  This all happens really early, and I *think*
>    interrupts are still disabled at that time, so it's probably okay.
>
> - Tested this version on x86 (kvm) and it works with soft reboot (ie. reboot
>    -f).  Since some BIOSes wipe the memory during boot, you might not have
>    any luck.  It should be great on many embedded systems, though, including
>    the MIPS system I've tested a variant of this patch on.  (Our MIPS build
>    is based on a slightly older kernel so it's not 100% the same, but I think
>    this should behave identically.)
>
> - The way we choose a well-defined memory location is slightly suspicious
>    (we just count down from the top of the address space) but I've tested it
>    pretty carefully, and it seems to be okay.
>
> - In printk.c with CONFIG_PRINTK_PERSIST set, we're #defining words like
>    log_end.  It might be cleaner to replace all instances of log_end with
>    LOG_END to make this more clear.  This is also the reason the struct
>    logbits members start with _: because otherwise they conflict with the
>    macro.  Suggestions welcome.

Android has something similar called ram_console (see 
staging/android/ram_console.c). The console is dumped to a ram buffer 
that is reserved very early in platform setup code. Then when the phone 
reboots you can cat /proc/last_kmsg to get the previous kernel message 
for debugging. Can you use that code?

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
  2012-03-13  5:53   ` David Miller
@ 2012-03-13 13:50     ` Peter Zijlstra
  -1 siblings, 0 replies; 52+ messages in thread
From: Peter Zijlstra @ 2012-03-13 13:50 UTC (permalink / raw)
  To: David Miller
  Cc: apenwarr, akpm, josh, paulmck, mingo, fdinitto, hannes, olaf,
	paul.gortmaker, tj, hpa, yinghai, linux-kernel, linux-mm

On Mon, 2012-03-12 at 22:53 -0700, David Miller wrote:
> From: Avery Pennarun <apenwarr@gmail.com>
> Date: Tue, 13 Mar 2012 01:36:36 -0400
> 
> > The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
> > that, when enabled, puts the printk buffer in a well-defined memory location
> > so that we can keep appending to it after a reboot.  The upshot is that,
> > even after a kernel panic or non-panic hard lockup, on the next boot
> > userspace will be able to grab the kernel messages leading up to it.  It
> > could then upload the messages to a server (for example) to keep crash
> > statistics.
> 
> On some platforms there are formal ways to reserve areas of memory
> such that the bootup firmware will know to not touch it on soft resets
> no matter what.  For example, on Sparc there are OpenFirmware calls to
> set aside such an area of soft-reset preserved memory.
> 
> I think some formal agreement with the system firmware is a lot better
> when available, and should be explicitly accomodated in these changes
> so that those of us with such facilities can very easily hook it up.

Shouldn't this all be near the pstore effort? I know pstore and the
soft-reset stuff aren't quite the same, but if that's the best Sparc can
do, then why not?

OTOH if Sparc can actually do pstore too, then it might make sense.

What I guess I'm saying is that we should try and minimize the duplicate
efforts here.. and it seems to me that writing a soft reset x86 backend
to pstore for those machines that don't actually have the acpi flash
crap might be more useful and less duplicative.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-13 13:50     ` Peter Zijlstra
  0 siblings, 0 replies; 52+ messages in thread
From: Peter Zijlstra @ 2012-03-13 13:50 UTC (permalink / raw)
  To: David Miller
  Cc: apenwarr, akpm, josh, paulmck, mingo, fdinitto, hannes, olaf,
	paul.gortmaker, tj, hpa, yinghai, linux-kernel, linux-mm

On Mon, 2012-03-12 at 22:53 -0700, David Miller wrote:
> From: Avery Pennarun <apenwarr@gmail.com>
> Date: Tue, 13 Mar 2012 01:36:36 -0400
> 
> > The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
> > that, when enabled, puts the printk buffer in a well-defined memory location
> > so that we can keep appending to it after a reboot.  The upshot is that,
> > even after a kernel panic or non-panic hard lockup, on the next boot
> > userspace will be able to grab the kernel messages leading up to it.  It
> > could then upload the messages to a server (for example) to keep crash
> > statistics.
> 
> On some platforms there are formal ways to reserve areas of memory
> such that the bootup firmware will know to not touch it on soft resets
> no matter what.  For example, on Sparc there are OpenFirmware calls to
> set aside such an area of soft-reset preserved memory.
> 
> I think some formal agreement with the system firmware is a lot better
> when available, and should be explicitly accomodated in these changes
> so that those of us with such facilities can very easily hook it up.

Shouldn't this all be near the pstore effort? I know pstore and the
soft-reset stuff aren't quite the same, but if that's the best Sparc can
do, then why not?

OTOH if Sparc can actually do pstore too, then it might make sense.

What I guess I'm saying is that we should try and minimize the duplicate
efforts here.. and it seems to me that writing a soft reset x86 backend
to pstore for those machines that don't actually have the acpi flash
crap might be more useful and less duplicative.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
  2012-03-13  5:36 ` Avery Pennarun
@ 2012-03-13 17:08   ` Daniel Walker
  -1 siblings, 0 replies; 52+ messages in thread
From: Daniel Walker @ 2012-03-13 17:08 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

On Tue, Mar 13, 2012 at 01:36:36AM -0400, Avery Pennarun wrote:
> The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
> that, when enabled, puts the printk buffer in a well-defined memory location
> so that we can keep appending to it after a reboot.  The upshot is that,
> even after a kernel panic or non-panic hard lockup, on the next boot
> userspace will be able to grab the kernel messages leading up to it.  It
> could then upload the messages to a server (for example) to keep crash
> statistics.

There's currently driver/mtd/mtdoops.c, fs/pstore/, and
drivers/staging/android/ram_console.c that do similar things
as this. Did you investigate those for potentially modifying them to add
this functionality ? If so what issues did you find?

I have a arm MSM G1 with persistent memory at 0x16d00000 size 20000bytes..
It's fairly simple you just have to ioremap the memory, but then it's good
for writing.. Currently the android ram_console uses this. How would I
convert this area for use with your changes?

Daniel

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-13 17:08   ` Daniel Walker
  0 siblings, 0 replies; 52+ messages in thread
From: Daniel Walker @ 2012-03-13 17:08 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

On Tue, Mar 13, 2012 at 01:36:36AM -0400, Avery Pennarun wrote:
> The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
> that, when enabled, puts the printk buffer in a well-defined memory location
> so that we can keep appending to it after a reboot.  The upshot is that,
> even after a kernel panic or non-panic hard lockup, on the next boot
> userspace will be able to grab the kernel messages leading up to it.  It
> could then upload the messages to a server (for example) to keep crash
> statistics.

There's currently driver/mtd/mtdoops.c, fs/pstore/, and
drivers/staging/android/ram_console.c that do similar things
as this. Did you investigate those for potentially modifying them to add
this functionality ? If so what issues did you find?

I have a arm MSM G1 with persistent memory at 0x16d00000 size 20000bytes..
It's fairly simple you just have to ioremap the memory, but then it's good
for writing.. Currently the android ram_console uses this. How would I
convert this area for use with your changes?

Daniel

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 4/5] printk: use alloc_bootmem() instead of memblock_alloc().
  2012-03-13  6:40       ` Avery Pennarun
@ 2012-03-13 21:50         ` Yinghai Lu
  -1 siblings, 0 replies; 52+ messages in thread
From: Yinghai Lu @ 2012-03-13 21:50 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, linux-kernel, linux-mm

On Mon, Mar 12, 2012 at 11:40 PM, Avery Pennarun <apenwarr@gmail.com> wrote:
>> that seems not right.
>>
>> for x86, setup_log_buf(1) is quite early called in setup_arch() before
>> bootmem is there.
>>
>> bootmem should be killed after memblock is supported for arch that
>> current support bootmem.
>
> Hmm.  x86 uses nobootmem.c, which implements bootmem in terms of
> memblock anyway.  It is definitely working at setup_log_buf() time (or
> else it wouldn't be able to select a sensible buffer location).


ok, you may could do that now.
only after recent changes from Tejun, that kill early_node_map().

before that, we only can use nobootmem after
arch/x86/kernel/setup.c::setup_arch/initmem_init()
but memblock alloc could be used just after
arch/x86/kernel/setup.c::setup_arch/memblock_x86_fill()

Now you put back bootmem calling early, will cause confusion.

>
> I suppose you're saying that it wouldn't work for a hypothetical
> architecture that *does* support bootmem and *also* supports
> setup_log_buf(1).  Will there ever be such an architecture, or will
> bootmem be retired first?

we should use adding memblock_alloc calling instead... go backward...

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 4/5] printk: use alloc_bootmem() instead of memblock_alloc().
@ 2012-03-13 21:50         ` Yinghai Lu
  0 siblings, 0 replies; 52+ messages in thread
From: Yinghai Lu @ 2012-03-13 21:50 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, linux-kernel, linux-mm

On Mon, Mar 12, 2012 at 11:40 PM, Avery Pennarun <apenwarr@gmail.com> wrote:
>> that seems not right.
>>
>> for x86, setup_log_buf(1) is quite early called in setup_arch() before
>> bootmem is there.
>>
>> bootmem should be killed after memblock is supported for arch that
>> current support bootmem.
>
> Hmm.  x86 uses nobootmem.c, which implements bootmem in terms of
> memblock anyway.  It is definitely working at setup_log_buf() time (or
> else it wouldn't be able to select a sensible buffer location).


ok, you may could do that now.
only after recent changes from Tejun, that kill early_node_map().

before that, we only can use nobootmem after
arch/x86/kernel/setup.c::setup_arch/initmem_init()
but memblock alloc could be used just after
arch/x86/kernel/setup.c::setup_arch/memblock_x86_fill()

Now you put back bootmem calling early, will cause confusion.

>
> I suppose you're saying that it wouldn't work for a hypothetical
> architecture that *does* support bootmem and *also* supports
> setup_log_buf(1).  Will there ever be such an architecture, or will
> bootmem be retired first?

we should use adding memblock_alloc calling instead... go backward...

Thanks

Yinghai

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
  2012-03-13 17:08   ` Daniel Walker
@ 2012-03-13 22:10     ` Andrew Morton
  -1 siblings, 0 replies; 52+ messages in thread
From: Andrew Morton @ 2012-03-13 22:10 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Avery Pennarun, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

On Tue, 13 Mar 2012 10:08:51 -0700
Daniel Walker <dwalker@fifo99.com> wrote:

> On Tue, Mar 13, 2012 at 01:36:36AM -0400, Avery Pennarun wrote:
> > The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
> > that, when enabled, puts the printk buffer in a well-defined memory location
> > so that we can keep appending to it after a reboot.  The upshot is that,
> > even after a kernel panic or non-panic hard lockup, on the next boot
> > userspace will be able to grab the kernel messages leading up to it.  It
> > could then upload the messages to a server (for example) to keep crash
> > statistics.
> 
> There's currently driver/mtd/mtdoops.c, fs/pstore/, and
> drivers/staging/android/ram_console.c that do similar things
> as this. Did you investigate those for potentially modifying them to add
> this functionality ? If so what issues did you find?
> 
> I have a arm MSM G1 with persistent memory at 0x16d00000 size 20000bytes..
> It's fairly simple you just have to ioremap the memory, but then it's good
> for writing.. Currently the android ram_console uses this. How would I
> convert this area for use with your changes?

Yes, and various local implementations which do things such as stuffing
the log buffer into NVRAM as the kernel is crashing.

I do think we'd need some back-end driver arrangement which will permit
the use of stores which aren't in addressible mamory.

It's quite the can of worms, but definitely worth doing if we can get
it approximately correct.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-13 22:10     ` Andrew Morton
  0 siblings, 0 replies; 52+ messages in thread
From: Andrew Morton @ 2012-03-13 22:10 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Avery Pennarun, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

On Tue, 13 Mar 2012 10:08:51 -0700
Daniel Walker <dwalker@fifo99.com> wrote:

> On Tue, Mar 13, 2012 at 01:36:36AM -0400, Avery Pennarun wrote:
> > The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
> > that, when enabled, puts the printk buffer in a well-defined memory location
> > so that we can keep appending to it after a reboot.  The upshot is that,
> > even after a kernel panic or non-panic hard lockup, on the next boot
> > userspace will be able to grab the kernel messages leading up to it.  It
> > could then upload the messages to a server (for example) to keep crash
> > statistics.
> 
> There's currently driver/mtd/mtdoops.c, fs/pstore/, and
> drivers/staging/android/ram_console.c that do similar things
> as this. Did you investigate those for potentially modifying them to add
> this functionality ? If so what issues did you find?
> 
> I have a arm MSM G1 with persistent memory at 0x16d00000 size 20000bytes..
> It's fairly simple you just have to ioremap the memory, but then it's good
> for writing.. Currently the android ram_console uses this. How would I
> convert this area for use with your changes?

Yes, and various local implementations which do things such as stuffing
the log buffer into NVRAM as the kernel is crashing.

I do think we'd need some back-end driver arrangement which will permit
the use of stores which aren't in addressible mamory.

It's quite the can of worms, but definitely worth doing if we can get
it approximately correct.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
  2012-03-13 13:50     ` Peter Zijlstra
@ 2012-03-14  1:57       ` Daniel Walker
  -1 siblings, 0 replies; 52+ messages in thread
From: Daniel Walker @ 2012-03-14  1:57 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: David Miller, apenwarr, akpm, josh, paulmck, mingo, fdinitto,
	hannes, olaf, paul.gortmaker, tj, hpa, yinghai, linux-kernel,
	linux-mm

On Tue, Mar 13, 2012 at 02:50:04PM +0100, Peter Zijlstra wrote:
> On Mon, 2012-03-12 at 22:53 -0700, David Miller wrote:
> > From: Avery Pennarun <apenwarr@gmail.com>
> > Date: Tue, 13 Mar 2012 01:36:36 -0400
> > 
> > > The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
> > > that, when enabled, puts the printk buffer in a well-defined memory location
> > > so that we can keep appending to it after a reboot.  The upshot is that,
> > > even after a kernel panic or non-panic hard lockup, on the next boot
> > > userspace will be able to grab the kernel messages leading up to it.  It
> > > could then upload the messages to a server (for example) to keep crash
> > > statistics.
> > 
> > On some platforms there are formal ways to reserve areas of memory
> > such that the bootup firmware will know to not touch it on soft resets
> > no matter what.  For example, on Sparc there are OpenFirmware calls to
> > set aside such an area of soft-reset preserved memory.
> > 
> > I think some formal agreement with the system firmware is a lot better
> > when available, and should be explicitly accomodated in these changes
> > so that those of us with such facilities can very easily hook it up.
> 
> Shouldn't this all be near the pstore effort? I know pstore and the
> soft-reset stuff aren't quite the same, but if that's the best Sparc can
> do, then why not?
> 
> OTOH if Sparc can actually do pstore too, then it might make sense.
> 
> What I guess I'm saying is that we should try and minimize the duplicate
> efforts here.. and it seems to me that writing a soft reset x86 backend
> to pstore for those machines that don't actually have the acpi flash
> crap might be more useful and less duplicative.

I don't disagree, but pstore is written with this transactional notion in
mind. It's doesn't appear to be setup to handle just a straight memory
area. Changing mtdoops to do this would be more trivial. It would be merging
android ram_console with mtdoops basically. However, I don't know if that would
cover what Avery is doing here.

pstore does seems to have the nicest user interface (might be better in
debugfs tho). If someone wanted to move forward with pstore they
would have to write some some sort of,

int pstore_register_simple(unsigned long addr, unsigned long size);

to cover all the memory areas that aren't transaction based, or make
pstore accept a platform_device.

Daniel

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-14  1:57       ` Daniel Walker
  0 siblings, 0 replies; 52+ messages in thread
From: Daniel Walker @ 2012-03-14  1:57 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: David Miller, apenwarr, akpm, josh, paulmck, mingo, fdinitto,
	hannes, olaf, paul.gortmaker, tj, hpa, yinghai, linux-kernel,
	linux-mm

On Tue, Mar 13, 2012 at 02:50:04PM +0100, Peter Zijlstra wrote:
> On Mon, 2012-03-12 at 22:53 -0700, David Miller wrote:
> > From: Avery Pennarun <apenwarr@gmail.com>
> > Date: Tue, 13 Mar 2012 01:36:36 -0400
> > 
> > > The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
> > > that, when enabled, puts the printk buffer in a well-defined memory location
> > > so that we can keep appending to it after a reboot.  The upshot is that,
> > > even after a kernel panic or non-panic hard lockup, on the next boot
> > > userspace will be able to grab the kernel messages leading up to it.  It
> > > could then upload the messages to a server (for example) to keep crash
> > > statistics.
> > 
> > On some platforms there are formal ways to reserve areas of memory
> > such that the bootup firmware will know to not touch it on soft resets
> > no matter what.  For example, on Sparc there are OpenFirmware calls to
> > set aside such an area of soft-reset preserved memory.
> > 
> > I think some formal agreement with the system firmware is a lot better
> > when available, and should be explicitly accomodated in these changes
> > so that those of us with such facilities can very easily hook it up.
> 
> Shouldn't this all be near the pstore effort? I know pstore and the
> soft-reset stuff aren't quite the same, but if that's the best Sparc can
> do, then why not?
> 
> OTOH if Sparc can actually do pstore too, then it might make sense.
> 
> What I guess I'm saying is that we should try and minimize the duplicate
> efforts here.. and it seems to me that writing a soft reset x86 backend
> to pstore for those machines that don't actually have the acpi flash
> crap might be more useful and less duplicative.

I don't disagree, but pstore is written with this transactional notion in
mind. It's doesn't appear to be setup to handle just a straight memory
area. Changing mtdoops to do this would be more trivial. It would be merging
android ram_console with mtdoops basically. However, I don't know if that would
cover what Avery is doing here.

pstore does seems to have the nicest user interface (might be better in
debugfs tho). If someone wanted to move forward with pstore they
would have to write some some sort of,

int pstore_register_simple(unsigned long addr, unsigned long size);

to cover all the memory areas that aren't transaction based, or make
pstore accept a platform_device.

Daniel

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
  2012-03-13 22:10     ` Andrew Morton
@ 2012-03-14  2:19       ` Daniel Walker
  -1 siblings, 0 replies; 52+ messages in thread
From: Daniel Walker @ 2012-03-14  2:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Avery Pennarun, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm, Seiji Aguchi

On Tue, Mar 13, 2012 at 03:10:49PM -0700, Andrew Morton wrote:
> On Tue, 13 Mar 2012 10:08:51 -0700
> Daniel Walker <dwalker@fifo99.com> wrote:
> 
> > On Tue, Mar 13, 2012 at 01:36:36AM -0400, Avery Pennarun wrote:
> > > The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
> > > that, when enabled, puts the printk buffer in a well-defined memory location
> > > so that we can keep appending to it after a reboot.  The upshot is that,
> > > even after a kernel panic or non-panic hard lockup, on the next boot
> > > userspace will be able to grab the kernel messages leading up to it.  It
> > > could then upload the messages to a server (for example) to keep crash
> > > statistics.
> > 
> > There's currently driver/mtd/mtdoops.c, fs/pstore/, and
> > drivers/staging/android/ram_console.c that do similar things
> > as this. Did you investigate those for potentially modifying them to add
> > this functionality ? If so what issues did you find?
> > 
> > I have a arm MSM G1 with persistent memory at 0x16d00000 size 20000bytes..
> > It's fairly simple you just have to ioremap the memory, but then it's good
> > for writing.. Currently the android ram_console uses this. How would I
> > convert this area for use with your changes?
> 
> Yes, and various local implementations which do things such as stuffing
> the log buffer into NVRAM as the kernel is crashing.
> 
> I do think we'd need some back-end driver arrangement which will permit
> the use of stores which aren't in addressible mamory.
> 
> It's quite the can of worms, but definitely worth doing if we can get
> it approximately correct.

For sure.. I've had at least a couple non-OOPS based crashed I would
have liked to get logs for.

There is also this series,

http://lists.infradead.org/pipermail/kexec/2011-July/005258.html

It seems awkward that pstore is in fs/pstore/ then pstore ends up as the
"back end" where it could just be the whole solution.

Daniel

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-14  2:19       ` Daniel Walker
  0 siblings, 0 replies; 52+ messages in thread
From: Daniel Walker @ 2012-03-14  2:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Avery Pennarun, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm, Seiji Aguchi

On Tue, Mar 13, 2012 at 03:10:49PM -0700, Andrew Morton wrote:
> On Tue, 13 Mar 2012 10:08:51 -0700
> Daniel Walker <dwalker@fifo99.com> wrote:
> 
> > On Tue, Mar 13, 2012 at 01:36:36AM -0400, Avery Pennarun wrote:
> > > The last patch in this series implements a new CONFIG_PRINTK_PERSIST option
> > > that, when enabled, puts the printk buffer in a well-defined memory location
> > > so that we can keep appending to it after a reboot.  The upshot is that,
> > > even after a kernel panic or non-panic hard lockup, on the next boot
> > > userspace will be able to grab the kernel messages leading up to it.  It
> > > could then upload the messages to a server (for example) to keep crash
> > > statistics.
> > 
> > There's currently driver/mtd/mtdoops.c, fs/pstore/, and
> > drivers/staging/android/ram_console.c that do similar things
> > as this. Did you investigate those for potentially modifying them to add
> > this functionality ? If so what issues did you find?
> > 
> > I have a arm MSM G1 with persistent memory at 0x16d00000 size 20000bytes..
> > It's fairly simple you just have to ioremap the memory, but then it's good
> > for writing.. Currently the android ram_console uses this. How would I
> > convert this area for use with your changes?
> 
> Yes, and various local implementations which do things such as stuffing
> the log buffer into NVRAM as the kernel is crashing.
> 
> I do think we'd need some back-end driver arrangement which will permit
> the use of stores which aren't in addressible mamory.
> 
> It's quite the can of worms, but definitely worth doing if we can get
> it approximately correct.

For sure.. I've had at least a couple non-OOPS based crashed I would
have liked to get logs for.

There is also this series,

http://lists.infradead.org/pipermail/kexec/2011-July/005258.html

It seems awkward that pstore is in fs/pstore/ then pstore ends up as the
"back end" where it could just be the whole solution.

Daniel

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
  2012-03-13 22:10     ` Andrew Morton
@ 2012-03-14  2:21       ` Avery Pennarun
  -1 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-14  2:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Daniel Walker, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

On Tue, Mar 13, 2012 at 6:10 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Tue, 13 Mar 2012 10:08:51 -0700
> Daniel Walker <dwalker@fifo99.com> wrote:
>> There's currently driver/mtd/mtdoops.c, fs/pstore/, and
>> drivers/staging/android/ram_console.c that do similar things
>> as this. Did you investigate those for potentially modifying them to add
>> this functionality ? If so what issues did you find?
>>
>> I have a arm MSM G1 with persistent memory at 0x16d00000 size 20000bytes..
>> It's fairly simple you just have to ioremap the memory, but then it's good
>> for writing.. Currently the android ram_console uses this. How would I
>> convert this area for use with your changes?
>
> Yes, and various local implementations which do things such as stuffing
> the log buffer into NVRAM as the kernel is crashing.
>
> I do think we'd need some back-end driver arrangement which will permit
> the use of stores which aren't in addressible mamory.
>
> It's quite the can of worms, but definitely worth doing if we can get
> it approximately correct.

So okay, it seems there are a lot of competing ideas around.  In my
defense, I had looked at a couple of them (certainly not the full
variety represented in this thread) and the approach in my patch does
have a few advantages, I think:

1. It's extremely small and simple and easy to see when it's correct,
regardless of your platform (and even if your platform does or doesn't
have bootloader-level memory reservations - for example it works on
x86 with kvm).

2. It does not increase the code size or slowness of the post-init
kernel.  (There's one new __init function, but no new runtime code
except a couple of extra pointer dereferences.)

3. It doesn't introduce any new APIs; there's just good old dmesg,
whose history now goes back farther.  (/proc/kmsg is unaffected, ie.
it still only shows history since boot, so klogd won't do anything
weird.)

4. It captures much more than just panics.  I don't know why people
put so much stock in catching panics; about half the problems I've had
to deal with are hard-lockups, not panics, and any solution that only
works with panics doesn't help me much.  (Also, trying to do useful
work after a panic doesn't even work every time, but "do nothing on
panic and I'll pick it up later" does.)  So things like mtdoops or
ramoops or kexec are not a complete solution IMHO.  For example, one
method I've used already for tracking down a hard lockup is to print
short status messages at LOG_DEBUG level (ie. not to the visible
console, but it goes in the buffer) at regular intervals inside the
driver that is crashing, and use a very large printk buffer.  Then,
after a crash and reset, the collection of status messages from a
variety of test devices in the field can be uploaded to a single place
and analyzed en masse, even if there was never a crash.

So that's my justification for writing this in the first place.  I'm
certainly not opposed to all those other methods if they make people
happy, but I was hoping to make a patch so simple and unproblematic
that it could actually get into the official kernel.

Now, the big downside of my approach is that we're just taking our
chances that the bootloader and early kernel boot won't eat our
buffer.  That's pretty inelegant (albeit not a problem for my use
case), so David Miller's suggestion to make it extensible by the
platform layer seems like a fine idea to me.  It seems like, say, the
ioremap'd nvram areas and other things people have suggested should
all be doable as long as the extension API is defined correctly.  I
would appreciate some more concrete suggestions on where to stick in a
hook to get memory allocated as soon as possible, so it works with eg.
prom_retain().

On the other hand, if the kernel maintainers don't really want the
patch and you think I should replace it with one of the several
overengineered not-yet-in-kernel-anyway megastructure memory
reservation systems, I don't think I'm going to bother; continuing to
apply my nice simple patch to our modified kernel is pretty easy and
will work the way I want (specifically, no performance/memsize
overhead).

Anyway, concrete pointers for how to link very early into
prom_retain() are very welcome, if you think this patch series is not
an evolutionary dead end.  I could just poke at it, but I have a
feeling someone here already knows exactly where I need to insert the
right call to make it work.

Have fun,

Avery

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-14  2:21       ` Avery Pennarun
  0 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-14  2:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Daniel Walker, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm

On Tue, Mar 13, 2012 at 6:10 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Tue, 13 Mar 2012 10:08:51 -0700
> Daniel Walker <dwalker@fifo99.com> wrote:
>> There's currently driver/mtd/mtdoops.c, fs/pstore/, and
>> drivers/staging/android/ram_console.c that do similar things
>> as this. Did you investigate those for potentially modifying them to add
>> this functionality ? If so what issues did you find?
>>
>> I have a arm MSM G1 with persistent memory at 0x16d00000 size 20000bytes..
>> It's fairly simple you just have to ioremap the memory, but then it's good
>> for writing.. Currently the android ram_console uses this. How would I
>> convert this area for use with your changes?
>
> Yes, and various local implementations which do things such as stuffing
> the log buffer into NVRAM as the kernel is crashing.
>
> I do think we'd need some back-end driver arrangement which will permit
> the use of stores which aren't in addressible mamory.
>
> It's quite the can of worms, but definitely worth doing if we can get
> it approximately correct.

So okay, it seems there are a lot of competing ideas around.  In my
defense, I had looked at a couple of them (certainly not the full
variety represented in this thread) and the approach in my patch does
have a few advantages, I think:

1. It's extremely small and simple and easy to see when it's correct,
regardless of your platform (and even if your platform does or doesn't
have bootloader-level memory reservations - for example it works on
x86 with kvm).

2. It does not increase the code size or slowness of the post-init
kernel.  (There's one new __init function, but no new runtime code
except a couple of extra pointer dereferences.)

3. It doesn't introduce any new APIs; there's just good old dmesg,
whose history now goes back farther.  (/proc/kmsg is unaffected, ie.
it still only shows history since boot, so klogd won't do anything
weird.)

4. It captures much more than just panics.  I don't know why people
put so much stock in catching panics; about half the problems I've had
to deal with are hard-lockups, not panics, and any solution that only
works with panics doesn't help me much.  (Also, trying to do useful
work after a panic doesn't even work every time, but "do nothing on
panic and I'll pick it up later" does.)  So things like mtdoops or
ramoops or kexec are not a complete solution IMHO.  For example, one
method I've used already for tracking down a hard lockup is to print
short status messages at LOG_DEBUG level (ie. not to the visible
console, but it goes in the buffer) at regular intervals inside the
driver that is crashing, and use a very large printk buffer.  Then,
after a crash and reset, the collection of status messages from a
variety of test devices in the field can be uploaded to a single place
and analyzed en masse, even if there was never a crash.

So that's my justification for writing this in the first place.  I'm
certainly not opposed to all those other methods if they make people
happy, but I was hoping to make a patch so simple and unproblematic
that it could actually get into the official kernel.

Now, the big downside of my approach is that we're just taking our
chances that the bootloader and early kernel boot won't eat our
buffer.  That's pretty inelegant (albeit not a problem for my use
case), so David Miller's suggestion to make it extensible by the
platform layer seems like a fine idea to me.  It seems like, say, the
ioremap'd nvram areas and other things people have suggested should
all be doable as long as the extension API is defined correctly.  I
would appreciate some more concrete suggestions on where to stick in a
hook to get memory allocated as soon as possible, so it works with eg.
prom_retain().

On the other hand, if the kernel maintainers don't really want the
patch and you think I should replace it with one of the several
overengineered not-yet-in-kernel-anyway megastructure memory
reservation systems, I don't think I'm going to bother; continuing to
apply my nice simple patch to our modified kernel is pretty easy and
will work the way I want (specifically, no performance/memsize
overhead).

Anyway, concrete pointers for how to link very early into
prom_retain() are very welcome, if you think this patch series is not
an evolutionary dead end.  I could just poke at it, but I have a
feeling someone here already knows exactly where I need to insert the
right call to make it work.

Have fun,

Avery

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 4/5] printk: use alloc_bootmem() instead of memblock_alloc().
  2012-03-13 21:50         ` Yinghai Lu
@ 2012-03-14  2:23           ` Avery Pennarun
  -1 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-14  2:23 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, linux-kernel, linux-mm

On Tue, Mar 13, 2012 at 5:50 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> Now you put back bootmem calling early, will cause confusion.
[...]
> we should use adding memblock_alloc calling instead... go backward...

Okay, I'm convinced.  I've updated my series so CONFIG_PRINTK_PERSIST
only works with HAVE_MEMBLOCK, and I've removed the patch to
unconditionally call bootmem in the existing non-PRINTK_PERSIST case.

(I'll upload the patches later once the other threads play out.)

Thanks for the quick feedback!

Avery

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 4/5] printk: use alloc_bootmem() instead of memblock_alloc().
@ 2012-03-14  2:23           ` Avery Pennarun
  0 siblings, 0 replies; 52+ messages in thread
From: Avery Pennarun @ 2012-03-14  2:23 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Andrew Morton, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, linux-kernel, linux-mm

On Tue, Mar 13, 2012 at 5:50 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> Now you put back bootmem calling early, will cause confusion.
[...]
> we should use adding memblock_alloc calling instead... go backward...

Okay, I'm convinced.  I've updated my series so CONFIG_PRINTK_PERSIST
only works with HAVE_MEMBLOCK, and I've removed the patch to
unconditionally call bootmem in the existing non-PRINTK_PERSIST case.

(I'll upload the patches later once the other threads play out.)

Thanks for the quick feedback!

Avery

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH 0/5] Persist printk buffer across reboots.
  2012-03-14  2:19       ` Daniel Walker
@ 2012-03-15 22:10         ` Seiji Aguchi
  -1 siblings, 0 replies; 52+ messages in thread
From: Seiji Aguchi @ 2012-03-15 22:10 UTC (permalink / raw)
  To: Daniel Walker, Andrew Morton
  Cc: Avery Pennarun, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm, Luck,
	Tony (tony.luck@intel.com)

Hi,

> There is also this series,
> 
> http://lists.infradead.org/pipermail/kexec/2011-July/005258.html
> 
> It seems awkward that pstore is in fs/pstore/ then pstore ends up as the "back end" where it could just be the whole solution.

I just wanted to avoid deadlocks of pstore and its drivers such as mtdoops, ramoops, and efi_pstore in panic case.
That is still under discussion in lkml.

I have no objection to modifying mtdoops/ram_console to use pstore.

>pstore does seems to have the nicest user interface (might be better in debugfs tho). If someone wanted to move forward with pstore they would have to write some some sort of,
>
>int pstore_register_simple(unsigned long addr, unsigned long size);
>
>to cover all the memory areas that aren't transaction based, or make pstore accept a platform_device.

If you would like to introduce new feature to pstore, Tony Luck is the appropriate person to discuss.

Seiji

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH 0/5] Persist printk buffer across reboots.
@ 2012-03-15 22:10         ` Seiji Aguchi
  0 siblings, 0 replies; 52+ messages in thread
From: Seiji Aguchi @ 2012-03-15 22:10 UTC (permalink / raw)
  To: Daniel Walker, Andrew Morton
  Cc: Avery Pennarun, Josh Triplett, Paul E. McKenney, Ingo Molnar,
	David S. Miller, Peter Zijlstra, Fabio M. Di Nitto,
	Johannes Weiner, Olaf Hering, Paul Gortmaker, Tejun Heo,
	H. Peter Anvin, Yinghai LU, linux-kernel, linux-mm, Luck,
	Tony (tony.luck@intel.com)

Hi,

> There is also this series,
> 
> http://lists.infradead.org/pipermail/kexec/2011-July/005258.html
> 
> It seems awkward that pstore is in fs/pstore/ then pstore ends up as the "back end" where it could just be the whole solution.

I just wanted to avoid deadlocks of pstore and its drivers such as mtdoops, ramoops, and efi_pstore in panic case.
That is still under discussion in lkml.

I have no objection to modifying mtdoops/ram_console to use pstore.

>pstore does seems to have the nicest user interface (might be better in debugfs tho). If someone wanted to move forward with pstore they would have to write some some sort of,
>
>int pstore_register_simple(unsigned long addr, unsigned long size);
>
>to cover all the memory areas that aren't transaction based, or make pstore accept a platform_device.

If you would like to introduce new feature to pstore, Tony Luck is the appropriate person to discuss.

Seiji

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2012-03-15 22:22 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-13  5:36 [PATCH 0/5] Persist printk buffer across reboots Avery Pennarun
2012-03-13  5:36 ` Avery Pennarun
2012-03-13  5:36 ` [PATCH 1/5] mm: bootmem: BUG() if you try to allocate bootmem too late Avery Pennarun
2012-03-13  5:36   ` Avery Pennarun
2012-03-13  5:36 ` [PATCH 2/5] mm: bootmem: it's okay to reserve_bootmem an invalid address Avery Pennarun
2012-03-13  5:36   ` Avery Pennarun
2012-03-13  5:36 ` [PATCH 3/5] mm: nobootmem: implement reserve_bootmem() in terms of memblock Avery Pennarun
2012-03-13  5:36   ` Avery Pennarun
2012-03-13  5:36 ` [PATCH 4/5] printk: use alloc_bootmem() instead of memblock_alloc() Avery Pennarun
2012-03-13  5:36   ` Avery Pennarun
2012-03-13  6:13   ` Yinghai Lu
2012-03-13  6:13     ` Yinghai Lu
2012-03-13  6:40     ` Avery Pennarun
2012-03-13  6:40       ` Avery Pennarun
2012-03-13  8:26       ` Ingo Molnar
2012-03-13  8:26         ` Ingo Molnar
2012-03-13 21:50       ` Yinghai Lu
2012-03-13 21:50         ` Yinghai Lu
2012-03-14  2:23         ` Avery Pennarun
2012-03-14  2:23           ` Avery Pennarun
2012-03-13  5:36 ` [PATCH 5/5] printk: CONFIG_PRINTK_PERSIST: persist printk buffer across reboots Avery Pennarun
2012-03-13  5:36   ` Avery Pennarun
2012-03-13  5:53 ` [PATCH 0/5] Persist " David Miller
2012-03-13  5:53   ` David Miller
2012-03-13  6:00   ` Avery Pennarun
2012-03-13  6:00     ` Avery Pennarun
2012-03-13  6:50     ` David Miller
2012-03-13  6:50       ` David Miller
2012-03-13  7:14       ` Avery Pennarun
2012-03-13  7:14         ` Avery Pennarun
2012-03-13  7:18         ` David Miller
2012-03-13  7:18           ` David Miller
2012-03-13  8:10           ` Avery Pennarun
2012-03-13  8:10             ` Avery Pennarun
2012-03-13  8:16             ` David Miller
2012-03-13  8:16               ` David Miller
2012-03-13 13:50   ` Peter Zijlstra
2012-03-13 13:50     ` Peter Zijlstra
2012-03-14  1:57     ` Daniel Walker
2012-03-14  1:57       ` Daniel Walker
2012-03-13  8:32 ` Stephen Boyd
2012-03-13  8:32   ` Stephen Boyd
2012-03-13 17:08 ` Daniel Walker
2012-03-13 17:08   ` Daniel Walker
2012-03-13 22:10   ` Andrew Morton
2012-03-13 22:10     ` Andrew Morton
2012-03-14  2:19     ` Daniel Walker
2012-03-14  2:19       ` Daniel Walker
2012-03-15 22:10       ` Seiji Aguchi
2012-03-15 22:10         ` Seiji Aguchi
2012-03-14  2:21     ` Avery Pennarun
2012-03-14  2:21       ` Avery Pennarun

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.