All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH v1 0/2] printk: Shared kernel logging
@ 2016-09-30  0:55 Sean Hudson
  2016-09-30  0:55 ` [RFC PATCH v1 1/2] printk: collect offsets into replaceable structure Sean Hudson
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Sean Hudson @ 2016-09-30  0:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sean Hudson, Andrew Morton, Thomas Gleixner, Yang Shi,
	Ingo Molnar, Prarit Bhargava, Rasmus Villemoes, Paul E. McKenney,
	Yaowei Bai, Kees Cook, Andrey Ryabinin, Johannes Weiner,
	Tejun Heo, Arnd Bergmann, Ard Biesheuvel, Thomas Garnier,
	Petr Mladek, Nicolas Pitre, Sergey Senozhatsky, Borislav Petkov,
	Tim Bird, Steven Rostedt, Grant Likely

This patch set is based on Linus' v4.8-rc8 tag.

This debug feature allows the kernel to use an external buffer and control
block for kernel log messages. The feature is controlled by an optional
command line parameter. The existing buffer and control block can contain
existing log messages from previous boot cycles and/or the bootloader. The
command line parameter was chosen for flexibility, cross arch portability,
and the ability to dynamically enable/disable this feature. The parameter
specifies the address of a control block used to replace the default log
buffer. Existing bootloader and kernel log messages are kept, in order,
inside the new buffer. After a boot that preserves the buffer contents, a
bootloader can display both kernel and bootloader log entries from multiple,
previous boots. It also allows the kernel to display bootloader log entries
along with its own messages.

This feature is intended for debug purposes and has no effect unless the
command line parameter is specified. Further, it validates the passed
control block carefully and if any checks fail, it falls back to the default
behaviour. As such, it can be left enabled by default.

Memory Reservation

This feature expects the bootloader to reserve/preserve the shared buffer
memory. This reservation needs to prevent the kernel from overwriting the
external log control block and log entries. In my testing, I've used the
'fdt' commands in uboot to dynamically inject reserved memory regions via
the DT to the kernel.

Sean Hudson (2):
  printk: collect offsets into replaceable structure
  printk: external log buffer (CONFIG_LOGBUFFER)

 init/Kconfig           |  12 +
 init/main.c            |   2 +
 kernel/printk/printk.c | 598 +++++++++++++++++++++++++++++++++++--------------
 3 files changed, 445 insertions(+), 167 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC PATCH v1 1/2] printk: collect offsets into replaceable structure
  2016-09-30  0:55 [RFC PATCH v1 0/2] printk: Shared kernel logging Sean Hudson
@ 2016-09-30  0:55 ` Sean Hudson
  2016-09-30  0:55 ` [RFC PATCH v1 2/2] printk: external log buffer (CONFIG_LOGBUFFER) Sean Hudson
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Sean Hudson @ 2016-09-30  0:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sean Hudson, Andrew Morton, Thomas Gleixner, Yang Shi,
	Ingo Molnar, Prarit Bhargava, Rasmus Villemoes, Paul E. McKenney,
	Yaowei Bai, Kees Cook, Andrey Ryabinin, Johannes Weiner,
	Tejun Heo, Arnd Bergmann, Ard Biesheuvel, Thomas Garnier,
	Petr Mladek, Nicolas Pitre, Sergey Senozhatsky, Borislav Petkov,
	Tim Bird, Steven Rostedt, Grant Likely

Currently, printk relies on several indices that are declared as static
global variables in printk.c.  This patch collects those into a single
structure referenced by a pointer.  This allows easier replacement of
these indices and pinning to a specific locatino.

Signed-off-by: Sean Hudson <sean_hudson@mentor.com>
---
 kernel/printk/printk.c | 335 +++++++++++++++++++++++++------------------------
 1 file changed, 172 insertions(+), 163 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index eea6dbc..7a441f5 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -353,28 +353,6 @@ DEFINE_RAW_SPINLOCK(logbuf_lock);
 
 #ifdef CONFIG_PRINTK
 DECLARE_WAIT_QUEUE_HEAD(log_wait);
-/* the next printk record to read by syslog(READ) or /proc/kmsg */
-static u64 syslog_seq;
-static u32 syslog_idx;
-static enum log_flags syslog_prev;
-static size_t syslog_partial;
-
-/* index and sequence number of the first record stored in the buffer */
-static u64 log_first_seq;
-static u32 log_first_idx;
-
-/* index and sequence number of the next record to store in the buffer */
-static u64 log_next_seq;
-static u32 log_next_idx;
-
-/* the next printk record to write to the console */
-static u64 console_seq;
-static u32 console_idx;
-static enum log_flags console_prev;
-
-/* the next printk record to read after the last 'clear' command */
-static u64 clear_seq;
-static u32 clear_idx;
 
 #define PREFIX_MAX		32
 #define LOG_LINE_MAX		(1024 - PREFIX_MAX)
@@ -386,19 +364,54 @@ static u32 clear_idx;
 #define LOG_ALIGN __alignof__(struct printk_log)
 #define __LOG_BUF_LEN (1 << CONFIG_LOG_BUF_SHIFT)
 static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
-static char *log_buf = __log_buf;
-static u32 log_buf_len = __LOG_BUF_LEN;
+
+/*
+ * This control block collects tracking offsets for the log.
+ */
+struct lcb_t {
+	/* Pointer to log buffer space and length of space */
+	char *log_buf;
+	u32 log_buf_len;
+
+	/* index and sequence of the first record stored in the buffer */
+	u64 log_first_seq;
+	u32 log_first_idx;
+
+	/* index and sequence of the next record to store in the buffer */
+	u64 log_next_seq;
+	u32 log_next_idx;
+
+	/* the next printk record to read by syslog(READ) or /proc/kmsg */
+	u64 syslog_seq;
+	u32 syslog_idx;
+	enum log_flags syslog_prev;
+	size_t syslog_partial;
+
+	/* the next printk record to write to the console */
+	u64 console_seq;
+	u32 console_idx;
+	enum log_flags console_prev;
+
+	/* the next printk record to read after the last 'clear' command */
+	u64 clear_seq;
+	u32 clear_idx;
+} lcb_t;
+
+static struct lcb_t __lcb = {
+	__log_buf, __LOG_BUF_LEN, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
+
+static struct lcb_t *lcb = &__lcb;
 
 /* Return log buffer address */
 char *log_buf_addr_get(void)
 {
-	return log_buf;
+	return lcb->log_buf;
 }
 
 /* Return log buffer size */
 u32 log_buf_len_get(void)
 {
-	return log_buf_len;
+	return lcb->log_buf_len;
 }
 
 /* human readable text of the record */
@@ -416,21 +429,21 @@ static char *log_dict(const struct printk_log *msg)
 /* get record by index; idx must point to valid msg */
 static struct printk_log *log_from_idx(u32 idx)
 {
-	struct printk_log *msg = (struct printk_log *)(log_buf + idx);
+	struct printk_log *msg = (struct printk_log *)(lcb->log_buf + idx);
 
 	/*
 	 * A length == 0 record is the end of buffer marker. Wrap around and
 	 * read the message at the start of the buffer.
 	 */
 	if (!msg->len)
-		return (struct printk_log *)log_buf;
+		return (struct printk_log *)lcb->log_buf;
 	return msg;
 }
 
 /* get next record; idx must point to valid msg */
 static u32 log_next(u32 idx)
 {
-	struct printk_log *msg = (struct printk_log *)(log_buf + idx);
+	struct printk_log *msg = (struct printk_log *)(lcb->log_buf + idx);
 
 	/* length == 0 indicates the end of the buffer; wrap */
 	/*
@@ -439,7 +452,7 @@ static u32 log_next(u32 idx)
 	 * return the one after that.
 	 */
 	if (!msg->len) {
-		msg = (struct printk_log *)log_buf;
+		msg = (struct printk_log *)lcb->log_buf;
 		return msg->len;
 	}
 	return idx + msg->len;
@@ -458,10 +471,11 @@ static int logbuf_has_space(u32 msg_size, bool empty)
 {
 	u32 free;
 
-	if (log_next_idx > log_first_idx || empty)
-		free = max(log_buf_len - log_next_idx, log_first_idx);
+	if (lcb->log_next_idx > lcb->log_first_idx || empty)
+		free = max(lcb->log_buf_len - lcb->log_next_idx,
+				lcb->log_first_idx);
 	else
-		free = log_first_idx - log_next_idx;
+		free = lcb->log_first_idx - lcb->log_next_idx;
 
 	/*
 	 * We need space also for an empty header that signalizes wrapping
@@ -472,20 +486,20 @@ static int logbuf_has_space(u32 msg_size, bool empty)
 
 static int log_make_free_space(u32 msg_size)
 {
-	while (log_first_seq < log_next_seq &&
+	while (lcb->log_first_seq < lcb->log_next_seq &&
 	       !logbuf_has_space(msg_size, false)) {
 		/* drop old messages until we have enough contiguous space */
-		log_first_idx = log_next(log_first_idx);
-		log_first_seq++;
+		lcb->log_first_idx = log_next(lcb->log_first_idx);
+		lcb->log_first_seq++;
 	}
 
-	if (clear_seq < log_first_seq) {
-		clear_seq = log_first_seq;
-		clear_idx = log_first_idx;
+	if (lcb->clear_seq < lcb->log_first_seq) {
+		lcb->clear_seq = lcb->log_first_seq;
+		lcb->clear_idx = lcb->log_first_idx;
 	}
 
 	/* sequence numbers are equal, so the log buffer is empty */
-	if (logbuf_has_space(msg_size, log_first_seq == log_next_seq))
+	if (logbuf_has_space(msg_size, lcb->log_first_seq == lcb->log_next_seq))
 		return 0;
 
 	return -ENOMEM;
@@ -518,7 +532,7 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len,
 	 * The message should not take the whole buffer. Otherwise, it might
 	 * get removed too soon.
 	 */
-	u32 max_text_len = log_buf_len / MAX_LOG_TAKE_PART;
+	u32 max_text_len = lcb->log_buf_len / MAX_LOG_TAKE_PART;
 	if (*text_len > max_text_len)
 		*text_len = max_text_len;
 	/* enable the warning message */
@@ -551,18 +565,20 @@ static int log_store(int facility, int level,
 			return 0;
 	}
 
-	if (log_next_idx + size + sizeof(struct printk_log) > log_buf_len) {
+	if (lcb->log_next_idx + size + sizeof(struct printk_log) >
+			lcb->log_buf_len) {
 		/*
 		 * This message + an additional empty header does not fit
 		 * at the end of the buffer. Add an empty header with len == 0
 		 * to signify a wrap around.
 		 */
-		memset(log_buf + log_next_idx, 0, sizeof(struct printk_log));
-		log_next_idx = 0;
+		memset(lcb->log_buf + lcb->log_next_idx, 0,
+				sizeof(struct printk_log));
+		lcb->log_next_idx = 0;
 	}
 
 	/* fill message */
-	msg = (struct printk_log *)(log_buf + log_next_idx);
+	msg = (struct printk_log *)(lcb->log_buf + lcb->log_next_idx);
 	memcpy(log_text(msg), text, text_len);
 	msg->text_len = text_len;
 	if (trunc_msg_len) {
@@ -582,8 +598,8 @@ static int log_store(int facility, int level,
 	msg->len = size;
 
 	/* insert message */
-	log_next_idx += msg->len;
-	log_next_seq++;
+	lcb->log_next_idx += msg->len;
+	lcb->log_next_seq++;
 
 	return msg->text_len;
 }
@@ -801,7 +817,7 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
 	if (ret)
 		return ret;
 	raw_spin_lock_irq(&logbuf_lock);
-	while (user->seq == log_next_seq) {
+	while (user->seq == lcb->log_next_seq) {
 		if (file->f_flags & O_NONBLOCK) {
 			ret = -EAGAIN;
 			raw_spin_unlock_irq(&logbuf_lock);
@@ -810,16 +826,16 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
 
 		raw_spin_unlock_irq(&logbuf_lock);
 		ret = wait_event_interruptible(log_wait,
-					       user->seq != log_next_seq);
+					       user->seq != lcb->log_next_seq);
 		if (ret)
 			goto out;
 		raw_spin_lock_irq(&logbuf_lock);
 	}
 
-	if (user->seq < log_first_seq) {
+	if (user->seq < lcb->log_first_seq) {
 		/* our last seen message is gone, return error and reset */
-		user->idx = log_first_idx;
-		user->seq = log_first_seq;
+		user->idx = lcb->log_first_idx;
+		user->seq = lcb->log_first_seq;
 		ret = -EPIPE;
 		raw_spin_unlock_irq(&logbuf_lock);
 		goto out;
@@ -866,8 +882,8 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence)
 	switch (whence) {
 	case SEEK_SET:
 		/* the first record */
-		user->idx = log_first_idx;
-		user->seq = log_first_seq;
+		user->idx = lcb->log_first_idx;
+		user->seq = lcb->log_first_seq;
 		break;
 	case SEEK_DATA:
 		/*
@@ -875,13 +891,13 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence)
 		 * like issued by 'dmesg -c'. Reading /dev/kmsg itself
 		 * changes no global state, and does not clear anything.
 		 */
-		user->idx = clear_idx;
-		user->seq = clear_seq;
+		user->idx = lcb->clear_idx;
+		user->seq = lcb->clear_seq;
 		break;
 	case SEEK_END:
 		/* after the last record */
-		user->idx = log_next_idx;
-		user->seq = log_next_seq;
+		user->idx = lcb->log_next_idx;
+		user->seq = lcb->log_next_seq;
 		break;
 	default:
 		ret = -EINVAL;
@@ -901,9 +917,9 @@ static unsigned int devkmsg_poll(struct file *file, poll_table *wait)
 	poll_wait(file, &log_wait, wait);
 
 	raw_spin_lock_irq(&logbuf_lock);
-	if (user->seq < log_next_seq) {
+	if (user->seq < lcb->log_next_seq) {
 		/* return error when data has vanished underneath us */
-		if (user->seq < log_first_seq)
+		if (user->seq < lcb->log_first_seq)
 			ret = POLLIN|POLLRDNORM|POLLERR|POLLPRI;
 		else
 			ret = POLLIN|POLLRDNORM;
@@ -939,8 +955,8 @@ static int devkmsg_open(struct inode *inode, struct file *file)
 	mutex_init(&user->lock);
 
 	raw_spin_lock_irq(&logbuf_lock);
-	user->idx = log_first_idx;
-	user->seq = log_first_seq;
+	user->idx = lcb->log_first_idx;
+	user->seq = lcb->log_first_seq;
 	raw_spin_unlock_irq(&logbuf_lock);
 
 	file->private_data = user;
@@ -981,11 +997,11 @@ const struct file_operations kmsg_fops = {
  */
 void log_buf_kexec_setup(void)
 {
-	VMCOREINFO_SYMBOL(log_buf);
-	VMCOREINFO_SYMBOL(log_buf_len);
-	VMCOREINFO_SYMBOL(log_first_idx);
-	VMCOREINFO_SYMBOL(clear_idx);
-	VMCOREINFO_SYMBOL(log_next_idx);
+	VMCOREINFO_SYMBOL(lcb->log_buf);
+	VMCOREINFO_SYMBOL(lcb->log_buf_len);
+	VMCOREINFO_SYMBOL(lcb->log_first_idx);
+	VMCOREINFO_SYMBOL(lcb->clear_idx);
+	VMCOREINFO_SYMBOL(lcb->log_next_idx);
 	/*
 	 * Export struct printk_log size and field offsets. User space tools can
 	 * parse it and detect any changes to structure down the line.
@@ -1006,7 +1022,7 @@ static void __init log_buf_len_update(unsigned size)
 {
 	if (size)
 		size = roundup_pow_of_two(size);
-	if (size > log_buf_len)
+	if (size > lcb->log_buf_len)
 		new_log_buf_len = size;
 }
 
@@ -1060,7 +1076,7 @@ void __init setup_log_buf(int early)
 	char *new_log_buf;
 	int free;
 
-	if (log_buf != __log_buf)
+	if (!lcb || lcb->log_buf != __log_buf)
 		return;
 
 	if (!early && !new_log_buf_len)
@@ -1084,14 +1100,14 @@ void __init setup_log_buf(int early)
 	}
 
 	raw_spin_lock_irqsave(&logbuf_lock, flags);
-	log_buf_len = new_log_buf_len;
-	log_buf = new_log_buf;
+	lcb->log_buf_len = new_log_buf_len;
+	lcb->log_buf = new_log_buf;
 	new_log_buf_len = 0;
-	free = __LOG_BUF_LEN - log_next_idx;
-	memcpy(log_buf, __log_buf, __LOG_BUF_LEN);
+	free = __LOG_BUF_LEN - lcb->log_next_idx;
+	memcpy(lcb->log_buf, __log_buf, __LOG_BUF_LEN);
 	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 
-	pr_info("log_buf_len: %d bytes\n", log_buf_len);
+	pr_info("log_buf_len: %d bytes\n", lcb->log_buf_len);
 	pr_info("early log buf free: %d(%d%%)\n",
 		free, (free * 100) / __LOG_BUF_LEN);
 }
@@ -1287,33 +1303,33 @@ static int syslog_print(char __user *buf, int size)
 		size_t skip;
 
 		raw_spin_lock_irq(&logbuf_lock);
-		if (syslog_seq < log_first_seq) {
+		if (lcb->syslog_seq < lcb->log_first_seq) {
 			/* messages are gone, move to first one */
-			syslog_seq = log_first_seq;
-			syslog_idx = log_first_idx;
-			syslog_prev = 0;
-			syslog_partial = 0;
+			lcb->syslog_seq = lcb->log_first_seq;
+			lcb->syslog_idx = lcb->log_first_idx;
+			lcb->syslog_prev = 0;
+			lcb->syslog_partial = 0;
 		}
-		if (syslog_seq == log_next_seq) {
+		if (lcb->syslog_seq == lcb->log_next_seq) {
 			raw_spin_unlock_irq(&logbuf_lock);
 			break;
 		}
 
-		skip = syslog_partial;
-		msg = log_from_idx(syslog_idx);
-		n = msg_print_text(msg, syslog_prev, true, text,
+		skip = lcb->syslog_partial;
+		msg = log_from_idx(lcb->syslog_idx);
+		n = msg_print_text(msg, lcb->syslog_prev, true, text,
 				   LOG_LINE_MAX + PREFIX_MAX);
-		if (n - syslog_partial <= size) {
+		if (n - lcb->syslog_partial <= size) {
 			/* message fits into buffer, move forward */
-			syslog_idx = log_next(syslog_idx);
-			syslog_seq++;
-			syslog_prev = msg->flags;
-			n -= syslog_partial;
-			syslog_partial = 0;
+			lcb->syslog_idx = log_next(lcb->syslog_idx);
+			lcb->syslog_seq++;
+			lcb->syslog_prev = msg->flags;
+			n -= lcb->syslog_partial;
+			lcb->syslog_partial = 0;
 		} else if (!len){
 			/* partial read(), remember position */
 			n = size;
-			syslog_partial += n;
+			lcb->syslog_partial += n;
 		} else
 			n = 0;
 		raw_spin_unlock_irq(&logbuf_lock);
@@ -1356,10 +1372,10 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 		 * Find first record that fits, including all following records,
 		 * into the user-provided buffer for this dump.
 		 */
-		seq = clear_seq;
-		idx = clear_idx;
+		seq = lcb->clear_seq;
+		idx = lcb->clear_idx;
 		prev = 0;
-		while (seq < log_next_seq) {
+		while (seq < lcb->log_next_seq) {
 			struct printk_log *msg = log_from_idx(idx);
 
 			len += msg_print_text(msg, prev, true, NULL, 0);
@@ -1369,10 +1385,10 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 		}
 
 		/* move first record forward until length fits into the buffer */
-		seq = clear_seq;
-		idx = clear_idx;
+		seq = lcb->clear_seq;
+		idx = lcb->clear_idx;
 		prev = 0;
-		while (len > size && seq < log_next_seq) {
+		while (len > size && seq < lcb->log_next_seq) {
 			struct printk_log *msg = log_from_idx(idx);
 
 			len -= msg_print_text(msg, prev, true, NULL, 0);
@@ -1382,7 +1398,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 		}
 
 		/* last message fitting into this dump */
-		next_seq = log_next_seq;
+		next_seq = lcb->log_next_seq;
 
 		len = 0;
 		while (len >= 0 && seq < next_seq) {
@@ -1406,18 +1422,18 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 				len += textlen;
 			raw_spin_lock_irq(&logbuf_lock);
 
-			if (seq < log_first_seq) {
+			if (seq < lcb->log_first_seq) {
 				/* messages are gone, move to next one */
-				seq = log_first_seq;
-				idx = log_first_idx;
+				seq = lcb->log_first_seq;
+				idx = lcb->log_first_idx;
 				prev = 0;
 			}
 		}
 	}
 
 	if (clear) {
-		clear_seq = log_next_seq;
-		clear_idx = log_next_idx;
+		lcb->clear_seq = lcb->log_next_seq;
+		lcb->clear_idx = lcb->log_next_idx;
 	}
 	raw_spin_unlock_irq(&logbuf_lock);
 
@@ -1452,7 +1468,7 @@ int do_syslog(int type, char __user *buf, int len, int source)
 			goto out;
 		}
 		error = wait_event_interruptible(log_wait,
-						 syslog_seq != log_next_seq);
+				lcb->syslog_seq != lcb->log_next_seq);
 		if (error)
 			goto out;
 		error = syslog_print(buf, len);
@@ -1507,12 +1523,12 @@ int do_syslog(int type, char __user *buf, int len, int source)
 	/* Number of chars in the log buffer */
 	case SYSLOG_ACTION_SIZE_UNREAD:
 		raw_spin_lock_irq(&logbuf_lock);
-		if (syslog_seq < log_first_seq) {
+		if (lcb->syslog_seq < lcb->log_first_seq) {
 			/* messages are gone, move to first one */
-			syslog_seq = log_first_seq;
-			syslog_idx = log_first_idx;
-			syslog_prev = 0;
-			syslog_partial = 0;
+			lcb->syslog_seq = lcb->log_first_seq;
+			lcb->syslog_idx = lcb->log_first_idx;
+			lcb->syslog_prev = 0;
+			lcb->syslog_partial = 0;
 		}
 		if (source == SYSLOG_FROM_PROC) {
 			/*
@@ -1520,14 +1536,14 @@ int do_syslog(int type, char __user *buf, int len, int source)
 			 * for pending data, not the size; return the count of
 			 * records, not the length.
 			 */
-			error = log_next_seq - syslog_seq;
+			error = lcb->log_next_idx - lcb->syslog_idx;
 		} else {
-			u64 seq = syslog_seq;
-			u32 idx = syslog_idx;
-			enum log_flags prev = syslog_prev;
+			u64 seq = lcb->syslog_seq;
+			u32 idx = lcb->syslog_idx;
+			enum log_flags prev = lcb->syslog_prev;
 
 			error = 0;
-			while (seq < log_next_seq) {
+			while (seq < lcb->log_next_seq) {
 				struct printk_log *msg = log_from_idx(idx);
 
 				error += msg_print_text(msg, prev, true, NULL, 0);
@@ -1535,13 +1551,13 @@ int do_syslog(int type, char __user *buf, int len, int source)
 				seq++;
 				prev = msg->flags;
 			}
-			error -= syslog_partial;
+			error -= lcb->syslog_partial;
 		}
 		raw_spin_unlock_irq(&logbuf_lock);
 		break;
 	/* Size of the log buffer */
 	case SYSLOG_ACTION_SIZE_BUFFER:
-		error = log_buf_len;
+		error = lcb->log_buf_len;
 		break;
 	default:
 		error = -EINVAL;
@@ -1558,7 +1574,7 @@ SYSCALL_DEFINE3(syslog, int, type, char __user *, buf, int, len)
 
 /*
  * Call the console drivers, asking them to write out
- * log_buf[start] to log_buf[end - 1].
+ * lcb->log_buf[start] to lcb->log_buf[end - 1].
  * The console_lock must be held.
  */
 static void call_console_drivers(int level,
@@ -1710,7 +1726,7 @@ static size_t cont_print_text(char *text, size_t size)
 	size_t textlen = 0;
 	size_t len;
 
-	if (cont.cons == 0 && (console_prev & LOG_NEWLINE)) {
+	if (cont.cons == 0 && (lcb->console_prev & LOG_NEWLINE)) {
 		textlen += print_time(cont.ts_nsec, text);
 		size -= textlen;
 	}
@@ -1985,15 +2001,6 @@ EXPORT_SYMBOL(printk);
 #define LOG_LINE_MAX		0
 #define PREFIX_MAX		0
 
-static u64 syslog_seq;
-static u32 syslog_idx;
-static u64 console_seq;
-static u32 console_idx;
-static enum log_flags syslog_prev;
-static u64 log_first_seq;
-static u32 log_first_idx;
-static u64 log_next_seq;
-static enum log_flags console_prev;
 static struct cont {
 	size_t len;
 	size_t cons;
@@ -2308,7 +2315,7 @@ static void console_cont_flush(char *text, size_t size)
 	 * busy. The earlier ones need to be printed before this one, we
 	 * did not flush any fragment so far, so just let it queue up.
 	 */
-	if (console_seq < log_next_seq && !cont.cons)
+	if (lcb->console_seq < lcb->log_next_seq && !cont.cons)
 		goto out;
 
 	len = cont_print_text(text, size);
@@ -2385,27 +2392,28 @@ again:
 		int level;
 
 		raw_spin_lock_irqsave(&logbuf_lock, flags);
-		if (seen_seq != log_next_seq) {
+		if (seen_seq != lcb->log_next_seq) {
 			wake_klogd = true;
-			seen_seq = log_next_seq;
+			seen_seq = lcb->log_next_seq;
 		}
 
-		if (console_seq < log_first_seq) {
+		if (lcb->console_seq < lcb->log_first_seq) {
 			len = sprintf(text, "** %u printk messages dropped ** ",
-				      (unsigned)(log_first_seq - console_seq));
+				      (unsigned int)(lcb->log_first_seq -
+						 lcb->console_seq));
 
 			/* messages are gone, move to first one */
-			console_seq = log_first_seq;
-			console_idx = log_first_idx;
-			console_prev = 0;
+			lcb->console_seq = lcb->log_first_seq;
+			lcb->console_idx = lcb->log_first_idx;
+			lcb->console_prev = 0;
 		} else {
 			len = 0;
 		}
 skip:
-		if (console_seq == log_next_seq)
+		if (lcb->console_seq == lcb->log_next_seq)
 			break;
 
-		msg = log_from_idx(console_idx);
+		msg = log_from_idx(lcb->console_idx);
 		level = msg->level;
 		if ((msg->flags & LOG_NOCONS) ||
 				suppress_message_printing(level)) {
@@ -2414,32 +2422,33 @@ skip:
 			 * directly to the console when we received it, and
 			 * record that has level above the console loglevel.
 			 */
-			console_idx = log_next(console_idx);
-			console_seq++;
+			lcb->console_idx = log_next(lcb->console_idx);
+			lcb->console_seq++;
 			/*
 			 * We will get here again when we register a new
 			 * CON_PRINTBUFFER console. Clear the flag so we
 			 * will properly dump everything later.
 			 */
 			msg->flags &= ~LOG_NOCONS;
-			console_prev = msg->flags;
+			lcb->console_prev = msg->flags;
 			goto skip;
 		}
 
-		len += msg_print_text(msg, console_prev, false,
+		len += msg_print_text(msg, lcb->console_prev, false,
 				      text + len, sizeof(text) - len);
 		if (nr_ext_console_drivers) {
 			ext_len = msg_print_ext_header(ext_text,
 						sizeof(ext_text),
-						msg, console_seq, console_prev);
+						msg, lcb->console_seq,
+						lcb->console_prev);
 			ext_len += msg_print_ext_body(ext_text + ext_len,
 						sizeof(ext_text) - ext_len,
 						log_dict(msg), msg->dict_len,
 						log_text(msg), msg->text_len);
 		}
-		console_idx = log_next(console_idx);
-		console_seq++;
-		console_prev = msg->flags;
+		lcb->console_idx = log_next(lcb->console_idx);
+		lcb->console_seq++;
+		lcb->console_prev = msg->flags;
 		raw_spin_unlock(&logbuf_lock);
 
 		stop_critical_timings();	/* don't trace print latency */
@@ -2467,7 +2476,7 @@ skip:
 	 * flush, no worries.
 	 */
 	raw_spin_lock(&logbuf_lock);
-	retry = console_seq != log_next_seq;
+	retry = lcb->console_seq != lcb->log_next_seq;
 	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 
 	if (retry && console_trylock())
@@ -2732,9 +2741,9 @@ void register_console(struct console *newcon)
 		 * for us.
 		 */
 		raw_spin_lock_irqsave(&logbuf_lock, flags);
-		console_seq = syslog_seq;
-		console_idx = syslog_idx;
-		console_prev = syslog_prev;
+		lcb->console_seq = lcb->syslog_seq;
+		lcb->console_idx = lcb->syslog_idx;
+		lcb->console_prev = lcb->syslog_prev;
 		raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 		/*
 		 * We're about to replay the log buffer.  Only do this to the
@@ -3029,10 +3038,10 @@ void kmsg_dump(enum kmsg_dump_reason reason)
 		dumper->active = true;
 
 		raw_spin_lock_irqsave(&logbuf_lock, flags);
-		dumper->cur_seq = clear_seq;
-		dumper->cur_idx = clear_idx;
-		dumper->next_seq = log_next_seq;
-		dumper->next_idx = log_next_idx;
+		dumper->cur_seq = lcb->clear_seq;
+		dumper->cur_idx = lcb->clear_idx;
+		dumper->next_seq = lcb->log_next_seq;
+		dumper->next_idx = lcb->log_next_idx;
 		raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 
 		/* invoke dumper which will iterate over records */
@@ -3073,14 +3082,14 @@ bool kmsg_dump_get_line_nolock(struct kmsg_dumper *dumper, bool syslog,
 	if (!dumper->active)
 		goto out;
 
-	if (dumper->cur_seq < log_first_seq) {
+	if (dumper->cur_seq < lcb->log_first_seq) {
 		/* messages are gone, move to first available one */
-		dumper->cur_seq = log_first_seq;
-		dumper->cur_idx = log_first_idx;
+		dumper->cur_seq = lcb->log_first_seq;
+		dumper->cur_idx = lcb->log_first_idx;
 	}
 
 	/* last entry */
-	if (dumper->cur_seq >= log_next_seq)
+	if (dumper->cur_seq >= lcb->log_next_seq)
 		goto out;
 
 	msg = log_from_idx(dumper->cur_idx);
@@ -3161,10 +3170,10 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 		goto out;
 
 	raw_spin_lock_irqsave(&logbuf_lock, flags);
-	if (dumper->cur_seq < log_first_seq) {
+	if (dumper->cur_seq < lcb->log_first_seq) {
 		/* messages are gone, move to first available one */
-		dumper->cur_seq = log_first_seq;
-		dumper->cur_idx = log_first_idx;
+		dumper->cur_seq = lcb->log_first_seq;
+		dumper->cur_idx = lcb->log_first_idx;
 	}
 
 	/* last entry */
@@ -3236,10 +3245,10 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer);
  */
 void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper)
 {
-	dumper->cur_seq = clear_seq;
-	dumper->cur_idx = clear_idx;
-	dumper->next_seq = log_next_seq;
-	dumper->next_idx = log_next_idx;
+	dumper->cur_seq = lcb->clear_seq;
+	dumper->cur_idx = lcb->clear_idx;
+	dumper->next_seq = lcb->log_next_seq;
+	dumper->next_idx = lcb->log_next_idx;
 }
 
 /**
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC PATCH v1 2/2] printk: external log buffer (CONFIG_LOGBUFFER)
  2016-09-30  0:55 [RFC PATCH v1 0/2] printk: Shared kernel logging Sean Hudson
  2016-09-30  0:55 ` [RFC PATCH v1 1/2] printk: collect offsets into replaceable structure Sean Hudson
@ 2016-09-30  0:55 ` Sean Hudson
  2016-09-30 12:39   ` Petr Mladek
  2016-09-30  1:36 ` [RFC PATCH v1 0/2] printk: Shared kernel logging Kees Cook
  2016-10-04 21:08 ` [RFC PATCH v2 " Sean Hudson
  3 siblings, 1 reply; 16+ messages in thread
From: Sean Hudson @ 2016-09-30  0:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sean Hudson, Andrew Morton, Thomas Gleixner, Yang Shi,
	Ingo Molnar, Prarit Bhargava, Rasmus Villemoes, Paul E. McKenney,
	Yaowei Bai, Kees Cook, Andrey Ryabinin, Johannes Weiner,
	Tejun Heo, Arnd Bergmann, Ard Biesheuvel, Thomas Garnier,
	Petr Mladek, Nicolas Pitre, Sergey Senozhatsky, Borislav Petkov,
	Tim Bird, Steven Rostedt, Grant Likely

This debug feature provides a convenient way to collect log entries across
multiple, warmboot cycles and to share those entries with a boot loader.
It allows the kernel to use an external buffer for kernel log messages and
is controlled by an optional command line parameter. The buffer can contain
existing log messages from previous boot cycles and/or the bootloader. The
command line parameter was chosen for flexibility, cross arch portability,
and the ability to dynamically enable/disable this feature. The parameter
specifies the address of a control block used to replace the default log
buffer.  Existing bootloader and kernel log messages are kept, in order,
inside the new buffer.  After a boot that preserves the buffer contents, a
bootloader can display both kernel and bootloader log entries from
multiple, previous boots. It also allows the kernel to display bootloader
log entries along with its own messages.

This feature is intended for debug purposes and has no effect unless the
command line parameter is specified.  Further, it validates the passed
control block carefully and if any checks fail, it falls back to the
default behaviour.  As such, it can be left enabled by default.

Memory Reservation
------------------
This feature expects the bootloader to reserve/preserve the shared buffer
memory. This reservation needs to prevent the kernel from overwriting the
external log control block and log entries. In my testing, I've used the
'fdt' commands in uboot to dynamically inject reserved memory regions via
the DT to the kernel.

Based on the initial work of Wolfgang Denk and Igor Lisitsin [1].
Also based on work by: Alexander Streit <Alexander.Streit@continental-corporation.com>

[1] http://git.denx.de/?p=linux-2.6-denx.git;a=commitdiff;h=212f61c7fd3b952a81d1459dd32a86a32ddfd4ce

Signed-off-by: Sean Hudson <sean_hudson@mentor.com>
---
 init/Kconfig           |  12 +++
 init/main.c            |   2 +
 kernel/printk/printk.c | 267 +++++++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 275 insertions(+), 6 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index cac3f09..746183b 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1493,6 +1493,18 @@ config PRINTK
 	  very difficult to diagnose system problems, saying N here is
 	  strongly discouraged.
 
+config LOGBUFFER
+	bool "External logbuffer" if PRINTK
+	default n
+	depends on PRINTK
+	help
+	  This option enables support for an alternative, "external" printk log
+	  buffer. If memory contents are preserved, e.g. after a warmboot, this
+	  provides a known location for the boot loader to read and display printk
+	  entries from the kernel.  If desired, the bootloader can write its own
+	  log entries which the kernel will display with its own log entries.
+	  Further, this capability can be used across multiple warmboot cycles.
+
 config PRINTK_NMI
 	def_bool y
 	depends on PRINTK
diff --git a/init/main.c b/init/main.c
index a8a58e2..4a5913c 100644
--- a/init/main.c
+++ b/init/main.c
@@ -93,6 +93,7 @@ static int kernel_init(void *);
 extern void init_IRQ(void);
 extern void fork_init(void);
 extern void radix_tree_init(void);
+extern void setup_ext_logbuff(void);
 
 /*
  * Debug helper: via this flag we know that we are in 'early bootup code'
@@ -535,6 +536,7 @@ asmlinkage __visible void __init start_kernel(void)
 	sort_main_extable();
 	trap_init();
 	mm_init();
+	setup_ext_logbuff();
 
 	/*
 	 * Set up the scheduler prior starting any interrupts (such as the
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 7a441f5..017b4d4 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -331,17 +331,17 @@ enum log_flags {
 };
 
 struct printk_log {
-	u64 ts_nsec;		/* timestamp in nanoseconds */
+#ifdef CONFIG_LOGBUFFER
+	u32 log_magic;		/* sanity check number */
+#endif
 	u16 len;		/* length of entire record */
 	u16 text_len;		/* length of text buffer */
 	u16 dict_len;		/* length of dictionary buffer */
 	u8 facility;		/* syslog facility */
 	u8 flags:5;		/* internal record flags */
 	u8 level:3;		/* syslog level */
+	u64 ts_nsec;		/* timestamp in nanoseconds */
 }
-#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
-__packed __aligned(4)
-#endif
 ;
 
 /*
@@ -366,7 +366,14 @@ DECLARE_WAIT_QUEUE_HEAD(log_wait);
 static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
 
 /*
- * This control block collects tracking offsets for the log.
+ * This control block collects tracking offsets for the log into a single
+ * place.  It also facilitates pointing the log to another location, and,
+ * when combined with the CONFIG_LOGBUFFER feature, it allows log sharing
+ * between the bootloader and the kernel.
+ *
+ * NOTE:
+ *   By convention, the control block and the log buffer are contiguous.
+ *   This requirement can be relaxed, if desired.
  */
 struct lcb_t {
 	/* Pointer to log buffer space and length of space */
@@ -395,13 +402,42 @@ struct lcb_t {
 	/* the next printk record to read after the last 'clear' command */
 	u64 clear_seq;
 	u32 clear_idx;
-} lcb_t;
+
+#ifdef CONFIG_LOGBUFFER
+	u32 log_version;
+	u32 lcb_padded_len;
+	u32 lcb_size;
+	u32 log_hdr_size;
+	phys_addr_t log_phys_addr;
+	u32 lcb_magic;
+#endif
+};
+
+/* Log control block */
+#ifdef CONFIG_LOGBUFFER
+
+/*
+ * The bootloader may write an unexpected version/format of the log, so
+ * these values provide a way to check that the format agrees.
+ */
+#define LOGBUFF_MAGIC			0xc0de4ced
+#define LOGBUFF_LOG_VERSION		3
+#define LOGBUFF_CB_PADDED_LENGTH	1024
 
 static struct lcb_t __lcb = {
+	__log_buf, __LOG_BUF_LEN, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+	LOGBUFF_LOG_VERSION, LOGBUFF_CB_PADDED_LENGTH, sizeof(struct lcb_t),
+	sizeof(struct printk_log), 0, LOGBUFF_MAGIC
+};
+#else
+static struct lcb_t __lcb = {
 	__log_buf, __LOG_BUF_LEN, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
+#endif /* CONFIG_LOGBUFFER */
 
 static struct lcb_t *lcb = &__lcb;
 
+void setup_ext_logbuff(void);
+
 /* Return log buffer address */
 char *log_buf_addr_get(void)
 {
@@ -596,6 +632,9 @@ static int log_store(int facility, int level,
 		msg->ts_nsec = local_clock();
 	memset(log_dict(msg) + dict_len, 0, pad_len);
 	msg->len = size;
+#ifdef CONFIG_LOGBUFFER
+	msg->log_magic = LOGBUFF_MAGIC;
+#endif
 
 	/* insert message */
 	lcb->log_next_idx += msg->len;
@@ -1070,6 +1109,21 @@ static void __init log_buf_add_cpu(void)
 static inline void log_buf_add_cpu(void) {}
 #endif /* CONFIG_SMP */
 
+#ifdef CONFIG_LOGBUFFER
+/* Bootloader log control block address passed in on the kernel cmdline */
+phys_addr_t lcb_phys_addr __initdata;
+
+/* save requested lcb_phys_addr since it's too early to process it */
+static int __init lcb_phys_addr_setup(char *str)
+{
+	lcb_phys_addr = memparse(str, &str);
+
+	return 0;
+}
+
+early_param("lcb_phys_addr", lcb_phys_addr_setup);
+#endif
+
 void __init setup_log_buf(int early)
 {
 	unsigned long flags;
@@ -3328,4 +3382,205 @@ void show_regs_print_info(const char *log_lvl)
 	       log_lvl, current, task_stack_page(current));
 }
 
+#ifdef CONFIG_LOGBUFFER
+
+static struct printk_log *log_from_idx_and_base(const char *const base,
+						const u32 idx)
+{
+	struct printk_log *msg = (struct printk_log *)(base + idx);
+
+	if (!msg->len || msg->log_magic != LOGBUFF_MAGIC)
+		return (struct printk_log *)base;
+	return msg;
+}
+
+static u32 log_next_idx_from_idx(const char *const base, const u32 idx)
+{
+	struct printk_log const *const msg = (struct printk_log *)(base + idx);
+
+	if (!msg->len || msg->log_magic != LOGBUFF_MAGIC)
+		return 0;
+	return idx + msg->len;
+}
+
+/*
+ * The bootloader may write a different version of the log than expected.
+ * So, check that the incoming lcb matches that values that the kernel
+ * expects.
+ */
+static bool is_valid_external_cb(const struct lcb_t * const new_lcb)
+{
+	/* Validate the new control buffer */
+	return (new_lcb &&
+		new_lcb->log_version == LOGBUFF_LOG_VERSION &&
+		new_lcb->lcb_magic == LOGBUFF_MAGIC &&
+		new_lcb->lcb_padded_len == LOGBUFF_CB_PADDED_LENGTH &&
+		new_lcb->lcb_size == sizeof(struct lcb_t) &&
+		new_lcb->log_hdr_size == sizeof(struct printk_log) &&
+		new_lcb->lcb_size <= new_lcb->lcb_padded_len &&
+		new_lcb->log_first_idx <= new_lcb->log_buf_len &&
+		new_lcb->log_next_idx <= new_lcb->log_buf_len &&
+		new_lcb->syslog_idx <= new_lcb->log_buf_len &&
+		new_lcb->console_idx <= new_lcb->log_buf_len &&
+		new_lcb->clear_idx <= new_lcb->log_buf_len &&
+		new_lcb->log_phys_addr != 0);
+}
+
+/*
+ * Copy the kernel log entries that have occurred so far in this boot cycle
+ * into the new log buffer location found in the log control block specified
+ * on the kernel command line. Then switch to using that location from this
+ * point onward.
+ * To ensure correctness, we must consider :
+ *  - the bootloader may be using a different structure
+ *  - size of the external buffer may not match the size of the default
+ *  - an unknown number of bootloader log entries may be present
+ *  - an unknown number of kernel log entries may be present
+ *  - wrapping of the bootloader log may have already occurred (unlikely)
+ *  - wrapping of the kernel log may have already occurred (unlikely)
+ *
+ * Memory Reservation
+ * ------------------
+ *  This feature expects the bootloader to reserve/preserve the shared buffer
+ *  memory. This reservation needs to prevent the kernel from overwriting the
+ *  external log control block and log entries. In my testing, I've used the
+ *  'fdt' commands in uboot to dynamically inject reserved memory regions via
+ *  the DT to the kernel.
+ *
+ */
+void __init setup_ext_logbuff(void)
+{
+	struct printk_log *nextblmsg, *currkmsg;
+	u64 currkmsg_seq;
+	u32 currkmsg_idx, offset_idx;
+	struct lcb_t *new_lcb;
+	unsigned long flags;
+	int ret;
+
+	if (!lcb_phys_addr)
+		return;
+
+	new_lcb = phys_to_virt(lcb_phys_addr);
+	new_lcb->log_buf = phys_to_virt(new_lcb->log_phys_addr);
+
+	/* Validate the new control buffer */
+	if (!is_valid_external_cb(new_lcb)) {
+		pr_warn("log : External log cb failed validation. Ignored\n");
+		pr_debug
+		    ("Kernel logbuffer of size %u at 0x%p (virtual address)\n",
+		     lcb->log_buf_len, lcb->log_buf);
+		return;
+	}
+
+	if (!new_log_buf_len)
+		pr_warn("log_buf_len ignored in favor of external log\n");
+
+	pr_debug
+	    ("log : Replacing cb @ 0x%p of %lu bytes with cb @ 0x%p of %d/%d bytes\n",
+	     lcb, sizeof(struct lcb_t), new_lcb, new_lcb->lcb_size,
+	     new_lcb->lcb_padded_len);
+
+	/* Capture the virtual address for the buffer */
+	new_lcb->log_buf = phys_to_virt(new_lcb->log_phys_addr);
+	pr_debug("log : Replacing buffer @ 0x%p (VA) of %d bytes with one @ 0x%p (VA) of %d bytes\n",
+		 lcb->log_buf, lcb->log_buf_len, new_lcb->log_buf,
+		 new_lcb->log_buf_len);
+
+	raw_spin_lock_irqsave(&logbuf_lock, flags);
+
+	/*
+	 * Offset sequence numbers by the number of existing bootloader msgs
+	 * NOTE: Functions above handle if the first seq becomes > than these
+	 */
+	if (lcb->syslog_seq != new_lcb->log_first_seq)
+		new_lcb->syslog_seq = lcb->syslog_seq + new_lcb->log_next_seq;
+	if (lcb->console_seq != new_lcb->log_first_seq)
+		new_lcb->console_seq = lcb->console_seq + new_lcb->log_next_seq;
+	if (lcb->clear_seq != new_lcb->log_first_seq)
+		new_lcb->clear_seq = lcb->clear_seq + new_lcb->log_next_seq;
+
+	/* Copy kernel specific data into the new/shared control block. */
+	new_lcb->syslog_prev = lcb->syslog_prev;
+	new_lcb->syslog_partial = lcb->syslog_partial;
+	new_lcb->console_prev = lcb->console_prev;
+
+	/*
+	 * Copy existing entries one at a time into the external buffer space.
+	 * The new CB will incrementally update as each kernel entry is added.
+	 * The existing CB will not be modified.
+	 */
+	for (currkmsg_seq = lcb->log_first_seq,
+	     currkmsg_idx = lcb->log_first_idx;
+	     currkmsg_seq <= lcb->log_next_seq &&
+	     currkmsg_idx < lcb->log_buf_len;
+	     currkmsg_idx = log_next(currkmsg_idx), currkmsg_seq++) {
+		currkmsg = log_from_idx_and_base(lcb->log_buf, currkmsg_idx);
+
+		/* If necessary, free bl entries to fit kernel entries */
+		while (new_lcb->log_first_seq < new_lcb->log_next_seq) {
+			u32 free;
+
+			if (new_lcb->log_next_idx > new_lcb->log_first_idx)
+				free = max(new_lcb->log_buf_len -
+					   new_lcb->log_next_idx,
+					   new_lcb->log_first_idx);
+			else
+				free = new_lcb->log_first_idx -
+				    new_lcb->log_next_idx;
+
+			if (free > currkmsg->len + sizeof(struct printk_log))
+				break;
+
+			/* Drop oldest messages until there is enough space */
+			new_lcb->log_first_idx =
+			    log_next_idx_from_idx(new_lcb->log_buf,
+						  new_lcb->log_first_idx);
+			new_lcb->log_first_seq++;
+		}
+
+		nextblmsg = (struct printk_log *)
+		    (new_lcb->log_buf + new_lcb->log_next_idx);
+
+		/* Check if the next record will fit along with a wrap record */
+		if (new_lcb->log_next_idx + currkmsg->len +
+		    sizeof(struct printk_log) >= new_lcb->log_buf_len) {
+			memset(nextblmsg, 0, sizeof(struct printk_log));
+			nextblmsg = (struct printk_log *)new_lcb->log_buf;
+			new_lcb->log_next_idx = 0;
+		}
+		/* Copy kernel message to bootloader buffer */
+		memcpy(nextblmsg, currkmsg, currkmsg->len);
+
+		/* Before moving to the next kernel entry, adjust indices */
+		if (currkmsg_seq == lcb->log_first_seq)
+			offset_idx = new_lcb->log_first_seq;
+		else
+			offset_idx = new_lcb->log_next_idx;
+
+		if (currkmsg_seq == lcb->syslog_seq)
+			new_lcb->syslog_idx = offset_idx;
+		if (currkmsg_seq == lcb->console_seq)
+			new_lcb->console_idx = offset_idx;
+		if (currkmsg_seq == lcb->clear_seq)
+			new_lcb->clear_idx = offset_idx;
+
+		/* Update next offset and sequence */
+		if (currkmsg_seq < lcb->log_next_seq) {
+			new_lcb->log_next_idx += currkmsg->len;
+			new_lcb->log_next_seq++;
+		}
+	}
+
+	/* Overwrite the default lcb pointer, buffer pointer, and length. */
+	lcb = new_lcb;
+	lcb->log_buf = (char *)new_lcb->log_buf;
+	lcb->log_buf_len = new_lcb->log_buf_len;
+
+	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+
+	pr_info("Kernel logbuffer of size %u at 0x%p (virtual address)\n",
+		lcb->log_buf_len, lcb->log_buf);
+}
+#endif
+
 #endif
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v1 0/2] printk: Shared kernel logging
  2016-09-30  0:55 [RFC PATCH v1 0/2] printk: Shared kernel logging Sean Hudson
  2016-09-30  0:55 ` [RFC PATCH v1 1/2] printk: collect offsets into replaceable structure Sean Hudson
  2016-09-30  0:55 ` [RFC PATCH v1 2/2] printk: external log buffer (CONFIG_LOGBUFFER) Sean Hudson
@ 2016-09-30  1:36 ` Kees Cook
  2016-10-01  1:56   ` Sean Hudson
  2016-10-04 21:08 ` [RFC PATCH v2 " Sean Hudson
  3 siblings, 1 reply; 16+ messages in thread
From: Kees Cook @ 2016-09-30  1:36 UTC (permalink / raw)
  To: Sean Hudson
  Cc: LKML, Andrew Morton, Thomas Gleixner, Yang Shi, Ingo Molnar,
	Prarit Bhargava, Rasmus Villemoes, Paul E. McKenney, Yaowei Bai,
	Andrey Ryabinin, Johannes Weiner, Tejun Heo, Arnd Bergmann,
	Ard Biesheuvel, Thomas Garnier, Petr Mladek, Nicolas Pitre,
	Sergey Senozhatsky, Borislav Petkov, Tim Bird, Steven Rostedt,
	Grant Likely

On Thu, Sep 29, 2016 at 5:55 PM, Sean Hudson <sean_hudson@mentor.com> wrote:
> This patch set is based on Linus' v4.8-rc8 tag.
>
> This debug feature allows the kernel to use an external buffer and control
> block for kernel log messages. The feature is controlled by an optional
> command line parameter. The existing buffer and control block can contain
> existing log messages from previous boot cycles and/or the bootloader. The
> command line parameter was chosen for flexibility, cross arch portability,
> and the ability to dynamically enable/disable this feature. The parameter
> specifies the address of a control block used to replace the default log
> buffer. Existing bootloader and kernel log messages are kept, in order,
> inside the new buffer. After a boot that preserves the buffer contents, a
> bootloader can display both kernel and bootloader log entries from multiple,
> previous boots. It also allows the kernel to display bootloader log entries
> along with its own messages.
>
> This feature is intended for debug purposes and has no effect unless the
> command line parameter is specified. Further, it validates the passed
> control block carefully and if any checks fail, it falls back to the default
> behaviour. As such, it can be left enabled by default.
>
> Memory Reservation
>
> This feature expects the bootloader to reserve/preserve the shared buffer
> memory. This reservation needs to prevent the kernel from overwriting the
> external log control block and log entries. In my testing, I've used the
> 'fdt' commands in uboot to dynamically inject reserved memory regions via
> the DT to the kernel.

Interesting! I wonder if this can be adjusted to incorporate the
existing console logging feature in the pstore which does a similar
thing? Though pstore doesn't know about bootloader logs, really, it's
just storing kernel logs in a ring buffer. Maybe this can provide a
backend to pstore or something, especially since pstore initialization
happens "too late" for this to really be very sensible. It just seems
like it'd be nice to have a single persistent console memory region...

-Kees

>
> Sean Hudson (2):
>   printk: collect offsets into replaceable structure
>   printk: external log buffer (CONFIG_LOGBUFFER)
>
>  init/Kconfig           |  12 +
>  init/main.c            |   2 +
>  kernel/printk/printk.c | 598 +++++++++++++++++++++++++++++++++++--------------
>  3 files changed, 445 insertions(+), 167 deletions(-)
>
> --
> 1.9.1
>
>



-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v1 2/2] printk: external log buffer (CONFIG_LOGBUFFER)
  2016-09-30  0:55 ` [RFC PATCH v1 2/2] printk: external log buffer (CONFIG_LOGBUFFER) Sean Hudson
@ 2016-09-30 12:39   ` Petr Mladek
  2016-10-01  4:06     ` Sean Hudson
  0 siblings, 1 reply; 16+ messages in thread
From: Petr Mladek @ 2016-09-30 12:39 UTC (permalink / raw)
  To: Sean Hudson
  Cc: linux-kernel, Andrew Morton, Thomas Gleixner, Yang Shi,
	Ingo Molnar, Prarit Bhargava, Rasmus Villemoes, Paul E. McKenney,
	Yaowei Bai, Kees Cook, Andrey Ryabinin, Johannes Weiner,
	Tejun Heo, Arnd Bergmann, Ard Biesheuvel, Thomas Garnier,
	Nicolas Pitre, Sergey Senozhatsky, Borislav Petkov, Tim Bird,
	Steven Rostedt, Grant Likely

On Thu 2016-09-29 19:55:56, Sean Hudson wrote:
> This debug feature provides a convenient way to collect log entries across
> multiple, warmboot cycles and to share those entries with a boot loader.
> It allows the kernel to use an external buffer for kernel log messages and
> is controlled by an optional command line parameter. The buffer can contain
> existing log messages from previous boot cycles and/or the bootloader. The
> command line parameter was chosen for flexibility, cross arch portability,
> and the ability to dynamically enable/disable this feature. The parameter
> specifies the address of a control block used to replace the default log
> buffer.  Existing bootloader and kernel log messages are kept, in order,
> inside the new buffer.  After a boot that preserves the buffer contents, a
> bootloader can display both kernel and bootloader log entries from
> multiple, previous boots. It also allows the kernel to display bootloader
> log entries along with its own messages.
> 
> This feature is intended for debug purposes and has no effect unless the
> command line parameter is specified.  Further, it validates the passed
> control block carefully and if any checks fail, it falls back to the
> default behaviour.  As such, it can be left enabled by default.
> 
> Memory Reservation
> ------------------
> This feature expects the bootloader to reserve/preserve the shared buffer
> memory. This reservation needs to prevent the kernel from overwriting the
> external log control block and log entries. In my testing, I've used the
> 'fdt' commands in uboot to dynamically inject reserved memory regions via
> the DT to the kernel.
> 
> Based on the initial work of Wolfgang Denk and Igor Lisitsin [1].
> Also based on work by: Alexander Streit <Alexander.Streit@continental-corporation.com>
> 
> [1] http://git.denx.de/?p=linux-2.6-denx.git;a=commitdiff;h=212f61c7fd3b952a81d1459dd32a86a32ddfd4ce
> 
> Signed-off-by: Sean Hudson <sean_hudson@mentor.com>
> ---
>  init/Kconfig           |  12 +++
>  init/main.c            |   2 +
>  kernel/printk/printk.c | 267 +++++++++++++++++++++++++++++++++++++++++++++++--
>  3 files changed, 275 insertions(+), 6 deletions(-)
> 
> diff --git a/init/Kconfig b/init/Kconfig
> index cac3f09..746183b 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1493,6 +1493,18 @@ config PRINTK
>  	  very difficult to diagnose system problems, saying N here is
>  	  strongly discouraged.
>  
> +config LOGBUFFER
> +	bool "External logbuffer" if PRINTK
> +	default n
> +	depends on PRINTK
> +	help
> +	  This option enables support for an alternative, "external" printk log
> +	  buffer. If memory contents are preserved, e.g. after a warmboot, this
> +	  provides a known location for the boot loader to read and display printk
> +	  entries from the kernel.  If desired, the bootloader can write its own
> +	  log entries which the kernel will display with its own log entries.
> +	  Further, this capability can be used across multiple warmboot cycles.
> +
>  config PRINTK_NMI
>  	def_bool y
>  	depends on PRINTK
> diff --git a/init/main.c b/init/main.c
> index a8a58e2..4a5913c 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -93,6 +93,7 @@ static int kernel_init(void *);
>  extern void init_IRQ(void);
>  extern void fork_init(void);
>  extern void radix_tree_init(void);
> +extern void setup_ext_logbuff(void);
>  
>  /*
>   * Debug helper: via this flag we know that we are in 'early bootup code'
> @@ -535,6 +536,7 @@ asmlinkage __visible void __init start_kernel(void)
>  	sort_main_extable();
>  	trap_init();
>  	mm_init();
> +	setup_ext_logbuff();
>  
>  	/*
>  	 * Set up the scheduler prior starting any interrupts (such as the
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 7a441f5..017b4d4 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -395,13 +402,42 @@ struct lcb_t {
> +/*
> + * The bootloader may write an unexpected version/format of the log, so
> + * these values provide a way to check that the format agrees.
> + */
> +#define LOGBUFF_MAGIC			0xc0de4ced
> +#define LOGBUFF_LOG_VERSION		3
> +#define LOGBUFF_CB_PADDED_LENGTH	1024
>  
>  static struct lcb_t __lcb = {
> +	__log_buf, __LOG_BUF_LEN, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> +	LOGBUFF_LOG_VERSION, LOGBUFF_CB_PADDED_LENGTH, sizeof(struct lcb_t),
> +	sizeof(struct printk_log), 0, LOGBUFF_MAGIC
> +};
> +#else
> +static struct lcb_t __lcb = {
>  	__log_buf, __LOG_BUF_LEN, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
> +#endif /* CONFIG_LOGBUFFER */
>  
>  static struct lcb_t *lcb = &__lcb;
>  
> +void setup_ext_logbuff(void);
> +
>  /* Return log buffer address */
>  char *log_buf_addr_get(void)
>  {
> +/*
> + * The bootloader may write a different version of the log than expected.
> + * So, check that the incoming lcb matches that values that the kernel
> + * expects.
> + */
> +static bool is_valid_external_cb(const struct lcb_t * const new_lcb)
> +{
> +	/* Validate the new control buffer */
> +	return (new_lcb &&
> +		new_lcb->log_version == LOGBUFF_LOG_VERSION &&
> +		new_lcb->lcb_magic == LOGBUFF_MAGIC &&

The bootloader might be able to write the messages in different
formats. How would it know which one will be the right one
for the given kernel, please?

What are the plans? Do you want to add and maintain the compatibility
in U-boot or also in other bootloaders?

BTW: I always wanted to put all these infornation into a struct.
Also I wanted to rename struct printk_log to a less confusing
name, e.g. printk_msg. The problem is that these structures
are used also by external aplications, .e.g crash, makedumpfine
that need to be updated as well. It is a pain. So, I am a bit
scared to add even more external applications that depend
on these structures.


> +		new_lcb->lcb_padded_len == LOGBUFF_CB_PADDED_LENGTH &&
> +		new_lcb->lcb_size == sizeof(struct lcb_t) &&
> +		new_lcb->log_hdr_size == sizeof(struct printk_log) &&
> +		new_lcb->lcb_size <= new_lcb->lcb_padded_len &&
> +		new_lcb->log_first_idx <= new_lcb->log_buf_len &&
> +		new_lcb->log_next_idx <= new_lcb->log_buf_len &&
> +		new_lcb->syslog_idx <= new_lcb->log_buf_len &&
> +		new_lcb->console_idx <= new_lcb->log_buf_len &&
> +		new_lcb->clear_idx <= new_lcb->log_buf_len &&
> +		new_lcb->log_phys_addr != 0);
> +}
> +
> +/*
> + * Copy the kernel log entries that have occurred so far in this boot cycle
> + * into the new log buffer location found in the log control block specified
> + * on the kernel command line. Then switch to using that location from this
> + * point onward.
> + * To ensure correctness, we must consider :
> + *  - the bootloader may be using a different structure
> + *  - size of the external buffer may not match the size of the default
> + *  - an unknown number of bootloader log entries may be present
> + *  - an unknown number of kernel log entries may be present
> + *  - wrapping of the bootloader log may have already occurred (unlikely)
> + *  - wrapping of the kernel log may have already occurred (unlikely)
> + *
> + * Memory Reservation
> + * ------------------
> + *  This feature expects the bootloader to reserve/preserve the shared buffer
> + *  memory. This reservation needs to prevent the kernel from overwriting the
> + *  external log control block and log entries. In my testing, I've used the
> + *  'fdt' commands in uboot to dynamically inject reserved memory regions via
> + *  the DT to the kernel.
> + *
> + */
> +void __init setup_ext_logbuff(void)
> +{
> +	struct printk_log *nextblmsg, *currkmsg;
> +	u64 currkmsg_seq;
> +	u32 currkmsg_idx, offset_idx;
> +	struct lcb_t *new_lcb;
> +	unsigned long flags;
> +	int ret;
> +
> +	if (!lcb_phys_addr)
> +		return;
> +
> +	new_lcb = phys_to_virt(lcb_phys_addr);
> +	new_lcb->log_buf = phys_to_virt(new_lcb->log_phys_addr);
> +
> +	/* Validate the new control buffer */
> +	if (!is_valid_external_cb(new_lcb)) {
> +		pr_warn("log : External log cb failed validation. Ignored\n");
> +		pr_debug
> +		    ("Kernel logbuffer of size %u at 0x%p (virtual address)\n",
> +		     lcb->log_buf_len, lcb->log_buf);
> +		return;
> +	}
> +
> +	if (!new_log_buf_len)
> +		pr_warn("log_buf_len ignored in favor of external log\n");
> +
> +	pr_debug
> +	    ("log : Replacing cb @ 0x%p of %lu bytes with cb @ 0x%p of %d/%d bytes\n",
> +	     lcb, sizeof(struct lcb_t), new_lcb, new_lcb->lcb_size,
> +	     new_lcb->lcb_padded_len);
> +
> +	/* Capture the virtual address for the buffer */
> +	new_lcb->log_buf = phys_to_virt(new_lcb->log_phys_addr);
> +	pr_debug("log : Replacing buffer @ 0x%p (VA) of %d bytes with one @ 0x%p (VA) of %d bytes\n",
> +		 lcb->log_buf, lcb->log_buf_len, new_lcb->log_buf,
> +		 new_lcb->log_buf_len);
> +
> +	raw_spin_lock_irqsave(&logbuf_lock, flags);
> +
> +	/*
> +	 * Offset sequence numbers by the number of existing bootloader msgs
> +	 * NOTE: Functions above handle if the first seq becomes > than these
> +	 */
> +	if (lcb->syslog_seq != new_lcb->log_first_seq)
> +		new_lcb->syslog_seq = lcb->syslog_seq + new_lcb->log_next_seq;
> +	if (lcb->console_seq != new_lcb->log_first_seq)
> +		new_lcb->console_seq = lcb->console_seq + new_lcb->log_next_seq;
> +	if (lcb->clear_seq != new_lcb->log_first_seq)
> +		new_lcb->clear_seq = lcb->clear_seq + new_lcb->log_next_seq;
> +
> +	/* Copy kernel specific data into the new/shared control block. */
> +	new_lcb->syslog_prev = lcb->syslog_prev;
> +	new_lcb->syslog_partial = lcb->syslog_partial;
> +	new_lcb->console_prev = lcb->console_prev;
> +
> +	/*
> +	 * Copy existing entries one at a time into the external buffer space.
> +	 * The new CB will incrementally update as each kernel entry is added.
> +	 * The existing CB will not be modified.
> +	 */
> +	for (currkmsg_seq = lcb->log_first_seq,
> +	     currkmsg_idx = lcb->log_first_idx;
> +	     currkmsg_seq <= lcb->log_next_seq &&
> +	     currkmsg_idx < lcb->log_buf_len;
> +	     currkmsg_idx = log_next(currkmsg_idx), currkmsg_seq++) {
> +		currkmsg = log_from_idx_and_base(lcb->log_buf, currkmsg_idx);
> +
> +		/* If necessary, free bl entries to fit kernel entries */
> +		while (new_lcb->log_first_seq < new_lcb->log_next_seq) {
> +			u32 free;
> +
> +			if (new_lcb->log_next_idx > new_lcb->log_first_idx)
> +				free = max(new_lcb->log_buf_len -
> +					   new_lcb->log_next_idx,
> +					   new_lcb->log_first_idx);
> +			else
> +				free = new_lcb->log_first_idx -
> +				    new_lcb->log_next_idx;
> +
> +			if (free > currkmsg->len + sizeof(struct printk_log))
> +				break;
> +
> +			/* Drop oldest messages until there is enough space */
> +			new_lcb->log_first_idx =
> +			    log_next_idx_from_idx(new_lcb->log_buf,
> +						  new_lcb->log_first_idx);
> +			new_lcb->log_first_seq++;
> +		}
> +
> +		nextblmsg = (struct printk_log *)
> +		    (new_lcb->log_buf + new_lcb->log_next_idx);
> +
> +		/* Check if the next record will fit along with a wrap record */
> +		if (new_lcb->log_next_idx + currkmsg->len +
> +		    sizeof(struct printk_log) >= new_lcb->log_buf_len) {
> +			memset(nextblmsg, 0, sizeof(struct printk_log));
> +			nextblmsg = (struct printk_log *)new_lcb->log_buf;
> +			new_lcb->log_next_idx = 0;
> +		}
> +		/* Copy kernel message to bootloader buffer */
> +		memcpy(nextblmsg, currkmsg, currkmsg->len);
> +
> +		/* Before moving to the next kernel entry, adjust indices */
> +		if (currkmsg_seq == lcb->log_first_seq)
> +			offset_idx = new_lcb->log_first_seq;
> +		else
> +			offset_idx = new_lcb->log_next_idx;
> +
> +		if (currkmsg_seq == lcb->syslog_seq)
> +			new_lcb->syslog_idx = offset_idx;
> +		if (currkmsg_seq == lcb->console_seq)
> +			new_lcb->console_idx = offset_idx;
> +		if (currkmsg_seq == lcb->clear_seq)
> +			new_lcb->clear_idx = offset_idx;
> +
> +		/* Update next offset and sequence */
> +		if (currkmsg_seq < lcb->log_next_seq) {
> +			new_lcb->log_next_idx += currkmsg->len;
> +			new_lcb->log_next_seq++;
> +		}
> +	}
> +
> +	/* Overwrite the default lcb pointer, buffer pointer, and length. */
> +	lcb = new_lcb;
> +	lcb->log_buf = (char *)new_lcb->log_buf;
> +	lcb->log_buf_len = new_lcb->log_buf_len;

Is the bootloader allowed to write to the shared buffer from this
point on? How will be synchronized the writes from the kernel and
the bootloader?

How protected will the address of the shared buffer? Would not
it open a security hole? We need to make sure that non-privileged
user or non-trusted application must not read sensitive data
from the log or break the structure of the log by writing
invalid data.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v1 0/2] printk: Shared kernel logging
  2016-09-30  1:36 ` [RFC PATCH v1 0/2] printk: Shared kernel logging Kees Cook
@ 2016-10-01  1:56   ` Sean Hudson
  2016-10-01  2:42     ` Kees Cook
  0 siblings, 1 reply; 16+ messages in thread
From: Sean Hudson @ 2016-10-01  1:56 UTC (permalink / raw)
  To: Kees Cook
  Cc: LKML, Andrew Morton, Thomas Gleixner, Yang Shi, Ingo Molnar,
	Prarit Bhargava, Rasmus Villemoes, Paul E. McKenney, Yaowei Bai,
	Andrey Ryabinin, Johannes Weiner, Tejun Heo, Arnd Bergmann,
	Ard Biesheuvel, Thomas Garnier, Petr Mladek, Nicolas Pitre,
	Sergey Senozhatsky, Borislav Petkov, Tim Bird, Steven Rostedt,
	Grant Likely


[-- Attachment #1.1: Type: text/plain, Size: 2442 bytes --]

On 9/29/2016 8:36 PM, Kees Cook wrote:
> On Thu, Sep 29, 2016 at 5:55 PM, Sean Hudson <sean_hudson@mentor.com>
> wrote:
>> This patch set is based on Linus' v4.8-rc8 tag.
>> 
>> This debug feature allows the kernel to use an external buffer and
>> control block for kernel log messages. The feature is controlled by
>> an optional command line parameter. The existing buffer and control
>> block can contain existing log messages from previous boot cycles
>> and/or the bootloader. The command line parameter was chosen for
>> flexibility, cross arch portability, and the ability to dynamically
>> enable/disable this feature. The parameter specifies the address of
>> a control block used to replace the default log buffer. Existing
>> bootloader and kernel log messages are kept, in order, inside the
>> new buffer. After a boot that preserves the buffer contents, a 
>> bootloader can display both kernel and bootloader log entries from
>> multiple, previous boots. It also allows the kernel to display
>> bootloader log entries along with its own messages.
>> 
>> This feature is intended for debug purposes and has no effect
>> unless the command line parameter is specified. Further, it
>> validates the passed control block carefully and if any checks
>> fail, it falls back to the default behaviour. As such, it can be
>> left enabled by default.
>> 
>> Memory Reservation
>> 
>> This feature expects the bootloader to reserve/preserve the shared
>> buffer memory. This reservation needs to prevent the kernel from
>> overwriting the external log control block and log entries. In my
>> testing, I've used the 'fdt' commands in uboot to dynamically
>> inject reserved memory regions via the DT to the kernel.
> 
> Interesting! I wonder if this can be adjusted to incorporate the 
> existing console logging feature in the pstore which does a similar 
> thing? Though pstore doesn't know about bootloader logs, really,
> it's just storing kernel logs in a ring buffer. Maybe this can
> provide a backend to pstore or something, especially since pstore
> initialization happens "too late" for this to really be very
> sensible. It just seems like it'd be nice to have a single persistent
> console memory region...
> 

I don't know that much about pstore.  From your description though, it
sounds feasible to put the two together at some point.  How arch
specific is pstore?

-- 
Sean


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v1 0/2] printk: Shared kernel logging
  2016-10-01  1:56   ` Sean Hudson
@ 2016-10-01  2:42     ` Kees Cook
  0 siblings, 0 replies; 16+ messages in thread
From: Kees Cook @ 2016-10-01  2:42 UTC (permalink / raw)
  To: Sean Hudson
  Cc: LKML, Andrew Morton, Thomas Gleixner, Yang Shi, Ingo Molnar,
	Prarit Bhargava, Rasmus Villemoes, Paul E. McKenney, Yaowei Bai,
	Andrey Ryabinin, Johannes Weiner, Tejun Heo, Arnd Bergmann,
	Ard Biesheuvel, Thomas Garnier, Petr Mladek, Nicolas Pitre,
	Sergey Senozhatsky, Borislav Petkov, Tim Bird, Steven Rostedt,
	Grant Likely

On Fri, Sep 30, 2016 at 6:56 PM, Sean Hudson <darknighte@darknighte.com> wrote:
> On 9/29/2016 8:36 PM, Kees Cook wrote:
>> On Thu, Sep 29, 2016 at 5:55 PM, Sean Hudson <sean_hudson@mentor.com>
>> wrote:
>>> This patch set is based on Linus' v4.8-rc8 tag.
>>>
>>> This debug feature allows the kernel to use an external buffer and
>>> control block for kernel log messages. The feature is controlled by
>>> an optional command line parameter. The existing buffer and control
>>> block can contain existing log messages from previous boot cycles
>>> and/or the bootloader. The command line parameter was chosen for
>>> flexibility, cross arch portability, and the ability to dynamically
>>> enable/disable this feature. The parameter specifies the address of
>>> a control block used to replace the default log buffer. Existing
>>> bootloader and kernel log messages are kept, in order, inside the
>>> new buffer. After a boot that preserves the buffer contents, a
>>> bootloader can display both kernel and bootloader log entries from
>>> multiple, previous boots. It also allows the kernel to display
>>> bootloader log entries along with its own messages.
>>>
>>> This feature is intended for debug purposes and has no effect
>>> unless the command line parameter is specified. Further, it
>>> validates the passed control block carefully and if any checks
>>> fail, it falls back to the default behaviour. As such, it can be
>>> left enabled by default.
>>>
>>> Memory Reservation
>>>
>>> This feature expects the bootloader to reserve/preserve the shared
>>> buffer memory. This reservation needs to prevent the kernel from
>>> overwriting the external log control block and log entries. In my
>>> testing, I've used the 'fdt' commands in uboot to dynamically
>>> inject reserved memory regions via the DT to the kernel.
>>
>> Interesting! I wonder if this can be adjusted to incorporate the
>> existing console logging feature in the pstore which does a similar
>> thing? Though pstore doesn't know about bootloader logs, really,
>> it's just storing kernel logs in a ring buffer. Maybe this can
>> provide a backend to pstore or something, especially since pstore
>> initialization happens "too late" for this to really be very
>> sensible. It just seems like it'd be nice to have a single persistent
>> console memory region...
>>
>
> I don't know that much about pstore.  From your description though, it
> sounds feasible to put the two together at some point.  How arch
> specific is pstore?

The only "weird" part is how to declare the reserved memory range.
That depends on either a platform driver or a Device Tree entry, but
isn't strictly "arch specific". Otherwise, console writes are
preserved in persistent memory as declared by the reservation info. It
doesnt, however, _load_ console from such a place.

-Kees

-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v1 2/2] printk: external log buffer (CONFIG_LOGBUFFER)
  2016-09-30 12:39   ` Petr Mladek
@ 2016-10-01  4:06     ` Sean Hudson
  2016-10-04 11:27       ` Petr Mladek
  0 siblings, 1 reply; 16+ messages in thread
From: Sean Hudson @ 2016-10-01  4:06 UTC (permalink / raw)
  To: Petr Mladek
  Cc: linux-kernel, Andrew Morton, Thomas Gleixner, Yang Shi,
	Ingo Molnar, Prarit Bhargava, Rasmus Villemoes, Paul E. McKenney,
	Yaowei Bai, Kees Cook, Andrey Ryabinin, Johannes Weiner,
	Tejun Heo, Arnd Bergmann, Ard Biesheuvel, Thomas Garnier,
	Nicolas Pitre, Sergey Senozhatsky, Borislav Petkov, Tim Bird,
	Steven Rostedt, Grant Likely

On 9/30/2016 7:39 AM, Petr Mladek wrote:
> On Thu 2016-09-29 19:55:56, Sean Hudson wrote:
>> This debug feature provides a convenient way to collect log entries across
>> multiple, warmboot cycles and to share those entries with a boot loader.
>> It allows the kernel to use an external buffer for kernel log messages and
>> is controlled by an optional command line parameter. The buffer can contain
>> existing log messages from previous boot cycles and/or the bootloader. The
>> command line parameter was chosen for flexibility, cross arch portability,
>> and the ability to dynamically enable/disable this feature. The parameter
>> specifies the address of a control block used to replace the default log
>> buffer.  Existing bootloader and kernel log messages are kept, in order,
>> inside the new buffer.  After a boot that preserves the buffer contents, a
>> bootloader can display both kernel and bootloader log entries from
>> multiple, previous boots. It also allows the kernel to display bootloader
>> log entries along with its own messages.
>>
>> This feature is intended for debug purposes and has no effect unless the
>> command line parameter is specified.  Further, it validates the passed
>> control block carefully and if any checks fail, it falls back to the
>> default behaviour.  As such, it can be left enabled by default.
>>
>> Memory Reservation
>> ------------------
>> This feature expects the bootloader to reserve/preserve the shared buffer
>> memory. This reservation needs to prevent the kernel from overwriting the
>> external log control block and log entries. In my testing, I've used the
>> 'fdt' commands in uboot to dynamically inject reserved memory regions via
>> the DT to the kernel.
>>
>> Based on the initial work of Wolfgang Denk and Igor Lisitsin [1].
>> Also based on work by: Alexander Streit <Alexander.Streit@continental-corporation.com>
>>
>> [1] http://git.denx.de/?p=linux-2.6-denx.git;a=commitdiff;h=212f61c7fd3b952a81d1459dd32a86a32ddfd4ce
>>
>> Signed-off-by: Sean Hudson <sean_hudson@mentor.com>
>> ---
>>  init/Kconfig           |  12 +++
>>  init/main.c            |   2 +
>>  kernel/printk/printk.c | 267 +++++++++++++++++++++++++++++++++++++++++++++++--
>>  3 files changed, 275 insertions(+), 6 deletions(-)
>>
>> diff --git a/init/Kconfig b/init/Kconfig
>> index cac3f09..746183b 100644
>> --- a/init/Kconfig
>> +++ b/init/Kconfig
>> @@ -1493,6 +1493,18 @@ config PRINTK
>>  	  very difficult to diagnose system problems, saying N here is
>>  	  strongly discouraged.
>>  
>> +config LOGBUFFER
>> +	bool "External logbuffer" if PRINTK
>> +	default n
>> +	depends on PRINTK
>> +	help
>> +	  This option enables support for an alternative, "external" printk log
>> +	  buffer. If memory contents are preserved, e.g. after a warmboot, this
>> +	  provides a known location for the boot loader to read and display printk
>> +	  entries from the kernel.  If desired, the bootloader can write its own
>> +	  log entries which the kernel will display with its own log entries.
>> +	  Further, this capability can be used across multiple warmboot cycles.
>> +
>>  config PRINTK_NMI
>>  	def_bool y
>>  	depends on PRINTK
>> diff --git a/init/main.c b/init/main.c
>> index a8a58e2..4a5913c 100644
>> --- a/init/main.c
>> +++ b/init/main.c
>> @@ -93,6 +93,7 @@ static int kernel_init(void *);
>>  extern void init_IRQ(void);
>>  extern void fork_init(void);
>>  extern void radix_tree_init(void);
>> +extern void setup_ext_logbuff(void);
>>  
>>  /*
>>   * Debug helper: via this flag we know that we are in 'early bootup code'
>> @@ -535,6 +536,7 @@ asmlinkage __visible void __init start_kernel(void)
>>  	sort_main_extable();
>>  	trap_init();
>>  	mm_init();
>> +	setup_ext_logbuff();
>>  
>>  	/*
>>  	 * Set up the scheduler prior starting any interrupts (such as the
>> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> index 7a441f5..017b4d4 100644
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -395,13 +402,42 @@ struct lcb_t {
>> +/*
>> + * The bootloader may write an unexpected version/format of the log, so
>> + * these values provide a way to check that the format agrees.
>> + */
>> +#define LOGBUFF_MAGIC			0xc0de4ced
>> +#define LOGBUFF_LOG_VERSION		3
>> +#define LOGBUFF_CB_PADDED_LENGTH	1024
>>  
>>  static struct lcb_t __lcb = {
>> +	__log_buf, __LOG_BUF_LEN, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
>> +	LOGBUFF_LOG_VERSION, LOGBUFF_CB_PADDED_LENGTH, sizeof(struct lcb_t),
>> +	sizeof(struct printk_log), 0, LOGBUFF_MAGIC
>> +};
>> +#else
>> +static struct lcb_t __lcb = {
>>  	__log_buf, __LOG_BUF_LEN, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
>> +#endif /* CONFIG_LOGBUFFER */
>>  
>>  static struct lcb_t *lcb = &__lcb;
>>  
>> +void setup_ext_logbuff(void);
>> +
>>  /* Return log buffer address */
>>  char *log_buf_addr_get(void)
>>  {
>> +/*
>> + * The bootloader may write a different version of the log than expected.
>> + * So, check that the incoming lcb matches that values that the kernel
>> + * expects.
>> + */
>> +static bool is_valid_external_cb(const struct lcb_t * const new_lcb)
>> +{
>> +	/* Validate the new control buffer */
>> +	return (new_lcb &&
>> +		new_lcb->log_version == LOGBUFF_LOG_VERSION &&
>> +		new_lcb->lcb_magic == LOGBUFF_MAGIC &&
> 
> The bootloader might be able to write the messages in different
> formats. How would it know which one will be the right one
> for the given kernel, please?
> 
> What are the plans? Do you want to add and maintain the compatibility
> in U-boot or also in other bootloaders?
> 
> BTW: I always wanted to put all these infornation into a struct.
> Also I wanted to rename struct printk_log to a less confusing
> name, e.g. printk_msg. The problem is that these structures
> are used also by external aplications, .e.g crash, makedumpfine
> that need to be updated as well. It is a pain. So, I am a bit
> scared to add even more external applications that depend
> on these structures.
> 
> 
>> +		new_lcb->lcb_padded_len == LOGBUFF_CB_PADDED_LENGTH &&
>> +		new_lcb->lcb_size == sizeof(struct lcb_t) &&
>> +		new_lcb->log_hdr_size == sizeof(struct printk_log) &&
>> +		new_lcb->lcb_size <= new_lcb->lcb_padded_len &&
>> +		new_lcb->log_first_idx <= new_lcb->log_buf_len &&
>> +		new_lcb->log_next_idx <= new_lcb->log_buf_len &&
>> +		new_lcb->syslog_idx <= new_lcb->log_buf_len &&
>> +		new_lcb->console_idx <= new_lcb->log_buf_len &&
>> +		new_lcb->clear_idx <= new_lcb->log_buf_len &&
>> +		new_lcb->log_phys_addr != 0);
>> +}
>> +
>> +/*
>> + * Copy the kernel log entries that have occurred so far in this boot cycle
>> + * into the new log buffer location found in the log control block specified
>> + * on the kernel command line. Then switch to using that location from this
>> + * point onward.
>> + * To ensure correctness, we must consider :
>> + *  - the bootloader may be using a different structure
>> + *  - size of the external buffer may not match the size of the default
>> + *  - an unknown number of bootloader log entries may be present
>> + *  - an unknown number of kernel log entries may be present
>> + *  - wrapping of the bootloader log may have already occurred (unlikely)
>> + *  - wrapping of the kernel log may have already occurred (unlikely)
>> + *
>> + * Memory Reservation
>> + * ------------------
>> + *  This feature expects the bootloader to reserve/preserve the shared buffer
>> + *  memory. This reservation needs to prevent the kernel from overwriting the
>> + *  external log control block and log entries. In my testing, I've used the
>> + *  'fdt' commands in uboot to dynamically inject reserved memory regions via
>> + *  the DT to the kernel.
>> + *
>> + */
>> +void __init setup_ext_logbuff(void)
>> +{
>> +	struct printk_log *nextblmsg, *currkmsg;
>> +	u64 currkmsg_seq;
>> +	u32 currkmsg_idx, offset_idx;
>> +	struct lcb_t *new_lcb;
>> +	unsigned long flags;
>> +	int ret;
>> +
>> +	if (!lcb_phys_addr)
>> +		return;
>> +
>> +	new_lcb = phys_to_virt(lcb_phys_addr);
>> +	new_lcb->log_buf = phys_to_virt(new_lcb->log_phys_addr);
>> +
>> +	/* Validate the new control buffer */
>> +	if (!is_valid_external_cb(new_lcb)) {
>> +		pr_warn("log : External log cb failed validation. Ignored\n");
>> +		pr_debug
>> +		    ("Kernel logbuffer of size %u at 0x%p (virtual address)\n",
>> +		     lcb->log_buf_len, lcb->log_buf);
>> +		return;
>> +	}
>> +
>> +	if (!new_log_buf_len)
>> +		pr_warn("log_buf_len ignored in favor of external log\n");
>> +
>> +	pr_debug
>> +	    ("log : Replacing cb @ 0x%p of %lu bytes with cb @ 0x%p of %d/%d bytes\n",
>> +	     lcb, sizeof(struct lcb_t), new_lcb, new_lcb->lcb_size,
>> +	     new_lcb->lcb_padded_len);
>> +
>> +	/* Capture the virtual address for the buffer */
>> +	new_lcb->log_buf = phys_to_virt(new_lcb->log_phys_addr);
>> +	pr_debug("log : Replacing buffer @ 0x%p (VA) of %d bytes with one @ 0x%p (VA) of %d bytes\n",
>> +		 lcb->log_buf, lcb->log_buf_len, new_lcb->log_buf,
>> +		 new_lcb->log_buf_len);
>> +
>> +	raw_spin_lock_irqsave(&logbuf_lock, flags);
>> +
>> +	/*
>> +	 * Offset sequence numbers by the number of existing bootloader msgs
>> +	 * NOTE: Functions above handle if the first seq becomes > than these
>> +	 */
>> +	if (lcb->syslog_seq != new_lcb->log_first_seq)
>> +		new_lcb->syslog_seq = lcb->syslog_seq + new_lcb->log_next_seq;
>> +	if (lcb->console_seq != new_lcb->log_first_seq)
>> +		new_lcb->console_seq = lcb->console_seq + new_lcb->log_next_seq;
>> +	if (lcb->clear_seq != new_lcb->log_first_seq)
>> +		new_lcb->clear_seq = lcb->clear_seq + new_lcb->log_next_seq;
>> +
>> +	/* Copy kernel specific data into the new/shared control block. */
>> +	new_lcb->syslog_prev = lcb->syslog_prev;
>> +	new_lcb->syslog_partial = lcb->syslog_partial;
>> +	new_lcb->console_prev = lcb->console_prev;
>> +
>> +	/*
>> +	 * Copy existing entries one at a time into the external buffer space.
>> +	 * The new CB will incrementally update as each kernel entry is added.
>> +	 * The existing CB will not be modified.
>> +	 */
>> +	for (currkmsg_seq = lcb->log_first_seq,
>> +	     currkmsg_idx = lcb->log_first_idx;
>> +	     currkmsg_seq <= lcb->log_next_seq &&
>> +	     currkmsg_idx < lcb->log_buf_len;
>> +	     currkmsg_idx = log_next(currkmsg_idx), currkmsg_seq++) {
>> +		currkmsg = log_from_idx_and_base(lcb->log_buf, currkmsg_idx);
>> +
>> +		/* If necessary, free bl entries to fit kernel entries */
>> +		while (new_lcb->log_first_seq < new_lcb->log_next_seq) {
>> +			u32 free;
>> +
>> +			if (new_lcb->log_next_idx > new_lcb->log_first_idx)
>> +				free = max(new_lcb->log_buf_len -
>> +					   new_lcb->log_next_idx,
>> +					   new_lcb->log_first_idx);
>> +			else
>> +				free = new_lcb->log_first_idx -
>> +				    new_lcb->log_next_idx;
>> +
>> +			if (free > currkmsg->len + sizeof(struct printk_log))
>> +				break;
>> +
>> +			/* Drop oldest messages until there is enough space */
>> +			new_lcb->log_first_idx =
>> +			    log_next_idx_from_idx(new_lcb->log_buf,
>> +						  new_lcb->log_first_idx);
>> +			new_lcb->log_first_seq++;
>> +		}
>> +
>> +		nextblmsg = (struct printk_log *)
>> +		    (new_lcb->log_buf + new_lcb->log_next_idx);
>> +
>> +		/* Check if the next record will fit along with a wrap record */
>> +		if (new_lcb->log_next_idx + currkmsg->len +
>> +		    sizeof(struct printk_log) >= new_lcb->log_buf_len) {
>> +			memset(nextblmsg, 0, sizeof(struct printk_log));
>> +			nextblmsg = (struct printk_log *)new_lcb->log_buf;
>> +			new_lcb->log_next_idx = 0;
>> +		}
>> +		/* Copy kernel message to bootloader buffer */
>> +		memcpy(nextblmsg, currkmsg, currkmsg->len);
>> +
>> +		/* Before moving to the next kernel entry, adjust indices */
>> +		if (currkmsg_seq == lcb->log_first_seq)
>> +			offset_idx = new_lcb->log_first_seq;
>> +		else
>> +			offset_idx = new_lcb->log_next_idx;
>> +
>> +		if (currkmsg_seq == lcb->syslog_seq)
>> +			new_lcb->syslog_idx = offset_idx;
>> +		if (currkmsg_seq == lcb->console_seq)
>> +			new_lcb->console_idx = offset_idx;
>> +		if (currkmsg_seq == lcb->clear_seq)
>> +			new_lcb->clear_idx = offset_idx;
>> +
>> +		/* Update next offset and sequence */
>> +		if (currkmsg_seq < lcb->log_next_seq) {
>> +			new_lcb->log_next_idx += currkmsg->len;
>> +			new_lcb->log_next_seq++;
>> +		}
>> +	}
>> +
>> +	/* Overwrite the default lcb pointer, buffer pointer, and length. */
>> +	lcb = new_lcb;
>> +	lcb->log_buf = (char *)new_lcb->log_buf;
>> +	lcb->log_buf_len = new_lcb->log_buf_len;
> 
> Is the bootloader allowed to write to the shared buffer from this
> point on? How will be synchronized the writes from the kernel and
> the bootloader?

I am a bit confused by this question.  I assume that the kernel and
bootloader will not be accessing the same log space simultaneously.
While each one is running, they will utilize the same data structures to
keep track of current entries.

> 
> How protected will the address of the shared buffer? Would not
> it open a security hole? We need to make sure that non-privileged
> user or non-trusted application must not read sensitive data
> from the log or break the structure of the log by writing
> invalid data.
> 

There is no notion of protection of the address.  However, additional
checking could be added. Currently, there is potential for corrupt log
entries from the bootloader causing the kernel to crash.  I can't think
of a way to prevent that from happening.  Also, as you point out, a
bootloader could read log contents from the shared region.  Of course,
in order for either to happen, the bootloader would already be
compromised and I'm not sure that reading log entries or crashing the
kernel is such a big consideration.

-- 
Sean

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v1 2/2] printk: external log buffer (CONFIG_LOGBUFFER)
  2016-10-01  4:06     ` Sean Hudson
@ 2016-10-04 11:27       ` Petr Mladek
  2016-10-04 17:55         ` Sean Hudson
  0 siblings, 1 reply; 16+ messages in thread
From: Petr Mladek @ 2016-10-04 11:27 UTC (permalink / raw)
  To: Sean Hudson
  Cc: linux-kernel, Andrew Morton, Thomas Gleixner, Yang Shi,
	Ingo Molnar, Prarit Bhargava, Rasmus Villemoes, Paul E. McKenney,
	Yaowei Bai, Kees Cook, Andrey Ryabinin, Johannes Weiner,
	Tejun Heo, Arnd Bergmann, Ard Biesheuvel, Thomas Garnier,
	Nicolas Pitre, Sergey Senozhatsky, Borislav Petkov, Tim Bird,
	Steven Rostedt, Grant Likely

On Fri 2016-09-30 23:06:49, Sean Hudson wrote:
> On 9/30/2016 7:39 AM, Petr Mladek wrote:
> > On Thu 2016-09-29 19:55:56, Sean Hudson wrote:
> >> +	/* Overwrite the default lcb pointer, buffer pointer, and length. */
> >> +	lcb = new_lcb;
> >> +	lcb->log_buf = (char *)new_lcb->log_buf;
> >> +	lcb->log_buf_len = new_lcb->log_buf_len;
> > 
> > Is the bootloader allowed to write to the shared buffer from this
> > point on? How will be synchronized the writes from the kernel and
> > the bootloader?
> 
> I am a bit confused by this question.  I assume that the kernel and
> bootloader will not be accessing the same log space simultaneously.
> While each one is running, they will utilize the same data structures to
> keep track of current entries.

They both will write new messages at the end of the buffer. Therefore
they both might want to write to exactly the same location. Also if
the buffer is full, the oldest messages are overwritten. Therefore
even reading must be synchronized against writes.

These operations are synchronized using logbuf_lock on the kernel side.
I do not know much about bootloaders but I guess that the bootloader
will not be able to use the logbuf_lock. So the question is how
the bootloader and kernel will synchronize the access to the shared
buffer.

Or did I miss anything?


> > How protected will the address of the shared buffer? Would not
> > it open a security hole? We need to make sure that non-privileged
> > user or non-trusted application must not read sensitive data
> > from the log or break the structure of the log by writing
> > invalid data.
> 
> There is no notion of protection of the address.  However, additional
> checking could be added. Currently, there is potential for corrupt log
> entries from the bootloader causing the kernel to crash.
> I can't think of a way to prevent that from happening.

IMHO, the only way is to revisit all locations when the log buffer
is accessed and add all the needed consistency checks. We must make
sure that we do not access outside of the buffer. Also the log_*_seq
numbers are supposed to grow linearly...


>  Also, as you point out, a bootloader could read log contents from
> the shared region. Of course, in order for either to happen,
> the bootloader would already be compromised and I'm not sure that
>  reading log entries or crashing the kernel is such a big consideration.

The log might contain addresses of some kernel structures, for example
in some error messages. For this reason, only a privileged users are
able to read it. We must make sure that they will not get accessible
for a non-privileged user, for example via an address to the shared
buffer in the bootloader config.

BTW: You did not answered the question about how the bootloader would
know the right version of the log format. I am afraid that we do not
want to maintain backward compatibility on the kernel side. The printk
code already is too complex.

Also you did not answered the question about your plans. I wonder
which bootloader would be able to use such a feature.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v1 2/2] printk: external log buffer (CONFIG_LOGBUFFER)
  2016-10-04 11:27       ` Petr Mladek
@ 2016-10-04 17:55         ` Sean Hudson
  2016-10-05 12:48           ` Petr Mladek
  0 siblings, 1 reply; 16+ messages in thread
From: Sean Hudson @ 2016-10-04 17:55 UTC (permalink / raw)
  To: Petr Mladek
  Cc: linux-kernel, Andrew Morton, Thomas Gleixner, Yang Shi,
	Ingo Molnar, Prarit Bhargava, Rasmus Villemoes, Paul E. McKenney,
	Yaowei Bai, Kees Cook, Andrey Ryabinin, Johannes Weiner,
	Tejun Heo, Arnd Bergmann, Ard Biesheuvel, Thomas Garnier,
	Nicolas Pitre, Sergey Senozhatsky, Borislav Petkov, Tim Bird,
	Steven Rostedt, Grant Likely


[-- Attachment #1.1: Type: text/plain, Size: 5638 bytes --]

On 10/4/2016 6:27 AM, Petr Mladek wrote:
> On Fri 2016-09-30 23:06:49, Sean Hudson wrote:
>> On 9/30/2016 7:39 AM, Petr Mladek wrote:
>>>
>>> Is the bootloader allowed to write to the shared buffer from this
>>> point on? How will be synchronized the writes from the kernel and
>>> the bootloader?
>>
>> I am a bit confused by this question.  I assume that the kernel and
>> bootloader will not be accessing the same log space simultaneously.
>> While each one is running, they will utilize the same data structures to
>> keep track of current entries.
> 
> They both will write new messages at the end of the buffer. Therefore
> they both might want to write to exactly the same location. Also if
> the buffer is full, the oldest messages are overwritten. Therefore
> even reading must be synchronized against writes.
> 
> These operations are synchronized using logbuf_lock on the kernel side.
> I do not know much about bootloaders but I guess that the bootloader
> will not be able to use the logbuf_lock. So the question is how
> the bootloader and kernel will synchronize the access to the shared
> buffer.
> 
> Or did I miss anything?

The state of the log is preserved in the log control block, or lcb,
which is passed when the boot loader passes control to the kernel.  In
order to make the transition as seamless as possible, the boot loader
appends entries in the same way that the kernel does and overwrites the
oldest entries once the ring buffer is full.  It also updates tracking
indices in the lcb.  When the boot loader passes control to the kernel,
the kernel adopts the lcb and appends entries using that structure.  In
this way, the lcb provides continuity between the boot loader and the
kernel for the state of the buffer.

Note, there is a small window of time during early boot in which the
kernel puts entries into the default location, a staticlly declared
char[].  These entries are appended to the external location once
setup_ext_logbuff() is called from init/main.c.  (I'd love to close that
gap entirely, but all solutions I've tried involve arch specific code,
which I'm trying to avoid.)

>>> How protected will the address of the shared buffer? Would not
>>> it open a security hole? We need to make sure that non-privileged
>>> user or non-trusted application must not read sensitive data
>>> from the log or break the structure of the log by writing
>>> invalid data.
>>
>> There is no notion of protection of the address.  However, additional
>> checking could be added. Currently, there is potential for corrupt log
>> entries from the bootloader causing the kernel to crash.
>> I can't think of a way to prevent that from happening.
> 
> IMHO, the only way is to revisit all locations when the log buffer
> is accessed and add all the needed consistency checks. We must make
> sure that we do not access outside of the buffer. Also the log_*_seq
> numbers are supposed to grow linearly...

Adding an explicit walk of the external log entries should be trivial.
That will only validate that the passed in log entries are consistent
internally.  I will be submitting a v2 patch shortly that fixes an error
that kernel-ci caught. (if CONFIG_PRINTK is disabled, compilation fails)
I'll see about adding that check at the same time.

>>  Also, as you point out, a bootloader could read log contents from
>> the shared region. Of course, in order for either to happen,
>> the bootloader would already be compromised and I'm not sure that
>>  reading log entries or crashing the kernel is such a big consideration.
> 
> The log might contain addresses of some kernel structures, for example
> in some error messages. For this reason, only a privileged users are
> able to read it. We must make sure that they will not get accessible
> for a non-privileged user, for example via an address to the shared
> buffer in the bootloader config.

Ahhh, if I am understanding correctly, your concern is that an
unprivileged program in user space would be able to gain access directly
to the log entries via the external buffer location.  I haven't tested
that scenario, so I'm not sure if the memory reservation via the DT
protects that memory from user space.  I'll have to check on that.  In
the meantime, if you know of standard ways to protect memory from user
space access, please let me know.

> BTW: You did not answered the question about how the bootloader would
> know the right version of the log format. I am afraid that we do not
> want to maintain backward compatibility on the kernel side. The printk
> code already is too complex.

This is a matter of magic numbers.  Uboot has two preexisting log
structure versions. Since the kernel did not have a version, I used
LOGBUFF_LOG_VERSION = 3, if CONFIG_LOGBUFFER is enabled.  So, if the
boot loader passes the command line parameter 'lcb_phys_addr' and the
contents of that address pass the validation checks found in
is_valid_external_cb(), then the external buffer is used.  Those checks
include matching the version number, structure sizes match, etc.
Otherwise, the external buffer is ignored.  This should allow for the
kernel to change in the future, e.g. v4, and continue to run correctly
even if an older boot loader passes a v3 version of the log entries.

> Also you did not answered the question about your plans. I wonder
> which bootloader would be able to use such a feature.
> 

Sorry about that, I missed the question in your previous reply.  I am
preparing to submit a patch set to mainline uBoot that adds this capability.



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC PATCH v2 0/2] printk: Shared kernel logging
  2016-09-30  0:55 [RFC PATCH v1 0/2] printk: Shared kernel logging Sean Hudson
                   ` (2 preceding siblings ...)
  2016-09-30  1:36 ` [RFC PATCH v1 0/2] printk: Shared kernel logging Kees Cook
@ 2016-10-04 21:08 ` Sean Hudson
  2016-10-04 21:08   ` [RFC PATCH v2 1/2] printk: collect offsets into replaceable structure Sean Hudson
  2016-10-04 21:08   ` [RFC PATCH v2 2/2] printk: external log buffer (CONFIG_LOGBUFFER) Sean Hudson
  3 siblings, 2 replies; 16+ messages in thread
From: Sean Hudson @ 2016-10-04 21:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sean Hudson, Andrew Morton, Thomas Gleixner, Yang Shi,
	Ingo Molnar, Prarit Bhargava, Rasmus Villemoes, Paul E. McKenney,
	Yaowei Bai, Kees Cook, Andrey Ryabinin, Johannes Weiner,
	Tejun Heo, Arnd Bergmann, Ard Biesheuvel, Thomas Garnier,
	Petr Mladek, Nicolas Pitre, Sergey Senozhatsky, Borislav Petkov,
	Tim Bird, Steven Rostedt, Grant Likely

This patch set is based on Linus' v4.8-rc8 tag.

This debug feature allows the kernel to use an external buffer and control
block for kernel log messages. The feature is controlled by an optional
command line parameter. The existing buffer and control block can contain
existing log messages from previous boot cycles and/or the bootloader. The
command line parameter was chosen for flexibility, cross arch portability,
and the ability to dynamically enable/disable this feature. The parameter
specifies the address of a control block used to replace the default log
buffer. Existing bootloader and kernel log messages are kept, in order,
inside the new buffer. After a boot that preserves the buffer contents, a
bootloader can display both kernel and bootloader log entries from multiple,
previous boots. It also allows the kernel to display bootloader log entries
along with its own messages.

This feature is intended for debug purposes and has no effect unless the
command line parameter is specified. Further, it validates the passed
control block carefully and if any checks fail, it falls back to the default
behaviour. As such, it can be left enabled by default.

Memory Reservation

This feature expects the bootloader to reserve/preserve the shared buffer
memory. This reservation needs to prevent the kernel from overwriting the
external log control block and log entries. In my testing, I've used the
'fdt' commands in uboot to dynamically inject reserved memory regions via
the DT to the kernel.

Sean Hudson (2):
  printk: collect offsets into replaceable structure
  printk: external log buffer (CONFIG_LOGBUFFER)

 init/Kconfig           |  12 +
 init/main.c            |   7 +
 kernel/printk/printk.c | 599 +++++++++++++++++++++++++++++++++++--------------
 3 files changed, 451 insertions(+), 167 deletions(-)


Changes since v1: Fix CONFIG_PRINTK disabled build

-- 
2.7.4

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC PATCH v2 1/2] printk: collect offsets into replaceable structure
  2016-10-04 21:08 ` [RFC PATCH v2 " Sean Hudson
@ 2016-10-04 21:08   ` Sean Hudson
  2016-10-04 21:08   ` [RFC PATCH v2 2/2] printk: external log buffer (CONFIG_LOGBUFFER) Sean Hudson
  1 sibling, 0 replies; 16+ messages in thread
From: Sean Hudson @ 2016-10-04 21:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sean Hudson, Andrew Morton, Thomas Gleixner, Yang Shi,
	Ingo Molnar, Prarit Bhargava, Rasmus Villemoes, Paul E. McKenney,
	Yaowei Bai, Kees Cook, Andrey Ryabinin, Johannes Weiner,
	Tejun Heo, Arnd Bergmann, Ard Biesheuvel, Thomas Garnier,
	Petr Mladek, Nicolas Pitre, Sergey Senozhatsky, Borislav Petkov,
	Tim Bird, Steven Rostedt, Grant Likely

Currently, printk relies on several indices that are declared as static
global variables in printk.c.  This patch collects those into a single
structure referenced by a pointer.  This allows easier replacement of
these indices and pinning to a specific locatino.

Signed-off-by: Sean Hudson <sean_hudson@mentor.com>
---
 kernel/printk/printk.c | 338 +++++++++++++++++++++++++------------------------
 1 file changed, 175 insertions(+), 163 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index eea6dbc..063f01d 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -351,30 +351,40 @@ __packed __aligned(4)
  */
 DEFINE_RAW_SPINLOCK(logbuf_lock);
 
+/*
+ * This control block collects tracking offsets for the log.
+ */
+struct lcb_t {
+	/* Pointer to log buffer space and length of space */
+	char *log_buf;
+	u32 log_buf_len;
+
+	/* index and sequence of the first record stored in the buffer */
+	u64 log_first_seq;
+	u32 log_first_idx;
+
+	/* index and sequence of the next record to store in the buffer */
+	u64 log_next_seq;
+	u32 log_next_idx;
+
+	/* the next printk record to read by syslog(READ) or /proc/kmsg */
+	u64 syslog_seq;
+	u32 syslog_idx;
+	enum log_flags syslog_prev;
+	size_t syslog_partial;
+
+	/* the next printk record to write to the console */
+	u64 console_seq;
+	u32 console_idx;
+	enum log_flags console_prev;
+
+	/* the next printk record to read after the last 'clear' command */
+	u64 clear_seq;
+	u32 clear_idx;
+};
+
 #ifdef CONFIG_PRINTK
 DECLARE_WAIT_QUEUE_HEAD(log_wait);
-/* the next printk record to read by syslog(READ) or /proc/kmsg */
-static u64 syslog_seq;
-static u32 syslog_idx;
-static enum log_flags syslog_prev;
-static size_t syslog_partial;
-
-/* index and sequence number of the first record stored in the buffer */
-static u64 log_first_seq;
-static u32 log_first_idx;
-
-/* index and sequence number of the next record to store in the buffer */
-static u64 log_next_seq;
-static u32 log_next_idx;
-
-/* the next printk record to write to the console */
-static u64 console_seq;
-static u32 console_idx;
-static enum log_flags console_prev;
-
-/* the next printk record to read after the last 'clear' command */
-static u64 clear_seq;
-static u32 clear_idx;
 
 #define PREFIX_MAX		32
 #define LOG_LINE_MAX		(1024 - PREFIX_MAX)
@@ -386,19 +396,22 @@ static u32 clear_idx;
 #define LOG_ALIGN __alignof__(struct printk_log)
 #define __LOG_BUF_LEN (1 << CONFIG_LOG_BUF_SHIFT)
 static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
-static char *log_buf = __log_buf;
-static u32 log_buf_len = __LOG_BUF_LEN;
+
+static struct lcb_t __lcb = {
+	__log_buf, __LOG_BUF_LEN, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
+
+static struct lcb_t *lcb = &__lcb;
 
 /* Return log buffer address */
 char *log_buf_addr_get(void)
 {
-	return log_buf;
+	return lcb->log_buf;
 }
 
 /* Return log buffer size */
 u32 log_buf_len_get(void)
 {
-	return log_buf_len;
+	return lcb->log_buf_len;
 }
 
 /* human readable text of the record */
@@ -416,21 +429,21 @@ static char *log_dict(const struct printk_log *msg)
 /* get record by index; idx must point to valid msg */
 static struct printk_log *log_from_idx(u32 idx)
 {
-	struct printk_log *msg = (struct printk_log *)(log_buf + idx);
+	struct printk_log *msg = (struct printk_log *)(lcb->log_buf + idx);
 
 	/*
 	 * A length == 0 record is the end of buffer marker. Wrap around and
 	 * read the message at the start of the buffer.
 	 */
 	if (!msg->len)
-		return (struct printk_log *)log_buf;
+		return (struct printk_log *)lcb->log_buf;
 	return msg;
 }
 
 /* get next record; idx must point to valid msg */
 static u32 log_next(u32 idx)
 {
-	struct printk_log *msg = (struct printk_log *)(log_buf + idx);
+	struct printk_log *msg = (struct printk_log *)(lcb->log_buf + idx);
 
 	/* length == 0 indicates the end of the buffer; wrap */
 	/*
@@ -439,7 +452,7 @@ static u32 log_next(u32 idx)
 	 * return the one after that.
 	 */
 	if (!msg->len) {
-		msg = (struct printk_log *)log_buf;
+		msg = (struct printk_log *)lcb->log_buf;
 		return msg->len;
 	}
 	return idx + msg->len;
@@ -458,10 +471,11 @@ static int logbuf_has_space(u32 msg_size, bool empty)
 {
 	u32 free;
 
-	if (log_next_idx > log_first_idx || empty)
-		free = max(log_buf_len - log_next_idx, log_first_idx);
+	if (lcb->log_next_idx > lcb->log_first_idx || empty)
+		free = max(lcb->log_buf_len - lcb->log_next_idx,
+				lcb->log_first_idx);
 	else
-		free = log_first_idx - log_next_idx;
+		free = lcb->log_first_idx - lcb->log_next_idx;
 
 	/*
 	 * We need space also for an empty header that signalizes wrapping
@@ -472,20 +486,20 @@ static int logbuf_has_space(u32 msg_size, bool empty)
 
 static int log_make_free_space(u32 msg_size)
 {
-	while (log_first_seq < log_next_seq &&
+	while (lcb->log_first_seq < lcb->log_next_seq &&
 	       !logbuf_has_space(msg_size, false)) {
 		/* drop old messages until we have enough contiguous space */
-		log_first_idx = log_next(log_first_idx);
-		log_first_seq++;
+		lcb->log_first_idx = log_next(lcb->log_first_idx);
+		lcb->log_first_seq++;
 	}
 
-	if (clear_seq < log_first_seq) {
-		clear_seq = log_first_seq;
-		clear_idx = log_first_idx;
+	if (lcb->clear_seq < lcb->log_first_seq) {
+		lcb->clear_seq = lcb->log_first_seq;
+		lcb->clear_idx = lcb->log_first_idx;
 	}
 
 	/* sequence numbers are equal, so the log buffer is empty */
-	if (logbuf_has_space(msg_size, log_first_seq == log_next_seq))
+	if (logbuf_has_space(msg_size, lcb->log_first_seq == lcb->log_next_seq))
 		return 0;
 
 	return -ENOMEM;
@@ -518,7 +532,7 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len,
 	 * The message should not take the whole buffer. Otherwise, it might
 	 * get removed too soon.
 	 */
-	u32 max_text_len = log_buf_len / MAX_LOG_TAKE_PART;
+	u32 max_text_len = lcb->log_buf_len / MAX_LOG_TAKE_PART;
 	if (*text_len > max_text_len)
 		*text_len = max_text_len;
 	/* enable the warning message */
@@ -551,18 +565,20 @@ static int log_store(int facility, int level,
 			return 0;
 	}
 
-	if (log_next_idx + size + sizeof(struct printk_log) > log_buf_len) {
+	if (lcb->log_next_idx + size + sizeof(struct printk_log) >
+			lcb->log_buf_len) {
 		/*
 		 * This message + an additional empty header does not fit
 		 * at the end of the buffer. Add an empty header with len == 0
 		 * to signify a wrap around.
 		 */
-		memset(log_buf + log_next_idx, 0, sizeof(struct printk_log));
-		log_next_idx = 0;
+		memset(lcb->log_buf + lcb->log_next_idx, 0,
+				sizeof(struct printk_log));
+		lcb->log_next_idx = 0;
 	}
 
 	/* fill message */
-	msg = (struct printk_log *)(log_buf + log_next_idx);
+	msg = (struct printk_log *)(lcb->log_buf + lcb->log_next_idx);
 	memcpy(log_text(msg), text, text_len);
 	msg->text_len = text_len;
 	if (trunc_msg_len) {
@@ -582,8 +598,8 @@ static int log_store(int facility, int level,
 	msg->len = size;
 
 	/* insert message */
-	log_next_idx += msg->len;
-	log_next_seq++;
+	lcb->log_next_idx += msg->len;
+	lcb->log_next_seq++;
 
 	return msg->text_len;
 }
@@ -801,7 +817,7 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
 	if (ret)
 		return ret;
 	raw_spin_lock_irq(&logbuf_lock);
-	while (user->seq == log_next_seq) {
+	while (user->seq == lcb->log_next_seq) {
 		if (file->f_flags & O_NONBLOCK) {
 			ret = -EAGAIN;
 			raw_spin_unlock_irq(&logbuf_lock);
@@ -810,16 +826,16 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
 
 		raw_spin_unlock_irq(&logbuf_lock);
 		ret = wait_event_interruptible(log_wait,
-					       user->seq != log_next_seq);
+					       user->seq != lcb->log_next_seq);
 		if (ret)
 			goto out;
 		raw_spin_lock_irq(&logbuf_lock);
 	}
 
-	if (user->seq < log_first_seq) {
+	if (user->seq < lcb->log_first_seq) {
 		/* our last seen message is gone, return error and reset */
-		user->idx = log_first_idx;
-		user->seq = log_first_seq;
+		user->idx = lcb->log_first_idx;
+		user->seq = lcb->log_first_seq;
 		ret = -EPIPE;
 		raw_spin_unlock_irq(&logbuf_lock);
 		goto out;
@@ -866,8 +882,8 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence)
 	switch (whence) {
 	case SEEK_SET:
 		/* the first record */
-		user->idx = log_first_idx;
-		user->seq = log_first_seq;
+		user->idx = lcb->log_first_idx;
+		user->seq = lcb->log_first_seq;
 		break;
 	case SEEK_DATA:
 		/*
@@ -875,13 +891,13 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence)
 		 * like issued by 'dmesg -c'. Reading /dev/kmsg itself
 		 * changes no global state, and does not clear anything.
 		 */
-		user->idx = clear_idx;
-		user->seq = clear_seq;
+		user->idx = lcb->clear_idx;
+		user->seq = lcb->clear_seq;
 		break;
 	case SEEK_END:
 		/* after the last record */
-		user->idx = log_next_idx;
-		user->seq = log_next_seq;
+		user->idx = lcb->log_next_idx;
+		user->seq = lcb->log_next_seq;
 		break;
 	default:
 		ret = -EINVAL;
@@ -901,9 +917,9 @@ static unsigned int devkmsg_poll(struct file *file, poll_table *wait)
 	poll_wait(file, &log_wait, wait);
 
 	raw_spin_lock_irq(&logbuf_lock);
-	if (user->seq < log_next_seq) {
+	if (user->seq < lcb->log_next_seq) {
 		/* return error when data has vanished underneath us */
-		if (user->seq < log_first_seq)
+		if (user->seq < lcb->log_first_seq)
 			ret = POLLIN|POLLRDNORM|POLLERR|POLLPRI;
 		else
 			ret = POLLIN|POLLRDNORM;
@@ -939,8 +955,8 @@ static int devkmsg_open(struct inode *inode, struct file *file)
 	mutex_init(&user->lock);
 
 	raw_spin_lock_irq(&logbuf_lock);
-	user->idx = log_first_idx;
-	user->seq = log_first_seq;
+	user->idx = lcb->log_first_idx;
+	user->seq = lcb->log_first_seq;
 	raw_spin_unlock_irq(&logbuf_lock);
 
 	file->private_data = user;
@@ -981,11 +997,11 @@ const struct file_operations kmsg_fops = {
  */
 void log_buf_kexec_setup(void)
 {
-	VMCOREINFO_SYMBOL(log_buf);
-	VMCOREINFO_SYMBOL(log_buf_len);
-	VMCOREINFO_SYMBOL(log_first_idx);
-	VMCOREINFO_SYMBOL(clear_idx);
-	VMCOREINFO_SYMBOL(log_next_idx);
+	VMCOREINFO_SYMBOL(lcb->log_buf);
+	VMCOREINFO_SYMBOL(lcb->log_buf_len);
+	VMCOREINFO_SYMBOL(lcb->log_first_idx);
+	VMCOREINFO_SYMBOL(lcb->clear_idx);
+	VMCOREINFO_SYMBOL(lcb->log_next_idx);
 	/*
 	 * Export struct printk_log size and field offsets. User space tools can
 	 * parse it and detect any changes to structure down the line.
@@ -1006,7 +1022,7 @@ static void __init log_buf_len_update(unsigned size)
 {
 	if (size)
 		size = roundup_pow_of_two(size);
-	if (size > log_buf_len)
+	if (size > lcb->log_buf_len)
 		new_log_buf_len = size;
 }
 
@@ -1060,7 +1076,7 @@ void __init setup_log_buf(int early)
 	char *new_log_buf;
 	int free;
 
-	if (log_buf != __log_buf)
+	if (!lcb || lcb->log_buf != __log_buf)
 		return;
 
 	if (!early && !new_log_buf_len)
@@ -1084,14 +1100,14 @@ void __init setup_log_buf(int early)
 	}
 
 	raw_spin_lock_irqsave(&logbuf_lock, flags);
-	log_buf_len = new_log_buf_len;
-	log_buf = new_log_buf;
+	lcb->log_buf_len = new_log_buf_len;
+	lcb->log_buf = new_log_buf;
 	new_log_buf_len = 0;
-	free = __LOG_BUF_LEN - log_next_idx;
-	memcpy(log_buf, __log_buf, __LOG_BUF_LEN);
+	free = __LOG_BUF_LEN - lcb->log_next_idx;
+	memcpy(lcb->log_buf, __log_buf, __LOG_BUF_LEN);
 	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 
-	pr_info("log_buf_len: %d bytes\n", log_buf_len);
+	pr_info("log_buf_len: %d bytes\n", lcb->log_buf_len);
 	pr_info("early log buf free: %d(%d%%)\n",
 		free, (free * 100) / __LOG_BUF_LEN);
 }
@@ -1287,33 +1303,33 @@ static int syslog_print(char __user *buf, int size)
 		size_t skip;
 
 		raw_spin_lock_irq(&logbuf_lock);
-		if (syslog_seq < log_first_seq) {
+		if (lcb->syslog_seq < lcb->log_first_seq) {
 			/* messages are gone, move to first one */
-			syslog_seq = log_first_seq;
-			syslog_idx = log_first_idx;
-			syslog_prev = 0;
-			syslog_partial = 0;
+			lcb->syslog_seq = lcb->log_first_seq;
+			lcb->syslog_idx = lcb->log_first_idx;
+			lcb->syslog_prev = 0;
+			lcb->syslog_partial = 0;
 		}
-		if (syslog_seq == log_next_seq) {
+		if (lcb->syslog_seq == lcb->log_next_seq) {
 			raw_spin_unlock_irq(&logbuf_lock);
 			break;
 		}
 
-		skip = syslog_partial;
-		msg = log_from_idx(syslog_idx);
-		n = msg_print_text(msg, syslog_prev, true, text,
+		skip = lcb->syslog_partial;
+		msg = log_from_idx(lcb->syslog_idx);
+		n = msg_print_text(msg, lcb->syslog_prev, true, text,
 				   LOG_LINE_MAX + PREFIX_MAX);
-		if (n - syslog_partial <= size) {
+		if (n - lcb->syslog_partial <= size) {
 			/* message fits into buffer, move forward */
-			syslog_idx = log_next(syslog_idx);
-			syslog_seq++;
-			syslog_prev = msg->flags;
-			n -= syslog_partial;
-			syslog_partial = 0;
+			lcb->syslog_idx = log_next(lcb->syslog_idx);
+			lcb->syslog_seq++;
+			lcb->syslog_prev = msg->flags;
+			n -= lcb->syslog_partial;
+			lcb->syslog_partial = 0;
 		} else if (!len){
 			/* partial read(), remember position */
 			n = size;
-			syslog_partial += n;
+			lcb->syslog_partial += n;
 		} else
 			n = 0;
 		raw_spin_unlock_irq(&logbuf_lock);
@@ -1356,10 +1372,10 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 		 * Find first record that fits, including all following records,
 		 * into the user-provided buffer for this dump.
 		 */
-		seq = clear_seq;
-		idx = clear_idx;
+		seq = lcb->clear_seq;
+		idx = lcb->clear_idx;
 		prev = 0;
-		while (seq < log_next_seq) {
+		while (seq < lcb->log_next_seq) {
 			struct printk_log *msg = log_from_idx(idx);
 
 			len += msg_print_text(msg, prev, true, NULL, 0);
@@ -1369,10 +1385,10 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 		}
 
 		/* move first record forward until length fits into the buffer */
-		seq = clear_seq;
-		idx = clear_idx;
+		seq = lcb->clear_seq;
+		idx = lcb->clear_idx;
 		prev = 0;
-		while (len > size && seq < log_next_seq) {
+		while (len > size && seq < lcb->log_next_seq) {
 			struct printk_log *msg = log_from_idx(idx);
 
 			len -= msg_print_text(msg, prev, true, NULL, 0);
@@ -1382,7 +1398,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 		}
 
 		/* last message fitting into this dump */
-		next_seq = log_next_seq;
+		next_seq = lcb->log_next_seq;
 
 		len = 0;
 		while (len >= 0 && seq < next_seq) {
@@ -1406,18 +1422,18 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 				len += textlen;
 			raw_spin_lock_irq(&logbuf_lock);
 
-			if (seq < log_first_seq) {
+			if (seq < lcb->log_first_seq) {
 				/* messages are gone, move to next one */
-				seq = log_first_seq;
-				idx = log_first_idx;
+				seq = lcb->log_first_seq;
+				idx = lcb->log_first_idx;
 				prev = 0;
 			}
 		}
 	}
 
 	if (clear) {
-		clear_seq = log_next_seq;
-		clear_idx = log_next_idx;
+		lcb->clear_seq = lcb->log_next_seq;
+		lcb->clear_idx = lcb->log_next_idx;
 	}
 	raw_spin_unlock_irq(&logbuf_lock);
 
@@ -1452,7 +1468,7 @@ int do_syslog(int type, char __user *buf, int len, int source)
 			goto out;
 		}
 		error = wait_event_interruptible(log_wait,
-						 syslog_seq != log_next_seq);
+				lcb->syslog_seq != lcb->log_next_seq);
 		if (error)
 			goto out;
 		error = syslog_print(buf, len);
@@ -1507,12 +1523,12 @@ int do_syslog(int type, char __user *buf, int len, int source)
 	/* Number of chars in the log buffer */
 	case SYSLOG_ACTION_SIZE_UNREAD:
 		raw_spin_lock_irq(&logbuf_lock);
-		if (syslog_seq < log_first_seq) {
+		if (lcb->syslog_seq < lcb->log_first_seq) {
 			/* messages are gone, move to first one */
-			syslog_seq = log_first_seq;
-			syslog_idx = log_first_idx;
-			syslog_prev = 0;
-			syslog_partial = 0;
+			lcb->syslog_seq = lcb->log_first_seq;
+			lcb->syslog_idx = lcb->log_first_idx;
+			lcb->syslog_prev = 0;
+			lcb->syslog_partial = 0;
 		}
 		if (source == SYSLOG_FROM_PROC) {
 			/*
@@ -1520,14 +1536,14 @@ int do_syslog(int type, char __user *buf, int len, int source)
 			 * for pending data, not the size; return the count of
 			 * records, not the length.
 			 */
-			error = log_next_seq - syslog_seq;
+			error = lcb->log_next_idx - lcb->syslog_idx;
 		} else {
-			u64 seq = syslog_seq;
-			u32 idx = syslog_idx;
-			enum log_flags prev = syslog_prev;
+			u64 seq = lcb->syslog_seq;
+			u32 idx = lcb->syslog_idx;
+			enum log_flags prev = lcb->syslog_prev;
 
 			error = 0;
-			while (seq < log_next_seq) {
+			while (seq < lcb->log_next_seq) {
 				struct printk_log *msg = log_from_idx(idx);
 
 				error += msg_print_text(msg, prev, true, NULL, 0);
@@ -1535,13 +1551,13 @@ int do_syslog(int type, char __user *buf, int len, int source)
 				seq++;
 				prev = msg->flags;
 			}
-			error -= syslog_partial;
+			error -= lcb->syslog_partial;
 		}
 		raw_spin_unlock_irq(&logbuf_lock);
 		break;
 	/* Size of the log buffer */
 	case SYSLOG_ACTION_SIZE_BUFFER:
-		error = log_buf_len;
+		error = lcb->log_buf_len;
 		break;
 	default:
 		error = -EINVAL;
@@ -1558,7 +1574,7 @@ SYSCALL_DEFINE3(syslog, int, type, char __user *, buf, int, len)
 
 /*
  * Call the console drivers, asking them to write out
- * log_buf[start] to log_buf[end - 1].
+ * lcb->log_buf[start] to lcb->log_buf[end - 1].
  * The console_lock must be held.
  */
 static void call_console_drivers(int level,
@@ -1710,7 +1726,7 @@ static size_t cont_print_text(char *text, size_t size)
 	size_t textlen = 0;
 	size_t len;
 
-	if (cont.cons == 0 && (console_prev & LOG_NEWLINE)) {
+	if (cont.cons == 0 && (lcb->console_prev & LOG_NEWLINE)) {
 		textlen += print_time(cont.ts_nsec, text);
 		size -= textlen;
 	}
@@ -1985,15 +2001,9 @@ EXPORT_SYMBOL(printk);
 #define LOG_LINE_MAX		0
 #define PREFIX_MAX		0
 
-static u64 syslog_seq;
-static u32 syslog_idx;
-static u64 console_seq;
-static u32 console_idx;
-static enum log_flags syslog_prev;
-static u64 log_first_seq;
-static u32 log_first_idx;
-static u64 log_next_seq;
-static enum log_flags console_prev;
+static struct lcb_t __lcb;
+static struct lcb_t *lcb = &__lcb;
+
 static struct cont {
 	size_t len;
 	size_t cons;
@@ -2308,7 +2318,7 @@ static void console_cont_flush(char *text, size_t size)
 	 * busy. The earlier ones need to be printed before this one, we
 	 * did not flush any fragment so far, so just let it queue up.
 	 */
-	if (console_seq < log_next_seq && !cont.cons)
+	if (lcb->console_seq < lcb->log_next_seq && !cont.cons)
 		goto out;
 
 	len = cont_print_text(text, size);
@@ -2385,27 +2395,28 @@ again:
 		int level;
 
 		raw_spin_lock_irqsave(&logbuf_lock, flags);
-		if (seen_seq != log_next_seq) {
+		if (seen_seq != lcb->log_next_seq) {
 			wake_klogd = true;
-			seen_seq = log_next_seq;
+			seen_seq = lcb->log_next_seq;
 		}
 
-		if (console_seq < log_first_seq) {
+		if (lcb->console_seq < lcb->log_first_seq) {
 			len = sprintf(text, "** %u printk messages dropped ** ",
-				      (unsigned)(log_first_seq - console_seq));
+				      (unsigned int)(lcb->log_first_seq -
+						 lcb->console_seq));
 
 			/* messages are gone, move to first one */
-			console_seq = log_first_seq;
-			console_idx = log_first_idx;
-			console_prev = 0;
+			lcb->console_seq = lcb->log_first_seq;
+			lcb->console_idx = lcb->log_first_idx;
+			lcb->console_prev = 0;
 		} else {
 			len = 0;
 		}
 skip:
-		if (console_seq == log_next_seq)
+		if (lcb->console_seq == lcb->log_next_seq)
 			break;
 
-		msg = log_from_idx(console_idx);
+		msg = log_from_idx(lcb->console_idx);
 		level = msg->level;
 		if ((msg->flags & LOG_NOCONS) ||
 				suppress_message_printing(level)) {
@@ -2414,32 +2425,33 @@ skip:
 			 * directly to the console when we received it, and
 			 * record that has level above the console loglevel.
 			 */
-			console_idx = log_next(console_idx);
-			console_seq++;
+			lcb->console_idx = log_next(lcb->console_idx);
+			lcb->console_seq++;
 			/*
 			 * We will get here again when we register a new
 			 * CON_PRINTBUFFER console. Clear the flag so we
 			 * will properly dump everything later.
 			 */
 			msg->flags &= ~LOG_NOCONS;
-			console_prev = msg->flags;
+			lcb->console_prev = msg->flags;
 			goto skip;
 		}
 
-		len += msg_print_text(msg, console_prev, false,
+		len += msg_print_text(msg, lcb->console_prev, false,
 				      text + len, sizeof(text) - len);
 		if (nr_ext_console_drivers) {
 			ext_len = msg_print_ext_header(ext_text,
 						sizeof(ext_text),
-						msg, console_seq, console_prev);
+						msg, lcb->console_seq,
+						lcb->console_prev);
 			ext_len += msg_print_ext_body(ext_text + ext_len,
 						sizeof(ext_text) - ext_len,
 						log_dict(msg), msg->dict_len,
 						log_text(msg), msg->text_len);
 		}
-		console_idx = log_next(console_idx);
-		console_seq++;
-		console_prev = msg->flags;
+		lcb->console_idx = log_next(lcb->console_idx);
+		lcb->console_seq++;
+		lcb->console_prev = msg->flags;
 		raw_spin_unlock(&logbuf_lock);
 
 		stop_critical_timings();	/* don't trace print latency */
@@ -2467,7 +2479,7 @@ skip:
 	 * flush, no worries.
 	 */
 	raw_spin_lock(&logbuf_lock);
-	retry = console_seq != log_next_seq;
+	retry = lcb->console_seq != lcb->log_next_seq;
 	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 
 	if (retry && console_trylock())
@@ -2732,9 +2744,9 @@ void register_console(struct console *newcon)
 		 * for us.
 		 */
 		raw_spin_lock_irqsave(&logbuf_lock, flags);
-		console_seq = syslog_seq;
-		console_idx = syslog_idx;
-		console_prev = syslog_prev;
+		lcb->console_seq = lcb->syslog_seq;
+		lcb->console_idx = lcb->syslog_idx;
+		lcb->console_prev = lcb->syslog_prev;
 		raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 		/*
 		 * We're about to replay the log buffer.  Only do this to the
@@ -3029,10 +3041,10 @@ void kmsg_dump(enum kmsg_dump_reason reason)
 		dumper->active = true;
 
 		raw_spin_lock_irqsave(&logbuf_lock, flags);
-		dumper->cur_seq = clear_seq;
-		dumper->cur_idx = clear_idx;
-		dumper->next_seq = log_next_seq;
-		dumper->next_idx = log_next_idx;
+		dumper->cur_seq = lcb->clear_seq;
+		dumper->cur_idx = lcb->clear_idx;
+		dumper->next_seq = lcb->log_next_seq;
+		dumper->next_idx = lcb->log_next_idx;
 		raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 
 		/* invoke dumper which will iterate over records */
@@ -3073,14 +3085,14 @@ bool kmsg_dump_get_line_nolock(struct kmsg_dumper *dumper, bool syslog,
 	if (!dumper->active)
 		goto out;
 
-	if (dumper->cur_seq < log_first_seq) {
+	if (dumper->cur_seq < lcb->log_first_seq) {
 		/* messages are gone, move to first available one */
-		dumper->cur_seq = log_first_seq;
-		dumper->cur_idx = log_first_idx;
+		dumper->cur_seq = lcb->log_first_seq;
+		dumper->cur_idx = lcb->log_first_idx;
 	}
 
 	/* last entry */
-	if (dumper->cur_seq >= log_next_seq)
+	if (dumper->cur_seq >= lcb->log_next_seq)
 		goto out;
 
 	msg = log_from_idx(dumper->cur_idx);
@@ -3161,10 +3173,10 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 		goto out;
 
 	raw_spin_lock_irqsave(&logbuf_lock, flags);
-	if (dumper->cur_seq < log_first_seq) {
+	if (dumper->cur_seq < lcb->log_first_seq) {
 		/* messages are gone, move to first available one */
-		dumper->cur_seq = log_first_seq;
-		dumper->cur_idx = log_first_idx;
+		dumper->cur_seq = lcb->log_first_seq;
+		dumper->cur_idx = lcb->log_first_idx;
 	}
 
 	/* last entry */
@@ -3236,10 +3248,10 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer);
  */
 void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper)
 {
-	dumper->cur_seq = clear_seq;
-	dumper->cur_idx = clear_idx;
-	dumper->next_seq = log_next_seq;
-	dumper->next_idx = log_next_idx;
+	dumper->cur_seq = lcb->clear_seq;
+	dumper->cur_idx = lcb->clear_idx;
+	dumper->next_seq = lcb->log_next_seq;
+	dumper->next_idx = lcb->log_next_idx;
 }
 
 /**
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC PATCH v2 2/2] printk: external log buffer (CONFIG_LOGBUFFER)
  2016-10-04 21:08 ` [RFC PATCH v2 " Sean Hudson
  2016-10-04 21:08   ` [RFC PATCH v2 1/2] printk: collect offsets into replaceable structure Sean Hudson
@ 2016-10-04 21:08   ` Sean Hudson
  1 sibling, 0 replies; 16+ messages in thread
From: Sean Hudson @ 2016-10-04 21:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sean Hudson, Andrew Morton, Thomas Gleixner, Yang Shi,
	Ingo Molnar, Prarit Bhargava, Rasmus Villemoes, Paul E. McKenney,
	Yaowei Bai, Kees Cook, Andrey Ryabinin, Johannes Weiner,
	Tejun Heo, Arnd Bergmann, Ard Biesheuvel, Thomas Garnier,
	Petr Mladek, Nicolas Pitre, Sergey Senozhatsky, Borislav Petkov,
	Tim Bird, Steven Rostedt, Grant Likely

This debug feature provides a convenient way to collect log entries across
multiple, warmboot cycles and to share those entries with a boot loader.
It allows the kernel to use an external buffer for kernel log messages and
is controlled by an optional command line parameter. The buffer can contain
existing log messages from previous boot cycles and/or the bootloader. The
command line parameter was chosen for flexibility, cross arch portability,
and the ability to dynamically enable/disable this feature. The parameter
specifies the address of a control block used to replace the default log
buffer.  Existing bootloader and kernel log messages are kept, in order,
inside the new buffer.  After a boot that preserves the buffer contents, a
bootloader can display both kernel and bootloader log entries from
multiple, previous boots. It also allows the kernel to display bootloader
log entries along with its own messages.

This feature is intended for debug purposes and has no effect unless the
command line parameter is specified.  Further, it validates the passed
control block carefully and if any checks fail, it falls back to the
default behaviour.  As such, it can be left enabled by default.

Memory Reservation
------------------
This feature expects the bootloader to reserve/preserve the shared buffer
memory. This reservation needs to prevent the kernel from overwriting the
external log control block and log entries. In my testing, I've used the
'fdt' commands in uboot to dynamically inject reserved memory regions via
the DT to the kernel.

Based on the initial work of Wolfgang Denk and Igor Lisitsin [1].
Also based on work by: Alexander Streit <Alexander.Streit@continental-corporation.com>

[1] http://git.denx.de/?p=linux-2.6-denx.git;a=commitdiff;h=212f61c7fd3b952a81d1459dd32a86a32ddfd4ce

Signed-off-by: Sean Hudson <sean_hudson@mentor.com>
---
 init/Kconfig           |  12 +++
 init/main.c            |   7 ++
 kernel/printk/printk.c | 263 ++++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 277 insertions(+), 5 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index cac3f09..746183b 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1493,6 +1493,18 @@ config PRINTK
 	  very difficult to diagnose system problems, saying N here is
 	  strongly discouraged.
 
+config LOGBUFFER
+	bool "External logbuffer" if PRINTK
+	default n
+	depends on PRINTK
+	help
+	  This option enables support for an alternative, "external" printk log
+	  buffer. If memory contents are preserved, e.g. after a warmboot, this
+	  provides a known location for the boot loader to read and display printk
+	  entries from the kernel.  If desired, the bootloader can write its own
+	  log entries which the kernel will display with its own log entries.
+	  Further, this capability can be used across multiple warmboot cycles.
+
 config PRINTK_NMI
 	def_bool y
 	depends on PRINTK
diff --git a/init/main.c b/init/main.c
index a8a58e2..a5a54b3 100644
--- a/init/main.c
+++ b/init/main.c
@@ -94,6 +94,12 @@ extern void init_IRQ(void);
 extern void fork_init(void);
 extern void radix_tree_init(void);
 
+#ifdef CONFIG_LOGBUFFER
+extern void setup_ext_logbuff(void);
+#else
+void setup_ext_logbuff(void) {}
+#endif
+
 /*
  * Debug helper: via this flag we know that we are in 'early bootup code'
  * where only the boot processor is running with IRQ disabled.  This means
@@ -535,6 +541,7 @@ asmlinkage __visible void __init start_kernel(void)
 	sort_main_extable();
 	trap_init();
 	mm_init();
+	setup_ext_logbuff();
 
 	/*
 	 * Set up the scheduler prior starting any interrupts (such as the
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 063f01d..5cf147e 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -331,17 +331,17 @@ enum log_flags {
 };
 
 struct printk_log {
-	u64 ts_nsec;		/* timestamp in nanoseconds */
+#ifdef CONFIG_LOGBUFFER
+	u32 log_magic;		/* sanity check number */
+#endif
 	u16 len;		/* length of entire record */
 	u16 text_len;		/* length of text buffer */
 	u16 dict_len;		/* length of dictionary buffer */
 	u8 facility;		/* syslog facility */
 	u8 flags:5;		/* internal record flags */
 	u8 level:3;		/* syslog level */
+	u64 ts_nsec;		/* timestamp in nanoseconds */
 }
-#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
-__packed __aligned(4)
-#endif
 ;
 
 /*
@@ -352,7 +352,14 @@ __packed __aligned(4)
 DEFINE_RAW_SPINLOCK(logbuf_lock);
 
 /*
- * This control block collects tracking offsets for the log.
+ * This control block collects tracking offsets for the log into a single
+ * place.  It also facilitates pointing the log to another location, and,
+ * when combined with the CONFIG_LOGBUFFER feature, it allows log sharing
+ * between the bootloader and the kernel.
+ *
+ * NOTE:
+ *   By convention, the control block and the log buffer are contiguous.
+ *   This requirement can be relaxed, if desired.
  */
 struct lcb_t {
 	/* Pointer to log buffer space and length of space */
@@ -381,6 +388,15 @@ struct lcb_t {
 	/* the next printk record to read after the last 'clear' command */
 	u64 clear_seq;
 	u32 clear_idx;
+
+#ifdef CONFIG_LOGBUFFER
+	u32 log_version;
+	u32 lcb_padded_len;
+	u32 lcb_size;
+	u32 log_hdr_size;
+	phys_addr_t log_phys_addr;
+	u32 lcb_magic;
+#endif
 };
 
 #ifdef CONFIG_PRINTK
@@ -397,8 +413,26 @@ DECLARE_WAIT_QUEUE_HEAD(log_wait);
 #define __LOG_BUF_LEN (1 << CONFIG_LOG_BUF_SHIFT)
 static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
 
+/* Log control block */
+#ifdef CONFIG_LOGBUFFER
+
+/*
+ * The bootloader may write an unexpected version/format of the log, so
+ * these values provide a way to check that the format agrees.
+ */
+#define LOGBUFF_MAGIC			0xc0de4ced
+#define LOGBUFF_LOG_VERSION		3
+#define LOGBUFF_CB_PADDED_LENGTH	1024
+
+static struct lcb_t __lcb = {
+	__log_buf, __LOG_BUF_LEN, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+	LOGBUFF_LOG_VERSION, LOGBUFF_CB_PADDED_LENGTH, sizeof(struct lcb_t),
+	sizeof(struct printk_log), 0, LOGBUFF_MAGIC
+};
+#else
 static struct lcb_t __lcb = {
 	__log_buf, __LOG_BUF_LEN, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
+#endif /* CONFIG_LOGBUFFER */
 
 static struct lcb_t *lcb = &__lcb;
 
@@ -596,6 +630,9 @@ static int log_store(int facility, int level,
 		msg->ts_nsec = local_clock();
 	memset(log_dict(msg) + dict_len, 0, pad_len);
 	msg->len = size;
+#ifdef CONFIG_LOGBUFFER
+	msg->log_magic = LOGBUFF_MAGIC;
+#endif
 
 	/* insert message */
 	lcb->log_next_idx += msg->len;
@@ -1070,6 +1107,21 @@ static void __init log_buf_add_cpu(void)
 static inline void log_buf_add_cpu(void) {}
 #endif /* CONFIG_SMP */
 
+#ifdef CONFIG_LOGBUFFER
+/* Bootloader log control block address passed in on the kernel cmdline */
+phys_addr_t lcb_phys_addr __initdata;
+
+/* save requested lcb_phys_addr since it's too early to process it */
+static int __init lcb_phys_addr_setup(char *str)
+{
+	lcb_phys_addr = memparse(str, &str);
+
+	return 0;
+}
+
+early_param("lcb_phys_addr", lcb_phys_addr_setup);
+#endif
+
 void __init setup_log_buf(int early)
 {
 	unsigned long flags;
@@ -3331,4 +3383,205 @@ void show_regs_print_info(const char *log_lvl)
 	       log_lvl, current, task_stack_page(current));
 }
 
+#ifdef CONFIG_LOGBUFFER
+
+static struct printk_log *log_from_idx_and_base(const char *const base,
+						const u32 idx)
+{
+	struct printk_log *msg = (struct printk_log *)(base + idx);
+
+	if (!msg->len || msg->log_magic != LOGBUFF_MAGIC)
+		return (struct printk_log *)base;
+	return msg;
+}
+
+static u32 log_next_idx_from_idx(const char *const base, const u32 idx)
+{
+	struct printk_log const *const msg = (struct printk_log *)(base + idx);
+
+	if (!msg->len || msg->log_magic != LOGBUFF_MAGIC)
+		return 0;
+	return idx + msg->len;
+}
+
+/*
+ * The bootloader may write a different version of the log than expected.
+ * So, check that the incoming lcb matches that values that the kernel
+ * expects.
+ */
+static bool is_valid_external_cb(const struct lcb_t * const new_lcb)
+{
+	/* Validate the new control buffer */
+	return (new_lcb &&
+		new_lcb->log_version == LOGBUFF_LOG_VERSION &&
+		new_lcb->lcb_magic == LOGBUFF_MAGIC &&
+		new_lcb->lcb_padded_len == LOGBUFF_CB_PADDED_LENGTH &&
+		new_lcb->lcb_size == sizeof(struct lcb_t) &&
+		new_lcb->log_hdr_size == sizeof(struct printk_log) &&
+		new_lcb->lcb_size <= new_lcb->lcb_padded_len &&
+		new_lcb->log_first_idx <= new_lcb->log_buf_len &&
+		new_lcb->log_next_idx <= new_lcb->log_buf_len &&
+		new_lcb->syslog_idx <= new_lcb->log_buf_len &&
+		new_lcb->console_idx <= new_lcb->log_buf_len &&
+		new_lcb->clear_idx <= new_lcb->log_buf_len &&
+		new_lcb->log_phys_addr != 0);
+}
+
+/*
+ * Copy the kernel log entries that have occurred so far in this boot cycle
+ * into the new log buffer location found in the log control block specified
+ * on the kernel command line. Then switch to using that location from this
+ * point onward.
+ * To ensure correctness, we must consider :
+ *  - the bootloader may be using a different structure
+ *  - size of the external buffer may not match the size of the default
+ *  - an unknown number of bootloader log entries may be present
+ *  - an unknown number of kernel log entries may be present
+ *  - wrapping of the bootloader log may have already occurred (unlikely)
+ *  - wrapping of the kernel log may have already occurred (unlikely)
+ *
+ * Memory Reservation
+ * ------------------
+ *  This feature expects the bootloader to reserve/preserve the shared buffer
+ *  memory. This reservation needs to prevent the kernel from overwriting the
+ *  external log control block and log entries. In my testing, I've used the
+ *  'fdt' commands in uboot to dynamically inject reserved memory regions via
+ *  the DT to the kernel.
+ *
+ */
+void __init setup_ext_logbuff(void)
+{
+	struct printk_log *nextblmsg, *currkmsg;
+	u64 currkmsg_seq;
+	u32 currkmsg_idx, offset_idx;
+	struct lcb_t *new_lcb;
+	unsigned long flags;
+	int ret;
+
+	if (!lcb_phys_addr)
+		return;
+
+	new_lcb = phys_to_virt(lcb_phys_addr);
+	new_lcb->log_buf = phys_to_virt(new_lcb->log_phys_addr);
+
+	/* Validate the new control buffer */
+	if (!is_valid_external_cb(new_lcb)) {
+		pr_warn("log : External log cb failed validation. Ignored\n");
+		pr_debug
+		    ("Kernel logbuffer of size %u at 0x%p (virtual address)\n",
+		     lcb->log_buf_len, lcb->log_buf);
+		return;
+	}
+
+	if (!new_log_buf_len)
+		pr_warn("log_buf_len ignored in favor of external log\n");
+
+	pr_debug
+	    ("log : Replacing cb @ 0x%p of %lu bytes with cb @ 0x%p of %d/%d bytes\n",
+	     lcb, sizeof(struct lcb_t), new_lcb, new_lcb->lcb_size,
+	     new_lcb->lcb_padded_len);
+
+	/* Capture the virtual address for the buffer */
+	new_lcb->log_buf = phys_to_virt(new_lcb->log_phys_addr);
+	pr_debug("log : Replacing buffer @ 0x%p (VA) of %d bytes with one @ 0x%p (VA) of %d bytes\n",
+		 lcb->log_buf, lcb->log_buf_len, new_lcb->log_buf,
+		 new_lcb->log_buf_len);
+
+	raw_spin_lock_irqsave(&logbuf_lock, flags);
+
+	/*
+	 * Offset sequence numbers by the number of existing bootloader msgs
+	 * NOTE: Functions above handle if the first seq becomes > than these
+	 */
+	if (lcb->syslog_seq != new_lcb->log_first_seq)
+		new_lcb->syslog_seq = lcb->syslog_seq + new_lcb->log_next_seq;
+	if (lcb->console_seq != new_lcb->log_first_seq)
+		new_lcb->console_seq = lcb->console_seq + new_lcb->log_next_seq;
+	if (lcb->clear_seq != new_lcb->log_first_seq)
+		new_lcb->clear_seq = lcb->clear_seq + new_lcb->log_next_seq;
+
+	/* Copy kernel specific data into the new/shared control block. */
+	new_lcb->syslog_prev = lcb->syslog_prev;
+	new_lcb->syslog_partial = lcb->syslog_partial;
+	new_lcb->console_prev = lcb->console_prev;
+
+	/*
+	 * Copy existing entries one at a time into the external buffer space.
+	 * The new CB will incrementally update as each kernel entry is added.
+	 * The existing CB will not be modified.
+	 */
+	for (currkmsg_seq = lcb->log_first_seq,
+	     currkmsg_idx = lcb->log_first_idx;
+	     currkmsg_seq <= lcb->log_next_seq &&
+	     currkmsg_idx < lcb->log_buf_len;
+	     currkmsg_idx = log_next(currkmsg_idx), currkmsg_seq++) {
+		currkmsg = log_from_idx_and_base(lcb->log_buf, currkmsg_idx);
+
+		/* If necessary, free bl entries to fit kernel entries */
+		while (new_lcb->log_first_seq < new_lcb->log_next_seq) {
+			u32 free;
+
+			if (new_lcb->log_next_idx > new_lcb->log_first_idx)
+				free = max(new_lcb->log_buf_len -
+					   new_lcb->log_next_idx,
+					   new_lcb->log_first_idx);
+			else
+				free = new_lcb->log_first_idx -
+				    new_lcb->log_next_idx;
+
+			if (free > currkmsg->len + sizeof(struct printk_log))
+				break;
+
+			/* Drop oldest messages until there is enough space */
+			new_lcb->log_first_idx =
+			    log_next_idx_from_idx(new_lcb->log_buf,
+						  new_lcb->log_first_idx);
+			new_lcb->log_first_seq++;
+		}
+
+		nextblmsg = (struct printk_log *)
+		    (new_lcb->log_buf + new_lcb->log_next_idx);
+
+		/* Check if the next record will fit along with a wrap record */
+		if (new_lcb->log_next_idx + currkmsg->len +
+		    sizeof(struct printk_log) >= new_lcb->log_buf_len) {
+			memset(nextblmsg, 0, sizeof(struct printk_log));
+			nextblmsg = (struct printk_log *)new_lcb->log_buf;
+			new_lcb->log_next_idx = 0;
+		}
+		/* Copy kernel message to bootloader buffer */
+		memcpy(nextblmsg, currkmsg, currkmsg->len);
+
+		/* Before moving to the next kernel entry, adjust indices */
+		if (currkmsg_seq == lcb->log_first_seq)
+			offset_idx = new_lcb->log_first_seq;
+		else
+			offset_idx = new_lcb->log_next_idx;
+
+		if (currkmsg_seq == lcb->syslog_seq)
+			new_lcb->syslog_idx = offset_idx;
+		if (currkmsg_seq == lcb->console_seq)
+			new_lcb->console_idx = offset_idx;
+		if (currkmsg_seq == lcb->clear_seq)
+			new_lcb->clear_idx = offset_idx;
+
+		/* Update next offset and sequence */
+		if (currkmsg_seq < lcb->log_next_seq) {
+			new_lcb->log_next_idx += currkmsg->len;
+			new_lcb->log_next_seq++;
+		}
+	}
+
+	/* Overwrite the default lcb pointer, buffer pointer, and length. */
+	lcb = new_lcb;
+	lcb->log_buf = (char *)new_lcb->log_buf;
+	lcb->log_buf_len = new_lcb->log_buf_len;
+
+	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+
+	pr_info("Kernel logbuffer of size %u at 0x%p (virtual address)\n",
+		lcb->log_buf_len, lcb->log_buf);
+}
+#endif
+
 #endif
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v1 2/2] printk: external log buffer (CONFIG_LOGBUFFER)
  2016-10-04 17:55         ` Sean Hudson
@ 2016-10-05 12:48           ` Petr Mladek
  2016-10-12  1:12             ` Sean Hudson
  0 siblings, 1 reply; 16+ messages in thread
From: Petr Mladek @ 2016-10-05 12:48 UTC (permalink / raw)
  To: Sean Hudson
  Cc: linux-kernel, Andrew Morton, Thomas Gleixner, Yang Shi,
	Ingo Molnar, Prarit Bhargava, Rasmus Villemoes, Paul E. McKenney,
	Yaowei Bai, Kees Cook, Andrey Ryabinin, Johannes Weiner,
	Tejun Heo, Arnd Bergmann, Ard Biesheuvel, Thomas Garnier,
	Nicolas Pitre, Sergey Senozhatsky, Borislav Petkov, Tim Bird,
	Steven Rostedt, Grant Likely

On Tue 2016-10-04 12:55:35, Sean Hudson wrote:
> On 10/4/2016 6:27 AM, Petr Mladek wrote:
> > On Fri 2016-09-30 23:06:49, Sean Hudson wrote:
> >> On 9/30/2016 7:39 AM, Petr Mladek wrote:
> >>>
> >>> Is the bootloader allowed to write to the shared buffer from this
> >>> point on? How will be synchronized the writes from the kernel and
> >>> the bootloader?
> >>
> >> I am a bit confused by this question.  I assume that the kernel and
> >> bootloader will not be accessing the same log space simultaneously.
> >> While each one is running, they will utilize the same data structures to
> >> keep track of current entries.
> > 
> > They both will write new messages at the end of the buffer. Therefore
> > they both might want to write to exactly the same location. Also if
> > the buffer is full, the oldest messages are overwritten. Therefore
> > even reading must be synchronized against writes.
> > 
> > These operations are synchronized using logbuf_lock on the kernel side.
> > I do not know much about bootloaders but I guess that the bootloader
> > will not be able to use the logbuf_lock. So the question is how
> > the bootloader and kernel will synchronize the access to the shared
> > buffer.
> > 
> > Or did I miss anything?
> 
> The state of the log is preserved in the log control block, or lcb,
> which is passed when the boot loader passes control to the kernel.  In
> order to make the transition as seamless as possible, the boot loader
> appends entries in the same way that the kernel does and overwrites the
> oldest entries once the ring buffer is full.  It also updates tracking
> indices in the lcb.  When the boot loader passes control to the kernel,
> the kernel adopts the lcb and appends entries using that structure.  In
> this way, the lcb provides continuity between the boot loader and the
> kernel for the state of the buffer.

I see. The bootloader and the kernel never runs in parallel. So they
do not need to be synchronized using a lock or so. Huh, I do not
know what I have imagined.


> Note, there is a small window of time during early boot in which the
> kernel puts entries into the default location, a staticlly declared
> char[].  These entries are appended to the external location once
> setup_ext_logbuff() is called from init/main.c.  (I'd love to close that
> gap entirely, but all solutions I've tried involve arch specific code,
> which I'm trying to avoid.)

IMHO, a copying of few messages is fine. Well, we need to make sure
that they are not printed twice on the console.


> >>> How protected will the address of the shared buffer? Would not
> >>> it open a security hole? We need to make sure that non-privileged
> >>> user or non-trusted application must not read sensitive data
> >>> from the log or break the structure of the log by writing
> >>> invalid data.
> >>
> >> There is no notion of protection of the address.  However, additional
> >> checking could be added. Currently, there is potential for corrupt log
> >> entries from the bootloader causing the kernel to crash.
> >> I can't think of a way to prevent that from happening.
> > 
> > IMHO, the only way is to revisit all locations when the log buffer
> > is accessed and add all the needed consistency checks. We must make
> > sure that we do not access outside of the buffer. Also the log_*_seq
> > numbers are supposed to grow linearly...
> 
> Adding an explicit walk of the external log entries should be trivial.
> That will only validate that the passed in log entries are consistent
> internally.  I will be submitting a v2 patch shortly that fixes an error
> that kernel-ci caught. (if CONFIG_PRINTK is disabled, compilation fails)
> I'll see about adding that check at the same time.

The question is if a one time walk might find all potential problems.
We might want to add checks into the existing code. For example,
log_from_idx() is rather sensitive but it does not check if
all the values are sane.


> >>  Also, as you point out, a bootloader could read log contents from
> >> the shared region. Of course, in order for either to happen,
> >> the bootloader would already be compromised and I'm not sure that
> >>  reading log entries or crashing the kernel is such a big consideration.
> > 
> > The log might contain addresses of some kernel structures, for example
> > in some error messages. For this reason, only a privileged users are
> > able to read it. We must make sure that they will not get accessible
> > for a non-privileged user, for example via an address to the shared
> > buffer in the bootloader config.
> 
> Ahhh, if I am understanding correctly, your concern is that an
> unprivileged program in user space would be able to gain access directly
> to the log entries via the external buffer location.  I haven't tested
> that scenario, so I'm not sure if the memory reservation via the DT
> protects that memory from user space.  I'll have to check on that.  In
> the meantime, if you know of standard ways to protect memory from user
> space access, please let me know.

I am not expert. I think that the main trick on the memory
management level is that the userspace and kernel have different
address spaces. IMHO, the pages that are used by the kernel log buffer
cannot be addressed from userspace. In fact, each user have only very
limited number of pages that are mapped to his address space.

I do not know how this works with the persistent memory.

In each case, we need to make sure that the address of the
shared buffer is not readable in the booloaded config file
for a normal user.


> > BTW: You did not answered the question about how the bootloader would
> > know the right version of the log format. I am afraid that we do not
> > want to maintain backward compatibility on the kernel side. The printk
> > code already is too complex.
> 
> This is a matter of magic numbers.  Uboot has two preexisting log
> structure versions. Since the kernel did not have a version, I used
> LOGBUFF_LOG_VERSION = 3, if CONFIG_LOGBUFFER is enabled.  So, if the
> boot loader passes the command line parameter 'lcb_phys_addr' and the
> contents of that address pass the validation checks found in
> is_valid_external_cb(), then the external buffer is used.  Those checks
> include matching the version number, structure sizes match, etc.
> Otherwise, the external buffer is ignored.  This should allow for the
> kernel to change in the future, e.g. v4, and continue to run correctly
> even if an older boot loader passes a v3 version of the log entries.

There might be a problem how to bootstrap this. Someone has to define
the initial log control block. If it is done by bootloader and it uses
a wrong version, the kernel will not know the size of the external
buffer and will ignore and never use it.

> > Also you did not answered the question about your plans. I wonder
> > which bootloader would be able to use such a feature.
> 
> Sorry about that, I missed the question in your previous reply.  I am
> preparing to submit a patch set to mainline uBoot that adds this capability.


I have discussed this with colleagues. All this looked to complicated
to them.

This patch set actually implements two features:

1. Possibility to preserve the kernel log buffer between boots.

   But this is already supported by pstore/ramoops.


2. Possibility to store bootloader messages in the kernel
   log buffer.

   The question is who wants it and if it is worth the effort.

   The bootloader will need to implement log storing in the
   non-trivial format. It will need to maintain compatibility
   of this format in the future.

   I guess that the bootloader is running on a single CPU
   and is basically one process. Therefore it does not need
   to solve any races. Therefore it does not need such a complex
   format.

   It might be more effective to write just a simple
   text into the persistent memory and provide a userspace tool
   to dump it.


Best Regards,
Petr

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v1 2/2] printk: external log buffer (CONFIG_LOGBUFFER)
  2016-10-05 12:48           ` Petr Mladek
@ 2016-10-12  1:12             ` Sean Hudson
  2016-10-14  9:52               ` Petr Mladek
  0 siblings, 1 reply; 16+ messages in thread
From: Sean Hudson @ 2016-10-12  1:12 UTC (permalink / raw)
  To: Petr Mladek
  Cc: linux-kernel, Andrew Morton, Thomas Gleixner, Yang Shi,
	Ingo Molnar, Prarit Bhargava, Rasmus Villemoes, Paul E. McKenney,
	Yaowei Bai, Kees Cook, Andrey Ryabinin, Johannes Weiner,
	Tejun Heo, Arnd Bergmann, Ard Biesheuvel, Thomas Garnier,
	Nicolas Pitre, Sergey Senozhatsky, Borislav Petkov, Tim Bird,
	Steven Rostedt, Grant Likely

On 10/5/2016 2:48 PM, Petr Mladek wrote:
> On Tue 2016-10-04 12:55:35, Sean Hudson wrote:
>> On 10/4/2016 6:27 AM, Petr Mladek wrote:
>>> On Fri 2016-09-30 23:06:49, Sean Hudson wrote:
>>>> On 9/30/2016 7:39 AM, Petr Mladek wrote:
>>>>>

<snip>

>> Note, there is a small window of time during early boot in which the
>> kernel puts entries into the default location, a staticlly declared
>> char[].  These entries are appended to the external location once
>> setup_ext_logbuff() is called from init/main.c.  (I'd love to close that
>> gap entirely, but all solutions I've tried involve arch specific code,
>> which I'm trying to avoid.)
> 
> IMHO, a copying of few messages is fine. Well, we need to make sure
> that they are not printed twice on the console.

I happen to agree since this is a debug feature.  It would be nice to
avoid it, but every strategy that I've considered makes the code less
portable.

>>>>> How protected will the address of the shared buffer? Would not
>>>>> it open a security hole? We need to make sure that non-privileged
>>>>> user or non-trusted application must not read sensitive data
>>>>> from the log or break the structure of the log by writing
>>>>> invalid data.
>>>>
>>>> There is no notion of protection of the address.  However, additional
>>>> checking could be added. Currently, there is potential for corrupt log
>>>> entries from the bootloader causing the kernel to crash.
>>>> I can't think of a way to prevent that from happening.
>>>
>>> IMHO, the only way is to revisit all locations when the log buffer
>>> is accessed and add all the needed consistency checks. We must make
>>> sure that we do not access outside of the buffer. Also the log_*_seq
>>> numbers are supposed to grow linearly...
>>
>> Adding an explicit walk of the external log entries should be trivial.
>> That will only validate that the passed in log entries are consistent
>> internally.  I will be submitting a v2 patch shortly that fixes an error
>> that kernel-ci caught. (if CONFIG_PRINTK is disabled, compilation fails)
>> I'll see about adding that check at the same time.
> 
> The question is if a one time walk might find all potential problems.
> We might want to add checks into the existing code. For example,
> log_from_idx() is rather sensitive but it does not check if
> all the values are sane.

Checking for every possible problem isn't practical and would adversely
impact the performance of the logging code.  In the case of the function
you point out, the patch series doesn't change the logic, rather it
replaces the structure being used.

>>>>  Also, as you point out, a bootloader could read log contents from
>>>> the shared region. Of course, in order for either to happen,
>>>> the bootloader would already be compromised and I'm not sure that
>>>>  reading log entries or crashing the kernel is such a big consideration.
>>>
>>> The log might contain addresses of some kernel structures, for example
>>> in some error messages. For this reason, only a privileged users are
>>> able to read it. We must make sure that they will not get accessible
>>> for a non-privileged user, for example via an address to the shared
>>> buffer in the bootloader config.
>>
>> Ahhh, if I am understanding correctly, your concern is that an
>> unprivileged program in user space would be able to gain access directly
>> to the log entries via the external buffer location.  I haven't tested
>> that scenario, so I'm not sure if the memory reservation via the DT
>> protects that memory from user space.  I'll have to check on that.  In
>> the meantime, if you know of standard ways to protect memory from user
>> space access, please let me know.
> 
> I am not expert. I think that the main trick on the memory
> management level is that the userspace and kernel have different
> address spaces. IMHO, the pages that are used by the kernel log buffer
> cannot be addressed from userspace. In fact, each user have only very
> limited number of pages that are mapped to his address space.
> 
> I do not know how this works with the persistent memory.
> 
> In each case, we need to make sure that the address of the
> shared buffer is not readable in the booloaded config file
> for a normal user.

I do not agree with that last point.  If you cannot trust the bootloader
then the ability of a userspace program to read log contents seems to be
the least of your problems.

>>> BTW: You did not answered the question about how the bootloader would
>>> know the right version of the log format. I am afraid that we do not
>>> want to maintain backward compatibility on the kernel side. The printk
>>> code already is too complex.
>>
>> This is a matter of magic numbers.  Uboot has two preexisting log
>> structure versions. Since the kernel did not have a version, I used
>> LOGBUFF_LOG_VERSION = 3, if CONFIG_LOGBUFFER is enabled.  So, if the
>> boot loader passes the command line parameter 'lcb_phys_addr' and the
>> contents of that address pass the validation checks found in
>> is_valid_external_cb(), then the external buffer is used.  Those checks
>> include matching the version number, structure sizes match, etc.
>> Otherwise, the external buffer is ignored.  This should allow for the
>> kernel to change in the future, e.g. v4, and continue to run correctly
>> even if an older boot loader passes a v3 version of the log entries.
> 
> There might be a problem how to bootstrap this. Someone has to define
> the initial log control block. If it is done by bootloader and it uses
> a wrong version, the kernel will not know the size of the external
> buffer and will ignore and never use it.

This is already defined in the code.  Using the wrong structure/version
will behave as expected, and be ignored.  That simply means that the
bootloader can't 'accidentally' use the wrong version and seems sane to
me.

> I have discussed this with colleagues. All this looked to complicated
> to them.

It seems like the complexity comes from the problem itself and I'm
adding very little in the solution that I'm presenting.  How would they
suggest reducing that complexity?

> 
> This patch set actually implements two features:
> 
> 1. Possibility to preserve the kernel log buffer between boots.
> 
>    But this is already supported by pstore/ramoops.

I have not looked into either feature extensively.  However, I believe
that both work slightly differently.  For instance, this feature affords
more flexibility about the buffer location and size to the kernel in
that it doesn't require the kernel to reserve a chunk of RAM ahead of
time.  However, it might be possible to integrate these features
together in some manner in the future.

> 
> 
> 2. Possibility to store bootloader messages in the kernel
>    log buffer.

Additionally, this feature allows for multiple boot cycles to co-exist.
It also allows the _bootloader_ to understand/display _kernel_ log entries.

> 
>    The question is who wants it and if it is worth the effort.

This feature existed before I picked it up in Wolfgang Denk's kernels.
It was broken by the changes to the internal log structures that came in
with 3.10.

> 
>    The bootloader will need to implement log storing in the
>    non-trivial format. It will need to maintain compatibility
>    of this format in the future.

yes, in order to match what the kernel uses, the bootloader will have to
use the same format.  FWIW, the kernel structure has been very stable
over time, with the last major change occurring in the 3.10 time frame.
Of course, any bootloader that wants to pass information to the kernel
in a common format will have to keep up to date.

> 
>    I guess that the bootloader is running on a single CPU
>    and is basically one process. Therefore it does not need
>    to solve any races. Therefore it does not need such a complex
>    format.

The log structures do not provide protection for multiple CPUs.
Additional fields in the log structure are only there to provide
compatibility with the kernel.

> 
>    It might be more effective to write just a simple
>    text into the persistent memory and provide a userspace tool
>    to dump it.

That would not provide any benefit if the kernel doesn't successfully
start up user space.

--
Sean

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH v1 2/2] printk: external log buffer (CONFIG_LOGBUFFER)
  2016-10-12  1:12             ` Sean Hudson
@ 2016-10-14  9:52               ` Petr Mladek
  0 siblings, 0 replies; 16+ messages in thread
From: Petr Mladek @ 2016-10-14  9:52 UTC (permalink / raw)
  To: Sean Hudson
  Cc: linux-kernel, Andrew Morton, Thomas Gleixner, Yang Shi,
	Ingo Molnar, Prarit Bhargava, Rasmus Villemoes, Paul E. McKenney,
	Yaowei Bai, Kees Cook, Andrey Ryabinin, Johannes Weiner,
	Tejun Heo, Arnd Bergmann, Ard Biesheuvel, Thomas Garnier,
	Nicolas Pitre, Sergey Senozhatsky, Borislav Petkov, Tim Bird,
	Steven Rostedt, Grant Likely

On Wed 2016-10-12 03:12:56, Sean Hudson wrote:
> On 10/5/2016 2:48 PM, Petr Mladek wrote:
> > On Tue 2016-10-04 12:55:35, Sean Hudson wrote:
> >> On 10/4/2016 6:27 AM, Petr Mladek wrote:
> >>> IMHO, the only way is to revisit all locations when the log buffer
> >>> is accessed and add all the needed consistency checks. We must make
> >>> sure that we do not access outside of the buffer. Also the log_*_seq
> >>> numbers are supposed to grow linearly...
> >>
> >> Adding an explicit walk of the external log entries should be trivial.
> >> That will only validate that the passed in log entries are consistent
> >> internally.  I will be submitting a v2 patch shortly that fixes an error
> >> that kernel-ci caught. (if CONFIG_PRINTK is disabled, compilation fails)
> >> I'll see about adding that check at the same time.
> > 
> > The question is if a one time walk might find all potential problems.
> > We might want to add checks into the existing code. For example,
> > log_from_idx() is rather sensitive but it does not check if
> > all the values are sane.
> 
> Checking for every possible problem isn't practical and would adversely
> impact the performance of the logging code.  In the case of the function
> you point out, the patch series doesn't change the logic, rather it
> replaces the structure being used.

I do not have strong opinion about it. In each case, printk() is not
a hot path. The question is if even the current code would not
profit from some consistency checks. I have sometimes broken
the consistency of the buffer when working on the printk() code.
And such checks might help to debug such mistakes.


> > In each case, we need to make sure that the address of the
> > shared buffer is not readable in the booloaded config file
> > for a normal user.
> 
> I do not agree with that last point.  If you cannot trust the bootloader
> then the ability of a userspace program to read log contents seems to be
> the least of your problems.

I see. As I said, I am neither security nor bootloader expert. I just
shared my paranoid uncertainties.


> >>> BTW: You did not answered the question about how the bootloader would
> >>> know the right version of the log format. I am afraid that we do not
> >>> want to maintain backward compatibility on the kernel side. The printk
> >>> code already is too complex.
> >>
> >> This is a matter of magic numbers.  Uboot has two preexisting log
> >> structure versions. Since the kernel did not have a version, I used
> >> LOGBUFF_LOG_VERSION = 3, if CONFIG_LOGBUFFER is enabled.  So, if the
> >> boot loader passes the command line parameter 'lcb_phys_addr' and the
> >> contents of that address pass the validation checks found in
> >> is_valid_external_cb(), then the external buffer is used.  Those checks
> >> include matching the version number, structure sizes match, etc.
> >> Otherwise, the external buffer is ignored.  This should allow for the
> >> kernel to change in the future, e.g. v4, and continue to run correctly
> >> even if an older boot loader passes a v3 version of the log entries.
> > 
> > There might be a problem how to bootstrap this. Someone has to define
> > the initial log control block. If it is done by bootloader and it uses
> > a wrong version, the kernel will not know the size of the external
> > buffer and will ignore and never use it.
> 
> This is already defined in the code.  Using the wrong structure/version
> will behave as expected, and be ignored.  That simply means that the
> bootloader can't 'accidentally' use the wrong version and seems sane to
> me.

I see the oposite in the code. The check makes sure that kernel will
ignore wrong version of the buffer. It suggests that kernel will never
use the buffer until bootloader initializes it to the right version.

It might make sense that kernel reinitialize the buffer to its own
version, so that the bootloader will use the right version next time.
But for this, the kernel must know the size of the buffer. So, it
might be a bad idea after all.


> > I have discussed this with colleagues. All this looked to complicated
> > to them.
> 
> It seems like the complexity comes from the problem itself and I'm
> adding very little in the solution that I'm presenting.  How would they
> suggest reducing that complexity?

They suggest to do not mix bootloaded and kernel messages.


> > This patch set actually implements two features:
> > 
> > 1. Possibility to preserve the kernel log buffer between boots.
> > 
> >    But this is already supported by pstore/ramoops.
> 
> I have not looked into either feature extensively.  However, I believe
> that both work slightly differently.  For instance, this feature affords
> more flexibility about the buffer location and size to the kernel in
> that it doesn't require the kernel to reserve a chunk of RAM ahead of
> time.  However, it might be possible to integrate these features
> together in some manner in the future.

It seems that pstore has many capabilities. It allows to store
various types of messages, e.g. only oops/panic messages,
or all messages, or ftrace logs.

The intergration with printk() is transparent. One way is to
register itself as a console and get messages via console
flush. Another way is to flush all messages via the
kmsg API. It means copying messages but it does not
need to touch the printk() code.


> > 2. Possibility to store bootloader messages in the kernel
> >    log buffer.
> 
> Additionally, this feature allows for multiple boot cycles to co-exist.

It might be useful but it also could cause confusion. One might
get easily lost in all the logs.


> It also allows the _bootloader_ to understand/display _kernel_ log entries.

I do not have much experience with debugging these early boot
problems. So, I am not sure how the would be useful and what
are the other possibilities. 

Note that the early kernel messages do not appear in
the shared buffer until setup_ext_logbuff() is called.
The early serial console provides a better job here.


> >    The question is who wants it and if it is worth the effort.
> 
> This feature existed before I picked it up in Wolfgang Denk's kernels.
> It was broken by the changes to the internal log structures that came in
> with 3.10.

Well, this is not much encouraging. It took 3 years to update it
for a new kernel API.

> > 
> >    The bootloader will need to implement log storing in the
> >    non-trivial format. It will need to maintain compatibility
> >    of this format in the future.
> 
> yes, in order to match what the kernel uses, the bootloader will have to
> use the same format.  FWIW, the kernel structure has been very stable
> over time, with the last major change occurring in the 3.10 time frame.
> Of course, any bootloader that wants to pass information to the kernel
> in a common format will have to keep up to date.

Well, the API is hard to touch because 3rd party tools depend on it.
Adding yet another dependecy will make cleaning/fixing printk() code
even more complicated.


> >    I guess that the bootloader is running on a single CPU
> >    and is basically one process. Therefore it does not need
> >    to solve any races. Therefore it does not need such a complex
> >    format.
> 
> The log structures do not provide protection for multiple CPUs.
> Additional fields in the log structure are only there to provide
> compatibility with the kernel.

I wanted to say that a plain text would be enough for
the bootloader log.

kernel stored plain text into the logbuffer before 3.10 and
many things were much easier. The commit
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7ff9554bb578ba02166071d2d487b7fc7d860d62
converted it to the variable-length record buffer. This change
caused many problems and complications. Many of them are not
solved until today.

I am not sure that you really want to repeat the "mistake"
and add such complexity to the bootloader.

> >    It might be more effective to write just a simple
> >    text into the persistent memory and provide a userspace tool
> >    to dump it.
> 
> That would not provide any benefit if the kernel doesn't successfully
> start up user space.

IMHO, people mostly use serial console to debug these things.
Your feature will help only if the kernel reaches
setup_ext_logbuff() and the shared buffer is initialized
the right way.

In each case, I do not feel having enough experience to decide
about this feature. It would be great to hear from others.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2016-10-14  9:52 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-30  0:55 [RFC PATCH v1 0/2] printk: Shared kernel logging Sean Hudson
2016-09-30  0:55 ` [RFC PATCH v1 1/2] printk: collect offsets into replaceable structure Sean Hudson
2016-09-30  0:55 ` [RFC PATCH v1 2/2] printk: external log buffer (CONFIG_LOGBUFFER) Sean Hudson
2016-09-30 12:39   ` Petr Mladek
2016-10-01  4:06     ` Sean Hudson
2016-10-04 11:27       ` Petr Mladek
2016-10-04 17:55         ` Sean Hudson
2016-10-05 12:48           ` Petr Mladek
2016-10-12  1:12             ` Sean Hudson
2016-10-14  9:52               ` Petr Mladek
2016-09-30  1:36 ` [RFC PATCH v1 0/2] printk: Shared kernel logging Kees Cook
2016-10-01  1:56   ` Sean Hudson
2016-10-01  2:42     ` Kees Cook
2016-10-04 21:08 ` [RFC PATCH v2 " Sean Hudson
2016-10-04 21:08   ` [RFC PATCH v2 1/2] printk: collect offsets into replaceable structure Sean Hudson
2016-10-04 21:08   ` [RFC PATCH v2 2/2] printk: external log buffer (CONFIG_LOGBUFFER) Sean Hudson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.