linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/2] printk: Add context information to kernel messages from /dev/kmsg
@ 2014-05-19 10:28 Yoshihiro YUNOMAE
  2014-05-19 10:28 ` [RFC PATCH 1/2] printk: Add context information to the header of /dev/kmsg Yoshihiro YUNOMAE
  2014-05-19 10:28 ` [RFC PATCH 2/2] printk: Simplify fragmented line information of " Yoshihiro YUNOMAE
  0 siblings, 2 replies; 7+ messages in thread
From: Yoshihiro YUNOMAE @ 2014-05-19 10:28 UTC (permalink / raw)
  To: Andrew Morton, Kay Sievers, linux-kernel
  Cc: Kees Cook, Randy Dunlap, Petr Mladek, Hidehiro Kawai,
	yrl.pp-manager.tt, Masami Hiramatsu, Joe Perches, Arun KS

Hi all,

This patch set adds context information to kernel messages from /dev/kmsg.

Two printk messages connected with KERN_CONT can be divided in multiple
lines by a different process context message. This induces two problems.

a) The owner problem
If the different context message seems like the 1st divided message, it is
difficult to understand which the 2nd divided message belongs to. This problem
can also occur for the situation where multiple message lines without KERN_CONT 
are broken into by similar messages. When user tools handle printk messages,
those mixed messages will create some inconveniences.

  - Example of the owner problem
	[110781.736171] sd 2:0:0:0: [sdb]  
	[110781.736170] sd 3:0:0:0: [sdc] Unhandled sense code
	[110781.736172] sd 3:0:0:0: [sdc]  
	[110781.736175] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
	[110781.736177] sd 3:0:0:0: [sdc]  
	[110781.736178] Sense Key : Medium Error [current] 
	[110781.736187] Sense Key : Recovered Error 

b) The reconstruction problem
Even if we solve the owner problem, we don't know the consecutive lines owned
by a particular context are whether divided messages with KERN_CONT or simply
separate lines from the beginning.

  - Example of the reconstruction problem
	[110781.736198] Read(10): 28 00
	 ...[snip]...
	[110781.736204]  00
	[110781.736205]  00 11 e0 00 01 00 00

	Which ones should be merged into on line? You may say the right answer,
	but it's difficult to automate. The expected result is as follows:

	[110781.736198] Read(10): 28 00 00 00 11 e0 00 01 00 00

We can consider three solutions for these problems:

1. Store continuous messages in a local buffer, and execute printk
 This does not induce fragment of messages. There are some related work about
 error messages in a SCSI layer.
	http://marc.info/?l=linux-scsi&m=135003083808290&w=2
	http://comments.gmane.org/gmane.linux.scsi/88768
 This approach will take time for acceptance activities because continuous
 printk is used in a lot of places. Moreover, this approach
 cannot solve the fragment message problem fundamentally, so kernel developers
 must always take care about continuous printk. Therefore, this solution takes a
 maintenance cost.

2. Add context information to all kernel messages
 This solves a) the owner problem because user tools can separate each
 owner message from mixed messages by the context information. If multiple
 message lines without KERN_CONT are broken into by similar messages, this is
 a good solution. However, this does not indicate which messages with KERN_CONT
 are fragmented, so this does not solve b) the reconstruction problem for those.

3. Apply 2. and add fragment flag to all fragmented kernel messages
 This solves a) and b) problems for multiple messages connected with KERN_CONT.
 The fragmented flag shows which messages with KERN_CONT are fragmented, so
 user tools can reconstruct to its original state by it after separating each
 owner message from mixed messages by the context information.
 
 This solves the fragment problem fundamentally by merging those fragment
 messages on userland, so this solution seems to be a right way. We choose this 
 solution.

Following description makes mention of only fragmented messages with KERN_CONT.
This is because if the problems for fragmented messages with KERN_CONT are
solved, the owner problem for the mixed messages without KERN_CONT is
also solved.

Fragment is induced by different process context messages, so we want to add
process context information to each message in order to connect fragmented
messages. CPU number is not sufficient because it can change while calling
a series of printks in the preemptive kernel. On the other hand, we don't want
to change results of dmesg and syslog messages. How do we tackle this problem?

The key idea is to change the header of /dev/kmsg. According to the
documentation of /dev/kmsg, we can add comma separated values before ';' as
the header information of /dev/kmsg. So, we can change the header information
based on the rule.

This patch set solves this problem by following two approaches:

1. Add process context information (PID/interrupt)
 Fragment can occur when the process context is different from the previous
 context stored in the cont buffer. By applying this 1st patch, the context
 information is indicated to connect fragment messages. Moreover, if the
 message is output in an interrupt context, the information is also output as
 s(softIRQ)/h(hardIRQ)/n(NMI).

2. Introduce an exact fragment flag
 Current kernel tries to add 'c' flag to the 1st fragmented line and '+' to
 the following fragmented lines. However, these flags are just a hint; it
 doesn't guarantee that they are always correct because the kernel can't know
 the right answer without context information. Current kernel adds fragment
 flags by guess work, but a user tool can connect fragmented lines exactly
 by utilizing context information. So, current fragment flag is not appropriate
 after applying this patch, and we introduced new fragment flag 'f' instead of
 'c' and '+'.

To understand how to connect fragmented messages by changing the header of
/dev/kmsg, we assume that each two kernel threads executes the following
3 printk():

	pr_info("A1");
    	printk("A2");
    	printk("A3\n");

In current implementation, these threads can output as follows:

	6,3321,31629088,-;A1    <== Who output this message?
    	6,3322,31629091,c;A1         |
    	4,3323,31629094,+;A2    <----+
    	4,3324,31629095,+;A3         | Which?
    	4,3325,31629097,c;A2    <----+
    	4,3326,31629099,+;A3

A1, A2, and A3 messages of two threads are clearly fragmented in this result,
but we cannot understand who output the 1st line message and which messages were
output by whom. So, the relation of divided messages is unclear from this test.
Moreover, although the 4th fields of the header indicate fragment flags,
the flag in the 1st line shows not 'c' (fragment) but '-' (no fragment).

By applying this patch set, we can get a following result from /dev/kmsg:

	6,7868,2070756113,f,1185/-;A1
	6,7869,2070756116,f,1186/-;A1
	4,7870,2070756118,f,1186/-;A2
	4,7871,2070756119,-,1186/-;A3
	4,7872,2070756120,f,1185/-;A2
	4,7873,2070756122,-,1185/-;A3

1st line indicates fragmented line, PID=1185, and no interrupt context.
2nd line indicates fragmented line, PID=1186, and no interrupt context.
So, 1st line will connect to 6th line and 7th line by fragment flags and PID.
On the other hand, 2nd line will connect to 3rd line and 4th line.
A message merging tool can convert these messages as follows:

	6,7868,2070756113,-,1185/-;A1A2A3	/* thread1 */
	6,7869,2070756116,-,1186/-;A1A2A3	/* thread2 */

Thanks!

---

Yoshihiro YUNOMAE (2):
      printk: Add context information to the header of /dev/kmsg
      printk: Simplify fragmented line information of the header of /dev/kmsg


 Documentation/ABI/testing/dev-kmsg |   34 ++++++++----
 kernel/printk/printk.c             |  104 +++++++++++++++++++++++++-----------
 2 files changed, 95 insertions(+), 43 deletions(-)

-- 
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae.ez@hitachi.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC PATCH 1/2] printk: Add context information to the header of /dev/kmsg
  2014-05-19 10:28 [RFC PATCH 0/2] printk: Add context information to kernel messages from /dev/kmsg Yoshihiro YUNOMAE
@ 2014-05-19 10:28 ` Yoshihiro YUNOMAE
  2014-05-19 11:35   ` Kay Sievers
  2014-05-20  3:36   ` Valdis.Kletnieks
  2014-05-19 10:28 ` [RFC PATCH 2/2] printk: Simplify fragmented line information of " Yoshihiro YUNOMAE
  1 sibling, 2 replies; 7+ messages in thread
From: Yoshihiro YUNOMAE @ 2014-05-19 10:28 UTC (permalink / raw)
  To: Andrew Morton, Kay Sievers, linux-kernel
  Cc: Kees Cook, Randy Dunlap, Petr Mladek, Hidehiro Kawai,
	yrl.pp-manager.tt, Masami Hiramatsu, Joe Perches, Arun KS

Add context information to the header of /dev/kmsg.

Two printk messages connected with KERN_CONT can be divided in multiple lines
by a different process context message. If the different context message seems
like the 1st divided message, it is difficult to understand which the 2nd
divided message belongs to. This problem can also occur for the situation where
multiple message lines without KERN_CONT are broken into by similar messages.
For example, SCSI disk error messages can be show as follows:

	[110781.736171] sd 2:0:0:0: [sdb]
	[110781.736170] sd 3:0:0:0: [sdc] Unhandled sense code
	[110781.736172] sd 3:0:0:0: [sdc]
	[110781.736175] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
	[110781.736177] sd 3:0:0:0: [sdc]
	[110781.736178] Sense Key : Medium Error [current]
	[110781.736187] Sense Key : Recovered Error
	[110781.736189] [current]

This patch adds PID and interrupt context flag to the header of /dev/kmsg as
the context information in order to understand relation of output messages. If
PID values of two messages and the interrupt context flags are same, it means
that those messages are same context, so those message have some relation. If
not so, it means that those messages are different context, so users do not
need to take care about the relation of the messages.

The header information is changed as follows:
   <prefix>,<seq#>,<timestamp>,<fragmented>,<PID>/<interrupt context>;<msg>
                                            |                       |
                                            +-----------------------+
                                                   context info.
Interrupt context indicates as follows:
   n: NMI context
   h: Hard IRQ context
   s: Soft IRQ context
   -: No interrupt context

<Examples>
 - Output same 3 printk messages in two kernel threads.
   Run two kernel threads in same time in order to make the messages compete.
	pr_info("A1");
	printk("A2");
	printk("A3\n");

 - Current implementation (without context information)
	6,3321,31629088,-;A1    <== Who output this message?
	6,3322,31629091,c;A1         |
	4,3323,31629094,+;A2    <----+
	4,3324,31629095,+;A3         | Which?
	4,3325,31629097,c;A2    <----+
	4,3326,31629099,+;A3
 A1, A2, and A3 messages of two threads are clearly fragmented in this result,
 but we cannot understand who output the 1st line message and which messages
 were output by whom. So, the relation of divided messages is unclear from this
 test.

 - Applying this patch (with context information)
	6,2308,36687114,-,622/-;A1    <== Output by PID=622
	6,2309,36687118,c,621/-;A1         |
	4,2310,36687121,+,621/-;A2         | Relate
	4,2311,36687123,+,621/-;A3         |
	4,2312,36687124,c,622/-;A2    <----+   (1)
	4,2313,36687126,+,622/-;A3    <----+   (2)
   We can understand the relation of divided messages. For example,
   the 1st line indicates PID=622 and no interrupt context. The same context
   is the lines of (1) and (2). So, users can understand these three messages
   are output by the same thread.

Note:
According to Documentation/ABI/testing/dev-kmsg, we can add comma separated
values before ';' as the header information of /dev/kmsg. So, this change does
not affect ABI.

Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Joe Perches <joe@perches.com>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Arun KS <arunks.linux@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
---
 Documentation/ABI/testing/dev-kmsg |   21 ++++++++--
 kernel/printk/printk.c             |   77 ++++++++++++++++++++++++++++--------
 2 files changed, 76 insertions(+), 22 deletions(-)

diff --git a/Documentation/ABI/testing/dev-kmsg b/Documentation/ABI/testing/dev-kmsg
index bb820be..d8128ac 100644
--- a/Documentation/ABI/testing/dev-kmsg
+++ b/Documentation/ABI/testing/dev-kmsg
@@ -77,11 +77,11 @@ Description:	The /dev/kmsg character device node provides userspace access
 		userspace.
 
 		Example:
-		7,160,424069,-;pci_root PNP0A03:00: host bridge window [io  0x0000-0x0cf7] (ignored)
+		7,160,424069,-,1/-;pci_root PNP0A03:00: host bridge window [io  0x0000-0x0cf7] (ignored)
 		 SUBSYSTEM=acpi
 		 DEVICE=+acpi:PNP0A03:00
-		6,339,5140900,-;NET: Registered protocol family 10
-		30,340,5690716,-;udevd[80]: starting version 181
+		6,339,5140900,-,1/-;NET: Registered protocol family 10
+		30,340,5690716,-,80/-;udevd[80]: starting version 181
 
 		The DEVICE= key uniquely identifies devices the following way:
 		  b12:8        - block dev_t
@@ -89,7 +89,7 @@ Description:	The /dev/kmsg character device node provides userspace access
 		  n8           - netdev ifindex
 		  +sound:card0 - subsystem:devname
 
-		The flags field carries '-' by default. A 'c' indicates a
+		The 4th flags field carries '-' by default. A 'c' indicates a
 		fragment of a line. All following fragments are flagged with
 		'+'. Note, that these hints about continuation lines are not
 		necessarily correct, and the stream could be interleaved with
@@ -98,4 +98,17 @@ Description:	The /dev/kmsg character device node provides userspace access
 		logic is used internally when messages are printed to the
 		console, /proc/kmsg or the syslog() syscall.
 
+		The 5th field is context information to separate mixed messages
+		from different contexts. The field indicates PID and interrupt
+		context information like '<PID>/<interrupt context info.>'. In
+		interrupt context information, 'n' indicates NMI context, 'h'
+		indicates hard IRQ context, 's' indicates soft IRQ context, and
+		'-' indicates no interrupt context. This 5th context field tells
+		users which messages are output in the same context. If PID
+		values of two messages and interrupt context are same, it means
+		that those messages are output by the same process, so those
+		message may have any relation. If not so, it means that those
+		messages are different context, so users do not need to care
+		about the relation of the messages.
+
 Users:		dmesg(1), userspace kernel log consumers
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 7228258..8105431 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -197,6 +197,8 @@ enum log_flags {
 
 struct printk_log {
 	u64 ts_nsec;		/* timestamp in nanoseconds */
+	pid_t pid;		/* identify PID */
+	u32 irq_count;		/* identify irq_count */
 	u16 len;		/* length of entire record */
 	u16 text_len;		/* length of text buffer */
 	u16 dict_len;		/* length of dictionary buffer */
@@ -298,10 +300,11 @@ static u32 log_next(u32 idx)
 }
 
 /* insert record into the buffer, discard old ones, update heads */
-static void log_store(int facility, int level,
-		      enum log_flags flags, u64 ts_nsec,
-		      const char *dict, u16 dict_len,
-		      const char *text, u16 text_len)
+static void __log_store(int facility, int level,
+			enum log_flags flags, u64 ts_nsec,
+			const char *dict, u16 dict_len,
+			const char *text, u16 text_len,
+			pid_t pid, u32 irq_count)
 {
 	struct printk_log *msg;
 	u32 size, pad_len;
@@ -339,6 +342,8 @@ static void log_store(int facility, int level,
 
 	/* fill message */
 	msg = (struct printk_log *)(log_buf + log_next_idx);
+	msg->pid = pid;
+	msg->irq_count = irq_count;
 	memcpy(log_text(msg), text, text_len);
 	msg->text_len = text_len;
 	memcpy(log_dict(msg), dict, dict_len);
@@ -470,6 +475,18 @@ out:
 	return ret;
 }
 
+static char show_irq_count_flag(u32 irq_count)
+{
+	if (irq_count & NMI_MASK)
+		return 'n';
+	else if (irq_count & HARDIRQ_MASK)
+		return 'h';
+	else if (irq_count & SOFTIRQ_MASK)
+		return 's';
+	else
+		return '-';
+}
+
 static ssize_t devkmsg_read(struct file *file, char __user *buf,
 			    size_t count, loff_t *ppos)
 {
@@ -530,9 +547,10 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
 		 ((user->prev & LOG_CONT) && !(msg->flags & LOG_PREFIX)))
 		cont = '+';
 
-	len = sprintf(user->buf, "%u,%llu,%llu,%c;",
+	len = sprintf(user->buf, "%u,%llu,%llu,%c,%d/%c;",
 		      (msg->facility << 3) | msg->level,
-		      user->seq, ts_usec, cont);
+		      user->seq, ts_usec, cont, msg->pid,
+		      show_irq_count_flag(msg->irq_count));
 	user->prev = msg->flags;
 
 	/* escape non-printable characters */
@@ -1391,12 +1409,31 @@ static struct cont {
 	size_t cons;			/* bytes written to console */
 	struct task_struct *owner;	/* task of first print*/
 	u64 ts_nsec;			/* time of first print */
+	u32 irq_count;			/* irq_count of first print */
 	u8 level;			/* log level of first message */
 	u8 facility;			/* log level of first message */
 	enum log_flags flags;		/* prefix, newline flags */
 	bool flushed:1;			/* buffer sealed and committed */
 } cont;
 
+static inline void log_store_cont(int facility, int level,
+				  enum log_flags flags, u64 ts_nsec,
+				  const char *dict, u16 dict_len,
+				  const char *text, u16 text_len)
+{
+	__log_store(facility, level, flags, ts_nsec, dict, dict_len,
+		    text, text_len, cont.owner->tgid, cont.irq_count);
+}
+
+static inline void log_store_current(int facility, int level,
+				     enum log_flags flags, u64 ts_nsec,
+				     const char *dict, u16 dict_len,
+				     const char *text, u16 text_len)
+{
+	__log_store(facility, level, flags, ts_nsec, dict, dict_len,
+		    text, text_len, current->tgid, irq_count());
+}
+
 static void cont_flush(enum log_flags flags)
 {
 	if (cont.flushed)
@@ -1410,8 +1447,8 @@ static void cont_flush(enum log_flags flags)
 		 * console; wait for the console to pick up the rest of the
 		 * line. LOG_NOCONS suppresses a duplicated output.
 		 */
-		log_store(cont.facility, cont.level, flags | LOG_NOCONS,
-			  cont.ts_nsec, NULL, 0, cont.buf, cont.len);
+		log_store_cont(cont.facility, cont.level, flags | LOG_NOCONS,
+			       cont.ts_nsec, NULL, 0, cont.buf, cont.len);
 		cont.flags = flags;
 		cont.flushed = true;
 	} else {
@@ -1419,8 +1456,8 @@ static void cont_flush(enum log_flags flags)
 		 * If no fragment of this line ever reached the console,
 		 * just submit it to the store and free the buffer.
 		 */
-		log_store(cont.facility, cont.level, flags, 0,
-			  NULL, 0, cont.buf, cont.len);
+		log_store_cont(cont.facility, cont.level, flags, 0,
+			       NULL, 0, cont.buf, cont.len);
 		cont.len = 0;
 	}
 }
@@ -1441,6 +1478,7 @@ static bool cont_add(int facility, int level, const char *text, size_t len)
 		cont.level = level;
 		cont.owner = current;
 		cont.ts_nsec = local_clock();
+		cont.irq_count = irq_count();
 		cont.flags = 0;
 		cont.cons = 0;
 		cont.flushed = false;
@@ -1532,8 +1570,8 @@ asmlinkage int vprintk_emit(int facility, int level,
 		recursion_bug = 0;
 		printed_len += strlen(recursion_msg);
 		/* emit KERN_CRIT message */
-		log_store(0, 2, LOG_PREFIX|LOG_NEWLINE, 0,
-			  NULL, 0, recursion_msg, printed_len);
+		log_store_current(0, 2, LOG_PREFIX|LOG_NEWLINE, 0,
+				  NULL, 0, recursion_msg, printed_len);
 	}
 
 	/*
@@ -1582,13 +1620,14 @@ asmlinkage int vprintk_emit(int facility, int level,
 		 * Flush the conflicting buffer. An earlier newline was missing,
 		 * or another task also prints continuation lines.
 		 */
-		if (cont.len && (lflags & LOG_PREFIX || cont.owner != current))
+		if (cont.len && (lflags & LOG_PREFIX || cont.owner != current
+				 || cont.irq_count != irq_count()))
 			cont_flush(LOG_NEWLINE);
 
 		/* buffer line if possible, otherwise store it right away */
 		if (!cont_add(facility, level, text, text_len))
-			log_store(facility, level, lflags | LOG_CONT, 0,
-				  dict, dictlen, text, text_len);
+			log_store_current(facility, level, lflags | LOG_CONT, 0,
+					  dict, dictlen, text, text_len);
 	} else {
 		bool stored = false;
 
@@ -1601,15 +1640,17 @@ asmlinkage int vprintk_emit(int facility, int level,
 		 * a newline, flush and append the newline.
 		 */
 		if (cont.len) {
-			if (cont.owner == current && !(lflags & LOG_PREFIX))
+			if ((cont.owner == current &&
+			     cont.irq_count == irq_count())
+			    && !(lflags & LOG_PREFIX))
 				stored = cont_add(facility, level, text,
 						  text_len);
 			cont_flush(LOG_NEWLINE);
 		}
 
 		if (!stored)
-			log_store(facility, level, lflags, 0,
-				  dict, dictlen, text, text_len);
+			log_store_current(facility, level, lflags, 0,
+					  dict, dictlen, text, text_len);
 	}
 	printed_len += text_len;
 


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC PATCH 2/2] printk: Simplify fragmented line information of the header of /dev/kmsg
  2014-05-19 10:28 [RFC PATCH 0/2] printk: Add context information to kernel messages from /dev/kmsg Yoshihiro YUNOMAE
  2014-05-19 10:28 ` [RFC PATCH 1/2] printk: Add context information to the header of /dev/kmsg Yoshihiro YUNOMAE
@ 2014-05-19 10:28 ` Yoshihiro YUNOMAE
  1 sibling, 0 replies; 7+ messages in thread
From: Yoshihiro YUNOMAE @ 2014-05-19 10:28 UTC (permalink / raw)
  To: Andrew Morton, Kay Sievers, linux-kernel
  Cc: Kees Cook, Randy Dunlap, Petr Mladek, Hidehiro Kawai,
	yrl.pp-manager.tt, Masami Hiramatsu, Joe Perches, Arun KS

Simplify fragmented line information of the 4th flag of the header of /dev/kmsg.

One of the hint of the printk() linebreak is the header information of
/dev/kmsg. 4th flag in the header adds 'c' or '+' when a message is divided.
Users can roughly understand where the printk message was divided, but they
cannot understand exactly which messages are divided or how those messages
should be connected.

According to a comment in printk.c, 'c' is marked for the first fragment of a
line. However, 'c' is not always marked as the first fragment of the line in
current kernel. The kernel outputs the 'c' flag when current message flags has
LOG_CONT and previous message flag does not have LOG_CONT. So, LOG_CONT flag
is needed to add 'c'. However, when the previous message A is being stored in
cont buffer and the current message B is not added to cont buffer, LOG_CONT
flag is added not to the message A but to the message B. In this situation,
'c' flag is added to the message B even though the message A is actually
the first fragmented line. As the result, it seems that the message A is not
divided because the message A is output with '-' (no fragment).

In addition, all following fragments are flagged with '+' according to
Documentation/ABI/testing/dev-kmsg. In kernel, when current message and the
previous message has LOG_CONT flag, '+' flag is added to the current message.
However, if those messages are output in different context and those messages
have LOG_CONT flag, '+' flag is added to the current message but it is
sometimes meaningless. For example, when previous message with LOG_CONT is
output by process C and current message (1st continuous message) with LOG_CONT
is output by process D, adding '+' flag to the current message of process D is
incorrect. This occurs when both return values of both messages without '\n'
are false in cont_add().

This patch simplifies the fragmented line flag policy.
 1) Add 'f' flag if the kernel flushes a fragment message
 2) Delete 'c' and '+'
When fragmentation occurs, the current message flushes the previous message
with LOG_CONT. When the kernel outputs the messages to /dev/kmsg, it
just checks whether messages have LOG_CONT flag or not. If the messages have
LOG_CONT flag, it adds 'f' flag in the header of /dev/kmsg. Then, users can
understand which message was divided by 'f' flag and how these messages
should be connected by checking the fragmented flag and the context information.

<Example>
 - Environment
   Execute 3 kernel threads A, B, and C. Each thread outputs following
  messages:
	/* for {A, B, C} */
	pr_info("{A, B, C}1");
	printk("{A, B, C}2");
	printk("{A, B, C}3\n");

 - Current result of /dev/kmsg
	6,225186,93838029,-,2288/-;B1    <== No fragment?
	6,225187,93838091,c,2289/-;C1
	4,225188,93838092,+,2289/-;C2
	4,225189,93838092,+,2289/-;C3
	6,225190,93842059,c,2289/-;C1
	4,225191,93842409,+,2289/-;C2C3
	4,225192,93842713,-,2288/-;B2    <== No fragment?
	6,225193,93842944,c,2287/-;A1
	4,225194,93842944,+,2287/-;A2
	4,225195,93842945,+,2287/-;A3
	4,225196,93846199,-,2288/-;B3    <== No fragment?
   In this example, fragmented flags of the B thread are incorrect because
   B1, B2, and B3 are actually divided, but there are no flags. So, users cannot
   understand which messages are divided or how those messages should be
   connected in current implementation.

 - Apply this patch
   This patch introduces 'f' flag policy and deletes 'c' and '+' flags. If a
   message is forcibly flushed in the middle, just add 'f'. Moreover, if a
   message includes '\n' meaning the end of the messages, add no fragment flag
   ('-'). So, if users find 'f' flag for a message, they can connect the
   message to next 'f' flag (forcibly flushed) or '-' (the end of the messages)
   in same context.

	6,8219,543073110,f,538/-;B1    <== This message is fragmented.
	6,8220,543073113,f,539/-;C1
	4,8221,543073115,f,539/-;C2
	4,8222,543073116,-,539/-;C3
	4,8223,543073117,f,538/-;B2    <== Stil fragmented
	6,8224,543073120,f,537/-;A1
	4,8225,543073120,-,538/-;B3    <== This is the end of the messages.
	4,8226,543073123,-,537/-;A2A3

   A message merging tool can convert these messages as follows:

	6,8219,543073110,-,538/-;B1B2B3
	6,8220,543073113,-,539/-;C1C2C3
	6,8224,543073120,-,537/-;A1A2A3

Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Joe Perches <joe@perches.com>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Arun KS <arunks.linux@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
---
 Documentation/ABI/testing/dev-kmsg |   15 +++++++--------
 kernel/printk/printk.c             |   31 +++++++++++++++----------------
 2 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/Documentation/ABI/testing/dev-kmsg b/Documentation/ABI/testing/dev-kmsg
index d8128ac..ccb7cf3 100644
--- a/Documentation/ABI/testing/dev-kmsg
+++ b/Documentation/ABI/testing/dev-kmsg
@@ -89,14 +89,13 @@ Description:	The /dev/kmsg character device node provides userspace access
 		  n8           - netdev ifindex
 		  +sound:card0 - subsystem:devname
 
-		The 4th flags field carries '-' by default. A 'c' indicates a
-		fragment of a line. All following fragments are flagged with
-		'+'. Note, that these hints about continuation lines are not
-		necessarily correct, and the stream could be interleaved with
-		unrelated messages, but merging the lines in the output
-		usually produces better human readable results. A similar
-		logic is used internally when messages are printed to the
-		console, /proc/kmsg or the syslog() syscall.
+		The 4th flags field carries '-' by default. A 'f' indicates a
+		fragment of a line. If a message is forcibly flushed in the
+		middle, just add 'f'. Moreover, if a message includes '\n'
+		meaning the end of the messages, no add flag ('-'). So, if
+		users find 'f' flag for a message, they can connect the message
+		to next 'f' flag (forcibly flushed) or '-' (the end of the
+		messages) in same context information output in the 5th field.
 
 		The 5th field is context information to separate mixed messages
 		from different contexts. The field indicates PID and interrupt
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 8105431..d8c3716 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -414,7 +414,6 @@ static int check_syslog_permissions(int type, bool from_file)
 struct devkmsg_user {
 	u64 seq;
 	u32 idx;
-	enum log_flags prev;
 	struct mutex lock;
 	char buf[8192];
 };
@@ -535,23 +534,16 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
 
 	/*
 	 * If we couldn't merge continuation line fragments during the print,
-	 * export the stored flags to allow an optional external merge of the
-	 * records. Merging the records isn't always neccessarily correct, like
-	 * when we hit a race during printing. In most cases though, it produces
-	 * better readable output. 'c' in the record flags mark the first
-	 * fragment of a line, '+' the following.
+	 * export the stored flags 'f' to allow an optional external merge of
+	 * the records.
 	 */
-	if (msg->flags & LOG_CONT && !(user->prev & LOG_CONT))
-		cont = 'c';
-	else if ((msg->flags & LOG_CONT) ||
-		 ((user->prev & LOG_CONT) && !(msg->flags & LOG_PREFIX)))
-		cont = '+';
+	if (msg->flags & LOG_CONT)
+		cont = 'f';
 
 	len = sprintf(user->buf, "%u,%llu,%llu,%c,%d/%c;",
 		      (msg->facility << 3) | msg->level,
 		      user->seq, ts_usec, cont, msg->pid,
 		      show_irq_count_flag(msg->irq_count));
-	user->prev = msg->flags;
 
 	/* escape non-printable characters */
 	for (i = 0; i < msg->text_len; i++) {
@@ -1620,9 +1612,13 @@ asmlinkage int vprintk_emit(int facility, int level,
 		 * Flush the conflicting buffer. An earlier newline was missing,
 		 * or another task also prints continuation lines.
 		 */
-		if (cont.len && (lflags & LOG_PREFIX || cont.owner != current
-				 || cont.irq_count != irq_count()))
-			cont_flush(LOG_NEWLINE);
+		if (cont.len) {
+			if (cont.owner != current ||
+			    cont.irq_count != irq_count())
+				cont_flush(LOG_CONT);
+			else if (lflags & LOG_PREFIX)
+				cont_flush(LOG_NEWLINE);
+		}
 
 		/* buffer line if possible, otherwise store it right away */
 		if (!cont_add(facility, level, text, text_len))
@@ -1645,7 +1641,10 @@ asmlinkage int vprintk_emit(int facility, int level,
 			    && !(lflags & LOG_PREFIX))
 				stored = cont_add(facility, level, text,
 						  text_len);
-			cont_flush(LOG_NEWLINE);
+			if (!stored)
+				cont_flush(LOG_CONT);
+			else
+				cont_flush(LOG_NEWLINE);
 		}
 
 		if (!stored)


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 1/2] printk: Add context information to the header of /dev/kmsg
  2014-05-19 10:28 ` [RFC PATCH 1/2] printk: Add context information to the header of /dev/kmsg Yoshihiro YUNOMAE
@ 2014-05-19 11:35   ` Kay Sievers
  2014-05-26  7:16     ` Yoshihiro YUNOMAE
  2014-05-20  3:36   ` Valdis.Kletnieks
  1 sibling, 1 reply; 7+ messages in thread
From: Kay Sievers @ 2014-05-19 11:35 UTC (permalink / raw)
  To: Yoshihiro YUNOMAE
  Cc: Andrew Morton, LKML, Kees Cook, Randy Dunlap, Petr Mladek,
	Hidehiro Kawai, yrl.pp-manager.tt, Masami Hiramatsu, Joe Perches,
	Arun KS

On Mon, May 19, 2014 at 7:28 PM, Yoshihiro YUNOMAE
<yoshihiro.yunomae.ez@hitachi.com> wrote:
> Add context information to the header of /dev/kmsg.
>
> Two printk messages connected with KERN_CONT can be divided in multiple lines
> by a different process context message. If the different context message seems
> like the 1st divided message, it is difficult to understand which the 2nd
> divided message belongs to. This problem can also occur for the situation where
> multiple message lines without KERN_CONT are broken into by similar messages.
> For example, SCSI disk error messages can be show as follows:
>
>         [110781.736171] sd 2:0:0:0: [sdb]
>         [110781.736170] sd 3:0:0:0: [sdc] Unhandled sense code
>         [110781.736172] sd 3:0:0:0: [sdc]
>         [110781.736175] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
>         [110781.736177] sd 3:0:0:0: [sdc]
>         [110781.736178] Sense Key : Medium Error [current]
>         [110781.736187] Sense Key : Recovered Error
>         [110781.736189] [current]

[...]

> This patch adds PID and interrupt context flag to the header of /dev/kmsg as
> the context information in order to understand relation of output messages. If
> PID values of two messages and the interrupt context flags are same, it means
> that those messages are same context, so those message have some relation. If
> not so, it means that those messages are different context, so users do not
> need to take care about the relation of the messages.

[...]

> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -197,6 +197,8 @@ enum log_flags {
>
>  struct printk_log {
>         u64 ts_nsec;            /* timestamp in nanoseconds */
> +       pid_t pid;              /* identify PID */
> +       u32 irq_count;          /* identify irq_count */

I don't think it is worth to blow-up this heavily used struct up even
more, just because SCSI cannot log in simple single calls. How about
fixing SCSI to log to a local buffer if it cannot safely print one
line at once.

I'm not convinced, that turning the /dev/kmsg format into an
"annotation language', and requiring a rather complex state machine to
re-construct the "broken" logging makes too much sense here. If tools
rely on properly formatted logging messages, the logging should be
fixed at the source while it is logged, not be reconstructed later.

It is not not my call, but I don't this makes much sense. Continuation
lines are "best effort" not a facility that is or ever was reliable in
the past. I think the proper fix is how the log is created, not how it
is exported.

Thanks,
Kay

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 1/2] printk: Add context information to the header of /dev/kmsg
  2014-05-19 10:28 ` [RFC PATCH 1/2] printk: Add context information to the header of /dev/kmsg Yoshihiro YUNOMAE
  2014-05-19 11:35   ` Kay Sievers
@ 2014-05-20  3:36   ` Valdis.Kletnieks
  2014-05-26  7:18     ` Yoshihiro YUNOMAE
  1 sibling, 1 reply; 7+ messages in thread
From: Valdis.Kletnieks @ 2014-05-20  3:36 UTC (permalink / raw)
  To: Yoshihiro YUNOMAE
  Cc: Andrew Morton, Kay Sievers, linux-kernel, Kees Cook,
	Randy Dunlap, Petr Mladek, Hidehiro Kawai, yrl.pp-manager.tt,
	Masami Hiramatsu, Joe Perches, Arun KS

[-- Attachment #1: Type: text/plain, Size: 504 bytes --]

On Mon, 19 May 2014 19:28:38 +0900, Yoshihiro YUNOMAE said:

> The header information is changed as follows:
>    <prefix>,<seq#>,<timestamp>,<fragmented>,<PID>/<interrupt context>;<msg>

Has this been tested with popular userspace that reads /dev/kmsg, to make
sure that it actually plays nice with a 5th field being added?  Yes, I know
that Documentation/ABI/dev-kmsg says that added fields should be gracefully
ignored - but we all know how good userspace programmers are about that
sort of thing. :)

[-- Attachment #2: Type: application/pgp-signature, Size: 848 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Re: [RFC PATCH 1/2] printk: Add context information to the header of /dev/kmsg
  2014-05-19 11:35   ` Kay Sievers
@ 2014-05-26  7:16     ` Yoshihiro YUNOMAE
  0 siblings, 0 replies; 7+ messages in thread
From: Yoshihiro YUNOMAE @ 2014-05-26  7:16 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Andrew Morton, LKML, Kees Cook, Randy Dunlap, Petr Mladek,
	Hidehiro Kawai, yrl.pp-manager.tt, Masami Hiramatsu, Joe Perches,
	Arun KS, Valdis.Kletnieks

Hi Kay,

Thank you for your reply.

(2014/05/19 20:35), Kay Sievers wrote:
> On Mon, May 19, 2014 at 7:28 PM, Yoshihiro YUNOMAE
> <yoshihiro.yunomae.ez@hitachi.com> wrote:
>> Add context information to the header of /dev/kmsg.
>>
>> Two printk messages connected with KERN_CONT can be divided in multiple lines
>> by a different process context message. If the different context message seems
>> like the 1st divided message, it is difficult to understand which the 2nd
>> divided message belongs to. This problem can also occur for the situation where
>> multiple message lines without KERN_CONT are broken into by similar messages.
>> For example, SCSI disk error messages can be show as follows:
>>
>>          [110781.736171] sd 2:0:0:0: [sdb]
>>          [110781.736170] sd 3:0:0:0: [sdc] Unhandled sense code
>>          [110781.736172] sd 3:0:0:0: [sdc]
>>          [110781.736175] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
>>          [110781.736177] sd 3:0:0:0: [sdc]
>>          [110781.736178] Sense Key : Medium Error [current]
>>          [110781.736187] Sense Key : Recovered Error
>>          [110781.736189] [current]
>
> [...]
>
>> This patch adds PID and interrupt context flag to the header of /dev/kmsg as
>> the context information in order to understand relation of output messages. If
>> PID values of two messages and the interrupt context flags are same, it means
>> that those messages are same context, so those message have some relation. If
>> not so, it means that those messages are different context, so users do not
>> need to take care about the relation of the messages.
>
> [...]
>
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -197,6 +197,8 @@ enum log_flags {
>>
>>   struct printk_log {
>>          u64 ts_nsec;            /* timestamp in nanoseconds */
>> +       pid_t pid;              /* identify PID */
>> +       u32 irq_count;          /* identify irq_count */
>
> I don't think it is worth to blow-up this heavily used struct up even
> more, just because SCSI cannot log in simple single calls. How about
> fixing SCSI to log to a local buffer if it cannot safely print one
> line at once.
>
> I'm not convinced, that turning the /dev/kmsg format into an
> "annotation language', and requiring a rather complex state machine to
> re-construct the "broken" logging makes too much sense here. If tools
> rely on properly formatted logging messages, the logging should be
> fixed at the source while it is logged, not be reconstructed later.
>
> It is not not my call, but I don't this makes much sense. Continuation
> lines are "best effort" not a facility that is or ever was reliable in
> the past. I think the proper fix is how the log is created, not how it
> is exported.

As we discussed about this issue in LinuxCon Japan, mixed messages will
be fixed in each subsystem. In the SCSI layer, tracing feature will be
used. (http://www.spinics.net/lists/linux-scsi/msg74428.html)
If we found the same issue in other subsystem, we would like to add
new generic printk API to avoid mixed messages.

Thank you,
Yoshihiro YUNOMAE

-- 
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae.ez@hitachi.com



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Re: [RFC PATCH 1/2] printk: Add context information to the header of /dev/kmsg
  2014-05-20  3:36   ` Valdis.Kletnieks
@ 2014-05-26  7:18     ` Yoshihiro YUNOMAE
  0 siblings, 0 replies; 7+ messages in thread
From: Yoshihiro YUNOMAE @ 2014-05-26  7:18 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Andrew Morton, Kay Sievers, linux-kernel, Kees Cook,
	Randy Dunlap, Petr Mladek, Hidehiro Kawai, yrl.pp-manager.tt,
	Masami Hiramatsu, Joe Perches, Arun KS

Hi,

(2014/05/20 12:36), Valdis.Kletnieks@vt.edu wrote:
> On Mon, 19 May 2014 19:28:38 +0900, Yoshihiro YUNOMAE said:
>
>> The header information is changed as follows:
>>     <prefix>,<seq#>,<timestamp>,<fragmented>,<PID>/<interrupt context>;<msg>
>
> Has this been tested with popular userspace that reads /dev/kmsg, to make
> sure that it actually plays nice with a 5th field being added?  Yes, I know

I checked it for systemd-journald and rsyslog on the source code level,
those tools don't handle new flag and field before ';' in the header of
/dev/kmsg. Those tools use only log level, sequence#, and timestamp.
So, even if we add these new flag and field for each message,
those tools will be able to work.

But, we'll fix this issue by other method if we found it in any
subsystems except for the SCSI layer.
(In the SCSI layer, we'll tackle this issue by using tracing feature.)

Thank you,
Yoshihiro YUNOMAE

-- 
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae.ez@hitachi.com



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-05-26  7:18 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-19 10:28 [RFC PATCH 0/2] printk: Add context information to kernel messages from /dev/kmsg Yoshihiro YUNOMAE
2014-05-19 10:28 ` [RFC PATCH 1/2] printk: Add context information to the header of /dev/kmsg Yoshihiro YUNOMAE
2014-05-19 11:35   ` Kay Sievers
2014-05-26  7:16     ` Yoshihiro YUNOMAE
2014-05-20  3:36   ` Valdis.Kletnieks
2014-05-26  7:18     ` Yoshihiro YUNOMAE
2014-05-19 10:28 ` [RFC PATCH 2/2] printk: Simplify fragmented line information of " Yoshihiro YUNOMAE

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).