linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] printk: don't unconditionally shortcut print_time()
@ 2018-11-24 10:50 Tetsuo Handa
  2018-11-24 14:28 ` [PATCH] printk: deduplicate print_prefix() calls Tetsuo Handa
  2018-11-29 14:19 ` [PATCH] printk: don't unconditionally shortcut print_time() Petr Mladek
  0 siblings, 2 replies; 18+ messages in thread
From: Tetsuo Handa @ 2018-11-24 10:50 UTC (permalink / raw)
  To: Petr Mladek, Sergey Senozhatsky, Steven Rostedt
  Cc: linux-kernel, Tetsuo Handa

Since /sys/module/printk/parameters/time can change from N to Y between
"msg_print_text() called print_prefix() with buf == NULL" and
"msg_print_text() again calls print_prefix() with buf != NULL", it is not
safe for print_time() to unconditionally return 0 if printk_time == false.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 kernel/printk/printk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 5c2079d..df7ced0 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1230,7 +1230,7 @@ static size_t print_time(u64 ts, char *buf)
 {
 	unsigned long rem_nsec;
 
-	if (!printk_time)
+	if (!printk_time && buf)
 		return 0;
 
 	rem_nsec = do_div(ts, 1000000000);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH] printk: deduplicate print_prefix() calls
  2018-11-24 10:50 [PATCH] printk: don't unconditionally shortcut print_time() Tetsuo Handa
@ 2018-11-24 14:28 ` Tetsuo Handa
  2018-11-26  7:56   ` Sergey Senozhatsky
  2018-11-29 14:35   ` Petr Mladek
  2018-11-29 14:19 ` [PATCH] printk: don't unconditionally shortcut print_time() Petr Mladek
  1 sibling, 2 replies; 18+ messages in thread
From: Tetsuo Handa @ 2018-11-24 14:28 UTC (permalink / raw)
  To: Petr Mladek, Sergey Senozhatsky, Steven Rostedt
  Cc: linux-kernel, Tetsuo Handa

Since /sys/module/printk/parameters/time can change from N to Y between
"msg_print_text() called print_prefix() with buf == NULL" and
"msg_print_text() again calls print_prefix() with buf != NULL", it is not
safe for print_time() to unconditionally return 0 if printk_time == false.

But print_prefix() is called by only msg_print_text(), and print_prefix()
is called once more for calculating output length of prefix, and there is
no need to recalculate it when one log entry contains multiple lines.

Since max output length for syslog marker and timestamp are known to be
small enough, let's close this race by deduplicating print_prefix() calls.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 kernel/printk/printk.c | 64 +++++++++++++++-----------------------------------
 1 file changed, 19 insertions(+), 45 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 1b2a029..eac4c80 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1213,50 +1213,26 @@ static inline void boot_delay_msec(int level)
 static bool printk_time = IS_ENABLED(CONFIG_PRINTK_TIME);
 module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR);
 
-static size_t print_time(u64 ts, char *buf)
-{
-	unsigned long rem_nsec;
-
-	if (!printk_time)
-		return 0;
-
-	rem_nsec = do_div(ts, 1000000000);
-
-	if (!buf)
-		return snprintf(NULL, 0, "[%5lu.000000] ", (unsigned long)ts);
-
-	return sprintf(buf, "[%5lu.%06lu] ",
-		       (unsigned long)ts, rem_nsec / 1000);
-}
-
-static size_t print_prefix(const struct printk_log *msg, bool syslog, char *buf)
-{
-	size_t len = 0;
-	unsigned int prefix = (msg->facility << 3) | msg->level;
-
-	if (syslog) {
-		if (buf) {
-			len += sprintf(buf, "<%u>", prefix);
-		} else {
-			len += 3;
-			if (prefix > 999)
-				len += 3;
-			else if (prefix > 99)
-				len += 2;
-			else if (prefix > 9)
-				len++;
-		}
-	}
-
-	len += print_time(msg->ts_nsec, buf ? buf + len : NULL);
-	return len;
-}
-
 static size_t msg_print_text(const struct printk_log *msg, bool syslog, char *buf, size_t size)
 {
 	const char *text = log_text(msg);
 	size_t text_size = msg->text_len;
 	size_t len = 0;
+	char prefix[32]; /* "<2047>[18446744073.709551] " */
+	size_t prefix_len = 0;
+
+	if (syslog)
+		prefix_len += snprintf(prefix, sizeof(prefix), "<%u>",
+				       (msg->facility << 3) | msg->level);
+	if (printk_time) {
+		u64 ts = msg->ts_nsec;
+		unsigned long rem_nsec = do_div(ts, 1000000000);
+
+		prefix_len += snprintf(prefix + prefix_len,
+				       sizeof(prefix) - prefix_len,
+				       "[%5lu.%06lu] ", (unsigned long)ts,
+				       rem_nsec / 1000);
+	}
 
 	do {
 		const char *next = memchr(text, '\n', text_size);
@@ -1271,19 +1247,17 @@ static size_t msg_print_text(const struct printk_log *msg, bool syslog, char *bu
 		}
 
 		if (buf) {
-			if (print_prefix(msg, syslog, NULL) +
-			    text_len + 1 >= size - len)
+			if (prefix_len + text_len + 1 >= size - len)
 				break;
 
-			len += print_prefix(msg, syslog, buf + len);
+			memcpy(buf + len, prefix, prefix_len);
+			len += prefix_len;
 			memcpy(buf + len, text, text_len);
 			len += text_len;
 			buf[len++] = '\n';
 		} else {
 			/* SYSLOG_ACTION_* buffer size only calculation */
-			len += print_prefix(msg, syslog, NULL);
-			len += text_len;
-			len++;
+			len += prefix_len + text_len + 1;
 		}
 
 		text = next;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] printk: deduplicate print_prefix() calls
  2018-11-24 14:28 ` [PATCH] printk: deduplicate print_prefix() calls Tetsuo Handa
@ 2018-11-26  7:56   ` Sergey Senozhatsky
  2018-11-26 10:21     ` Tetsuo Handa
  2018-11-29 14:35   ` Petr Mladek
  1 sibling, 1 reply; 18+ messages in thread
From: Sergey Senozhatsky @ 2018-11-26  7:56 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Petr Mladek, Sergey Senozhatsky, Steven Rostedt, linux-kernel

On (11/24/18 23:28), Tetsuo Handa wrote:
> Since /sys/module/printk/parameters/time can change from N to Y between
> "msg_print_text() called print_prefix() with buf == NULL" and
> "msg_print_text() again calls print_prefix() with buf != NULL", it is not
> safe for print_time() to unconditionally return 0 if printk_time == false.
> 
> But print_prefix() is called by only msg_print_text(), and print_prefix()
> is called once more for calculating output length of prefix, and there is
> no need to recalculate it when one log entry contains multiple lines.

Good catch.

> Since max output length for syslog marker and timestamp are known to be
> small enough, let's close this race by deduplicating print_prefix() calls.
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>

Or we can change printk_time under logbuf lock? So msg_print_text will
not race with it.

Or we can "copy" the value of printk_time and use it in print_prefix(),
and we will pick up an updated value next time we do msg_print_text()?

Example:

---

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 1b2a029360b7..4c57f5d5b0b6 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1211,13 +1211,14 @@ static inline void boot_delay_msec(int level)
 #endif
 
 static bool printk_time = IS_ENABLED(CONFIG_PRINTK_TIME);
+static bool printk_time_saved;
 module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR);
 
 static size_t print_time(u64 ts, char *buf)
 {
 	unsigned long rem_nsec;
 
-	if (!printk_time)
+	if (!printk_time_saved)
 		return 0;
 
 	rem_nsec = do_div(ts, 1000000000);
@@ -1258,6 +1259,8 @@ static size_t msg_print_text(const struct printk_log *msg, bool syslog, char *bu
 	size_t text_size = msg->text_len;
 	size_t len = 0;
 
+	printk_time_saved = printk_time;
+
 	do {
 		const char *next = memchr(text, '\n', text_size);
 		size_t text_len;
 

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] printk: deduplicate print_prefix() calls
  2018-11-26  7:56   ` Sergey Senozhatsky
@ 2018-11-26 10:21     ` Tetsuo Handa
  0 siblings, 0 replies; 18+ messages in thread
From: Tetsuo Handa @ 2018-11-26 10:21 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Sergey Senozhatsky, Steven Rostedt, linux-kernel

On 2018/11/26 16:56, Sergey Senozhatsky wrote:
> Or we can "copy" the value of printk_time and use it in print_prefix(),
> and we will pick up an updated value next time we do msg_print_text()?

Yes, that is also possible. But it would be better to pass as a function
argument to print_prefix() rather than adding a global variable, and omit
calling print_time() if the passed argument is false. But however, removing
print_prefix() reduces object size, for it eliminates redundant instructions.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] printk: don't unconditionally shortcut print_time()
  2018-11-24 10:50 [PATCH] printk: don't unconditionally shortcut print_time() Tetsuo Handa
  2018-11-24 14:28 ` [PATCH] printk: deduplicate print_prefix() calls Tetsuo Handa
@ 2018-11-29 14:19 ` Petr Mladek
  2018-11-30  2:26   ` Tetsuo Handa
  2018-11-30  4:24   ` Sergey Senozhatsky
  1 sibling, 2 replies; 18+ messages in thread
From: Petr Mladek @ 2018-11-29 14:19 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Sergey Senozhatsky, Steven Rostedt, linux-kernel

On Sat 2018-11-24 19:50:13, Tetsuo Handa wrote:
> Since /sys/module/printk/parameters/time can change from N to Y between
> "msg_print_text() called print_prefix() with buf == NULL" and
> "msg_print_text() again calls print_prefix() with buf != NULL", it is not
> safe for print_time() to unconditionally return 0 if printk_time == false.
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
>  kernel/printk/printk.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 5c2079d..df7ced0 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1230,7 +1230,7 @@ static size_t print_time(u64 ts, char *buf)
>  {
>  	unsigned long rem_nsec;
>  
> -	if (!printk_time)
> +	if (!printk_time && buf)

This fixes a real problem. It might even avoid a crash.

See syslog_print_all() and kmsg_dump_get_buffer(). They
start with calling msg_print_text() many times to calculate
how many messages fit into the given buffer. Then they
blindly copy the messages into the buffer.

Now, the buffer might overflow if the size is calculated
with printk_time disabled and the messages are copied with
printk_time enabled.


Well, the time stamps are 22% of the printed characters on
my test systems:

$> dmesg | wc -l
524
$> dmesg | wc -c
33013
>$ bc -l
524 * 14 / 33013
.22221549086723411989

It seems worth supporting the disabled printk_time() properly.
I am not sure who is disabling the timestamps. But people might
worry if we waste 22% of the given buffer.

IMHO, passing it as parameter to msg_print_text(), print_prefix(),
and print_time() looks acceptable.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] printk: deduplicate print_prefix() calls
  2018-11-24 14:28 ` [PATCH] printk: deduplicate print_prefix() calls Tetsuo Handa
  2018-11-26  7:56   ` Sergey Senozhatsky
@ 2018-11-29 14:35   ` Petr Mladek
  1 sibling, 0 replies; 18+ messages in thread
From: Petr Mladek @ 2018-11-29 14:35 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Sergey Senozhatsky, Steven Rostedt, linux-kernel

On Sat 2018-11-24 23:28:36, Tetsuo Handa wrote:
> Since /sys/module/printk/parameters/time can change from N to Y between
> "msg_print_text() called print_prefix() with buf == NULL" and
> "msg_print_text() again calls print_prefix() with buf != NULL", it is not
> safe for print_time() to unconditionally return 0 if printk_time == false.
> 
> But print_prefix() is called by only msg_print_text(), and print_prefix()
> is called once more for calculating output length of prefix, and there is
> no need to recalculate it when one log entry contains multiple lines.
> 
> Since max output length for syslog marker and timestamp are known to be
> small enough, let's close this race by deduplicating print_prefix() calls.
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
>  kernel/printk/printk.c | 64 +++++++++++++++-----------------------------------
>  1 file changed, 19 insertions(+), 45 deletions(-)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 1b2a029..eac4c80 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1213,50 +1213,26 @@ static inline void boot_delay_msec(int level)
>  static bool printk_time = IS_ENABLED(CONFIG_PRINTK_TIME);
>  module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR);
>  
> -static size_t print_time(u64 ts, char *buf)
> -{
> -	unsigned long rem_nsec;
> -
> -	if (!printk_time)
> -		return 0;
> -
> -	rem_nsec = do_div(ts, 1000000000);
> -
> -	if (!buf)
> -		return snprintf(NULL, 0, "[%5lu.000000] ", (unsigned long)ts);
> -
> -	return sprintf(buf, "[%5lu.%06lu] ",
> -		       (unsigned long)ts, rem_nsec / 1000);
> -}
> -
> -static size_t print_prefix(const struct printk_log *msg, bool syslog, char *buf)
> -{
> -	size_t len = 0;
> -	unsigned int prefix = (msg->facility << 3) | msg->level;
> -
> -	if (syslog) {
> -		if (buf) {
> -			len += sprintf(buf, "<%u>", prefix);
> -		} else {
> -			len += 3;
> -			if (prefix > 999)
> -				len += 3;
> -			else if (prefix > 99)
> -				len += 2;
> -			else if (prefix > 9)
> -				len++;
> -		}
> -	}
> -
> -	len += print_time(msg->ts_nsec, buf ? buf + len : NULL);
> -	return len;
> -}
> -
>  static size_t msg_print_text(const struct printk_log *msg, bool syslog, char *buf, size_t size)
>  {
>  	const char *text = log_text(msg);
>  	size_t text_size = msg->text_len;
>  	size_t len = 0;
> +	char prefix[32]; /* "<2047>[18446744073.709551] " */
> +	size_t prefix_len = 0;
> +
> +	if (syslog)
> +		prefix_len += snprintf(prefix, sizeof(prefix), "<%u>",
> +				       (msg->facility << 3) | msg->level);
> +	if (printk_time) {

This does not solve the real problem (buffer overflow) in
syslog_print_all() and kmsg_dump_get_buffer().

> +		u64 ts = msg->ts_nsec;
> +		unsigned long rem_nsec = do_div(ts, 1000000000);
> +
> +		prefix_len += snprintf(prefix + prefix_len,
> +				       sizeof(prefix) - prefix_len,
> +				       "[%5lu.%06lu] ", (unsigned long)ts,
> +				       rem_nsec / 1000);
> +	}

I am not against removing some optimizations when buf == NULL.
It is a crazy code and the win is questionable. But the helper
functions still might make sense. msg_print_text() is complicated
enough as is.

Finally, I have mixed feelings about the memcpy() of the prefix.
On one hand, it makes some conditions a bit more readable.
On the other hand, it looks strange to copy it when most
messages have only one line and one prefix. It does not
look like a win in the end.

Let's fix the buffer overflow first and do any code clean up
later in a separate patch.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] printk: don't unconditionally shortcut print_time()
  2018-11-29 14:19 ` [PATCH] printk: don't unconditionally shortcut print_time() Petr Mladek
@ 2018-11-30  2:26   ` Tetsuo Handa
  2018-11-30 12:30     ` Petr Mladek
  2018-11-30  4:24   ` Sergey Senozhatsky
  1 sibling, 1 reply; 18+ messages in thread
From: Tetsuo Handa @ 2018-11-30  2:26 UTC (permalink / raw)
  To: Petr Mladek; +Cc: Sergey Senozhatsky, Steven Rostedt, linux-kernel

Petr Mladek wrote:
> See syslog_print_all() and kmsg_dump_get_buffer(). They
> start with calling msg_print_text() many times to calculate
> how many messages fit into the given buffer. Then they
> blindly copy the messages into the buffer.
> 
> Now, the buffer might overflow if the size is calculated
> with printk_time disabled and the messages are copied with
> printk_time enabled.
> 

Indeed, SYSLOG_ACTION_READ_ALL case can write more bytes than userspace
has supplied, resulting in possible memory corruption in userpsace. And
obtaining a snapshot of printk_time under logbuf_lock won't help, for we
need to drop logbuf_lock before calling copy_to_user().

----------
@@ -1365,6 +1365,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	if (!text)
 		return -ENOMEM;
 
+	printk_time = false;
 	logbuf_lock_irq();
 	/*
 	 * Find first record that fits, including all following records,
@@ -1409,6 +1410,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 		seq++;
 
 		logbuf_unlock_irq();
+		printk_time = true;
 		if (copy_to_user(buf + len, text, textlen))
 			len = -EFAULT;
 		else
----------

----------
#include <sys/klog.h>
#define SYSLOG_ACTION_READ_ALL 3

int main(int argc, char *argv[])
{
        char buffer[4096];
        int i;
        for (i = 0; i < sizeof(buffer); i++)
                if (klogctl(SYSLOG_ACTION_READ_ALL, buffer, i) > i)
                        return 1;
        return 0;
}
----------

----------
syslog(SYSLOG_ACTION_READ_ALL, "", 0)   = 0
(...snipped...)
syslog(SYSLOG_ACTION_READ_ALL, "", 59)  = 0
syslog(SYSLOG_ACTION_READ_ALL, "<6>IPv6: ADDRCONF(NETDEV_CHANGE): ens32: link becomes ready\n", 60) = 60
(...snipped...)
syslog(SYSLOG_ACTION_READ_ALL, "<6>IPv6: ADDRCONF(NETDEV_CHANGE): ens32: link becomes ready\n", 114) = 60
syslog(SYSLOG_ACTION_READ_ALL, "<6>IPv6: ADDRCONF(NETDEV_UP): ens32: link is not ready\n<6>[   10.517132] IPv6: ADDRCONF(NETDEV_CHANGE): ens32: link becomes ready\n", 115) = 130
----------

> IMHO, passing it as parameter to msg_print_text(), print_prefix(),
> and print_time() looks acceptable.

Below is a patch for passing a snapshot of printk_time to msg_print_text(),
but we are still failing to close a race explained at bottom.

 kernel/printk/printk.c | 59 ++++++++++++++++++++++++--------------------------
 1 file changed, 28 insertions(+), 31 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 1b2a029..3838a6b 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1215,12 +1215,7 @@ static inline void boot_delay_msec(int level)
 
 static size_t print_time(u64 ts, char *buf)
 {
-	unsigned long rem_nsec;
-
-	if (!printk_time)
-		return 0;
-
-	rem_nsec = do_div(ts, 1000000000);
+	unsigned long rem_nsec = do_div(ts, 1000000000);
 
 	if (!buf)
 		return snprintf(NULL, 0, "[%5lu.000000] ", (unsigned long)ts);
@@ -1229,7 +1224,8 @@ static size_t print_time(u64 ts, char *buf)
 		       (unsigned long)ts, rem_nsec / 1000);
 }
 
-static size_t print_prefix(const struct printk_log *msg, bool syslog, char *buf)
+static size_t print_prefix(const struct printk_log *msg, bool syslog,
+			   bool time, char *buf)
 {
 	size_t len = 0;
 	unsigned int prefix = (msg->facility << 3) | msg->level;
@@ -1248,11 +1244,13 @@ static size_t print_prefix(const struct printk_log *msg, bool syslog, char *buf)
 		}
 	}
 
-	len += print_time(msg->ts_nsec, buf ? buf + len : NULL);
+	if (time)
+		len += print_time(msg->ts_nsec, buf ? buf + len : NULL);
 	return len;
 }
 
-static size_t msg_print_text(const struct printk_log *msg, bool syslog, char *buf, size_t size)
+static size_t msg_print_text(const struct printk_log *msg, bool syslog,
+			     bool time, char *buf, size_t size)
 {
 	const char *text = log_text(msg);
 	size_t text_size = msg->text_len;
@@ -1271,17 +1269,17 @@ static size_t msg_print_text(const struct printk_log *msg, bool syslog, char *bu
 		}
 
 		if (buf) {
-			if (print_prefix(msg, syslog, NULL) +
+			if (print_prefix(msg, syslog, time, NULL) +
 			    text_len + 1 >= size - len)
 				break;
 
-			len += print_prefix(msg, syslog, buf + len);
+			len += print_prefix(msg, syslog, time, buf + len);
 			memcpy(buf + len, text, text_len);
 			len += text_len;
 			buf[len++] = '\n';
 		} else {
 			/* SYSLOG_ACTION_* buffer size only calculation */
-			len += print_prefix(msg, syslog, NULL);
+			len += print_prefix(msg, syslog, time, NULL);
 			len += text_len;
 			len++;
 		}
@@ -1320,7 +1318,8 @@ static int syslog_print(char __user *buf, int size)
 
 		skip = syslog_partial;
 		msg = log_from_idx(syslog_idx);
-		n = msg_print_text(msg, true, text, LOG_LINE_MAX + PREFIX_MAX);
+		n = msg_print_text(msg, true, printk_time, text,
+				   LOG_LINE_MAX + PREFIX_MAX);
 		if (n - syslog_partial <= size) {
 			/* message fits into buffer, move forward */
 			syslog_idx = log_next(syslog_idx);
@@ -1360,11 +1359,13 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	u64 next_seq;
 	u64 seq;
 	u32 idx;
+	bool time;
 
 	text = kmalloc(LOG_LINE_MAX + PREFIX_MAX, GFP_KERNEL);
 	if (!text)
 		return -ENOMEM;
 
+	time = printk_time;
 	logbuf_lock_irq();
 	/*
 	 * Find first record that fits, including all following records,
@@ -1375,7 +1376,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	while (seq < log_next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		len += msg_print_text(msg, true, NULL, 0);
+		len += msg_print_text(msg, true, time, NULL, 0);
 		idx = log_next(idx);
 		seq++;
 	}
@@ -1386,7 +1387,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	while (len > size && seq < log_next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		len -= msg_print_text(msg, true, NULL, 0);
+		len -= msg_print_text(msg, true, time, NULL, 0);
 		idx = log_next(idx);
 		seq++;
 	}
@@ -1397,14 +1398,9 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	len = 0;
 	while (len >= 0 && seq < next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
-		int textlen;
+		int textlen = msg_print_text(msg, true, time, text,
+					     LOG_LINE_MAX + PREFIX_MAX);
 
-		textlen = msg_print_text(msg, true, text,
-					 LOG_LINE_MAX + PREFIX_MAX);
-		if (textlen < 0) {
-			len = textlen;
-			break;
-		}
 		idx = log_next(idx);
 		seq++;
 
@@ -1532,7 +1528,8 @@ int do_syslog(int type, char __user *buf, int len, int source)
 			while (seq < log_next_seq) {
 				struct printk_log *msg = log_from_idx(idx);
 
-				error += msg_print_text(msg, true, NULL, 0);
+				error += msg_print_text(msg, true, printk_time,
+							NULL, 0);
 				idx = log_next(idx);
 				seq++;
 			}
@@ -2028,8 +2025,8 @@ static void console_lock_spinning_enable(void) { }
 static int console_lock_spinning_disable_and_check(void) { return 0; }
 static void call_console_drivers(const char *ext_text, size_t ext_len,
 				 const char *text, size_t len) {}
-static size_t msg_print_text(const struct printk_log *msg,
-			     bool syslog, char *buf, size_t size) { return 0; }
+static size_t msg_print_text(const struct printk_log *msg, bool syslog,
+			     bool time, char *buf, size_t size) { return 0; }
 static bool suppress_message_printing(int level) { return false; }
 
 #endif /* CONFIG_PRINTK */
@@ -2387,8 +2384,7 @@ void console_unlock(void)
 
 		len += msg_print_text(msg,
 				console_msg_format & MSG_FORMAT_SYSLOG,
-				text + len,
-				sizeof(text) - len);
+				printk_time, text + len, sizeof(text) - len);
 		if (nr_ext_console_drivers) {
 			ext_len = msg_print_ext_header(ext_text,
 						sizeof(ext_text),
@@ -3112,7 +3108,7 @@ bool kmsg_dump_get_line_nolock(struct kmsg_dumper *dumper, bool syslog,
 		goto out;
 
 	msg = log_from_idx(dumper->cur_idx);
-	l = msg_print_text(msg, syslog, line, size);
+	l = msg_print_text(msg, syslog, printk_time, line, size);
 
 	dumper->cur_idx = log_next(dumper->cur_idx);
 	dumper->cur_seq++;
@@ -3183,6 +3179,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 	u32 next_idx;
 	size_t l = 0;
 	bool ret = false;
+	bool time = printk_time;
 
 	if (!dumper->active)
 		goto out;
@@ -3206,7 +3203,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 	while (seq < dumper->next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		l += msg_print_text(msg, true, NULL, 0);
+		l += msg_print_text(msg, true, time, NULL, 0);
 		idx = log_next(idx);
 		seq++;
 	}
@@ -3217,7 +3214,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 	while (l > size && seq < dumper->next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		l -= msg_print_text(msg, true, NULL, 0);
+		l -= msg_print_text(msg, true, time, NULL, 0);
 		idx = log_next(idx);
 		seq++;
 	}
@@ -3230,7 +3227,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 	while (seq < dumper->next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		l += msg_print_text(msg, syslog, buf + l, size - l);
+		l += msg_print_text(msg, syslog, time, buf + l, size - l);
 		idx = log_next(idx);
 		seq++;
 	}
-- 
1.8.3.1

The remaining race is that klogctl(SYSLOG_ACTION_SIZE_UNREAD) after klogctl(SYSLOG_ACTION_READ)
can return negative value upon success, for SYSLOG_ACTION_SIZE_UNREAD depends that printk_time
did not change (from true to false) after SYSLOG_ACTION_READ updated syslog_partial. To close
this race, we need to make sure that printk_time is evaluated only once for each record.
That is, evaluate printk_time upon e.g. log_store() rather than upon msg_print_text().

 kernel/printk/printk.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 1b2a029..c692fbc 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -421,6 +421,8 @@ __packed __aligned(4)
 static u64 clear_seq;
 static u32 clear_idx;
 
+static bool printk_time = IS_ENABLED(CONFIG_PRINTK_TIME);
+
 #define PREFIX_MAX		32
 #define LOG_LINE_MAX		(1024 - PREFIX_MAX)
 
@@ -620,7 +622,9 @@ static int log_store(int facility, int level,
 	msg->facility = facility;
 	msg->level = level & 7;
 	msg->flags = flags & 0x1f;
-	if (ts_nsec > 0)
+	if (!printk_time)
+		msg->ts_nsec = (u64) -1ULL;
+	else if (ts_nsec > 0)
 		msg->ts_nsec = ts_nsec;
 	else
 		msg->ts_nsec = local_clock();
@@ -1210,14 +1214,13 @@ static inline void boot_delay_msec(int level)
 }
 #endif
 
-static bool printk_time = IS_ENABLED(CONFIG_PRINTK_TIME);
 module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR);
 
 static size_t print_time(u64 ts, char *buf)
 {
 	unsigned long rem_nsec;
 
-	if (!printk_time)
+	if (ts == (u64) -1ULL)
 		return 0;
 
 	rem_nsec = do_div(ts, 1000000000);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] printk: don't unconditionally shortcut print_time()
  2018-11-29 14:19 ` [PATCH] printk: don't unconditionally shortcut print_time() Petr Mladek
  2018-11-30  2:26   ` Tetsuo Handa
@ 2018-11-30  4:24   ` Sergey Senozhatsky
  1 sibling, 0 replies; 18+ messages in thread
From: Sergey Senozhatsky @ 2018-11-30  4:24 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Tetsuo Handa, Sergey Senozhatsky, Steven Rostedt, linux-kernel

On (11/29/18 15:19), Petr Mladek wrote:
> This fixes a real problem. It might even avoid a crash.
> 
> See syslog_print_all() and kmsg_dump_get_buffer(). They
> start with calling msg_print_text() many times to calculate
> how many messages fit into the given buffer. Then they
> blindly copy the messages into the buffer.

Hmm. Interesting find.

So can we just use a pessimistic approach then?

If IS_ENABLED(CONFIG_PRINTK_TIME) then always calculate
buffer size as if printk_time was true; but use the actual
printk_time when we copy out messages. Yes, the buffer can
be larger than needed, but at least we don't care about any
races anymore. And we also don't need to pass printk_time
snapshot while we copy out messages.

I really don't know who and why would enable CONFIG_PRINTK_TIME
but then disable printk_time via param. So most of the time that
pessimistic approach should calculate the right buffer size.
Probably.

[..]
> It seems worth supporting the disabled printk_time() properly.
> I am not sure who is disabling the timestamps.

Yeah, I'm curious as well. Can't see why would anyone switch
it back and forth. Kconfig and boot param work just fine.

	-ss

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] printk: don't unconditionally shortcut print_time()
  2018-11-30  2:26   ` Tetsuo Handa
@ 2018-11-30 12:30     ` Petr Mladek
  2018-11-30 14:33       ` Tetsuo Handa
  0 siblings, 1 reply; 18+ messages in thread
From: Petr Mladek @ 2018-11-30 12:30 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Sergey Senozhatsky, Steven Rostedt, linux-kernel

On Fri 2018-11-30 11:26:02, Tetsuo Handa wrote:
> Petr Mladek wrote:
> > See syslog_print_all() and kmsg_dump_get_buffer(). They
> > start with calling msg_print_text() many times to calculate
> > how many messages fit into the given buffer. Then they
> > blindly copy the messages into the buffer.
> > 
> > Now, the buffer might overflow if the size is calculated
> > with printk_time disabled and the messages are copied with
> > printk_time enabled.
> > 
> 
> Indeed, SYSLOG_ACTION_READ_ALL case can write more bytes than userspace
> has supplied, resulting in possible memory corruption in userpsace. And
> obtaining a snapshot of printk_time under logbuf_lock won't help, for we
> need to drop logbuf_lock before calling copy_to_user().
> 
> > IMHO, passing it as parameter to msg_print_text(), print_prefix(),
> > and print_time() looks acceptable.
> 
> Below is a patch for passing a snapshot of printk_time to msg_print_text(),
> but we are still failing to close a race explained at bottom.

The patch looks fine to me.
 
> The remaining race is that klogctl(SYSLOG_ACTION_SIZE_UNREAD) after klogctl(SYSLOG_ACTION_READ)
> can return negative value upon success, for SYSLOG_ACTION_SIZE_UNREAD depends that printk_time
> did not change (from true to false) after SYSLOG_ACTION_READ updated syslog_partial. To close
> this race, we need to make sure that printk_time is evaluated only once for each record.
> That is, evaluate printk_time upon e.g. log_store() rather than upon msg_print_text().

Great catch!

>  kernel/printk/printk.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 1b2a029..c692fbc 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -421,6 +421,8 @@ __packed __aligned(4)
>  static u64 clear_seq;
>  static u32 clear_idx;
>  
> +static bool printk_time = IS_ENABLED(CONFIG_PRINTK_TIME);
> +
>  #define PREFIX_MAX		32
>  #define LOG_LINE_MAX		(1024 - PREFIX_MAX)
>  
> @@ -620,7 +622,9 @@ static int log_store(int facility, int level,
>  	msg->facility = facility;
>  	msg->level = level & 7;
>  	msg->flags = flags & 0x1f;
> -	if (ts_nsec > 0)
> +	if (!printk_time)
> +		msg->ts_nsec = (u64) -1ULL;
> +	else if (ts_nsec > 0)
>  		msg->ts_nsec = ts_nsec;
>  	else
>  		msg->ts_nsec = local_clock();
> @@ -1210,14 +1214,13 @@ static inline void boot_delay_msec(int level)
>  }
>  #endif
>  
> -static bool printk_time = IS_ENABLED(CONFIG_PRINTK_TIME);
>  module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR);
>  
>  static size_t print_time(u64 ts, char *buf)
>  {
>  	unsigned long rem_nsec;
>  
> -	if (!printk_time)
> +	if (ts == (u64) -1ULL)

Sigh, we need to always store the timestamp. /dev/kmsg prints it
timestamps even when printk.time parameter is disabled.

Instead, we could use one bit in log_flags instead.

But I feel a bit uneasy that we change the semantic and could break
something. For example, SYSLOG_ACTION_READ_ALL callers might want
to have a consistent output (dmesg -Sr). Also we would need to sync
early boot messages with printk.time kernel parameter.


I am not fully happy with the solution passing time parameter.
It is less secure. But it would not break compatibility.
This particular race might be solved the following way:


diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 1b2a029360b7..5b489988e9b7 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1294,6 +1294,7 @@ static size_t msg_print_text(const struct printk_log *msg, bool syslog, char *bu
 
 static int syslog_print(char __user *buf, int size)
 {
+	static bool time;
 	char *text;
 	struct printk_log *msg;
 	int len = 0;
@@ -1318,9 +1319,13 @@ static int syslog_print(char __user *buf, int size)
 			break;
 		}
 
+		/* Keep partial line consistent */
+		if (!syslog_partial)
+			time = printk_time;
+
 		skip = syslog_partial;
 		msg = log_from_idx(syslog_idx);
-		n = msg_print_text(msg, true, text, LOG_LINE_MAX + PREFIX_MAX);
+		n = msg_print_text(msg, true, time, text, LOG_LINE_MAX + PREFIX_MAX);
 		if (n - syslog_partial <= size) {
 			/* message fits into buffer, move forward */
 			syslog_idx = log_next(syslog_idx);


Best Regards,
Petr

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] printk: don't unconditionally shortcut print_time()
  2018-11-30 12:30     ` Petr Mladek
@ 2018-11-30 14:33       ` Tetsuo Handa
  2018-12-01 11:30         ` Tetsuo Handa
  2018-12-03 14:14         ` Petr Mladek
  0 siblings, 2 replies; 18+ messages in thread
From: Tetsuo Handa @ 2018-11-30 14:33 UTC (permalink / raw)
  To: Petr Mladek; +Cc: Sergey Senozhatsky, Steven Rostedt, linux-kernel

On 2018/11/30 21:30, Petr Mladek wrote:
> I am not fully happy with the solution passing time parameter.
> It is less secure. But it would not break compatibility.
> This particular race might be solved the following way:
> 
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 1b2a029360b7..5b489988e9b7 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1294,6 +1294,7 @@ static size_t msg_print_text(const struct printk_log *msg, bool syslog, char *bu
>  
>  static int syslog_print(char __user *buf, int size)
>  {
> +	static bool time;
>  	char *text;
>  	struct printk_log *msg;
>  	int len = 0;
> @@ -1318,9 +1319,13 @@ static int syslog_print(char __user *buf, int size)
>  			break;
>  		}
>  
> +		/* Keep partial line consistent */
> +		if (!syslog_partial)
> +			time = printk_time;
> +
>  		skip = syslog_partial;
>  		msg = log_from_idx(syslog_idx);
> -		n = msg_print_text(msg, true, text, LOG_LINE_MAX + PREFIX_MAX);
> +		n = msg_print_text(msg, true, time, text, LOG_LINE_MAX + PREFIX_MAX);
>  		if (n - syslog_partial <= size) {

OK. This should avoid making "n < syslog_partial" true.

Running below test code on current linux.git reads garbage bytes

----------------------------------------
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/klog.h>
#define SYSLOG_ACTION_READ 2
#define SYSLOG_ACTION_SIZE_UNREAD 9
int main(int argc, char *argv[])
{
	static char buffer[1024];
	int fd1 = open("/sys/module/printk/parameters/time", O_WRONLY);
	int fd2 = open("/proc/sysrq-trigger", O_WRONLY);
	int size;
	write(fd1, "1\n", 2);
	write(fd2, "h\n", 2);
	while ((size = klogctl(SYSLOG_ACTION_SIZE_UNREAD, NULL, 0)) > 2)
		klogctl(SYSLOG_ACTION_READ, buffer, (size > sizeof(buffer) ? sizeof(buffer) : size) - 1);
	write(fd1, "0\n", 2);
	klogctl(SYSLOG_ACTION_READ, buffer, sizeof(buffer));
	return 0;
}
----------------------------------------

and (I can't find the reproducer but) sometimes crashes like

[  451.988242] BUG: pagefault on kernel address 0xffff8881337fa000 in non-whitelisted uaccess
[  451.990465] BUG: unable to handle kernel paging request at ffff8881337fa000
[  451.992254] PGD 3602067 P4D 3602067 PUD 3605067 PMD 13ff65067 PTE 800ffffecc805060
[  451.994066] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[  451.995284] CPU: 0 PID: 8147 Comm: a.out Kdump: loaded Not tainted 4.20.0-rc4+ #537
[  451.997268] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[  451.999857] RIP: 0010:copy_user_generic_unrolled+0x54/0xc0
[  452.001312] Code: 83 e2 3f c1 e9 06 74 4a 4c 8b 06 4c 8b 4e 08 4c 8b 56 10 4c 8b 5e 18 4c 89 07 4c 89 4f 08 4c 89 57 10 4c 89 5f 18 4c 8b 46 20 <4c> 8b 4e 28 4c 8b 56 30 4c 8b 5e 38 4c 89 47 20 4c 89 4f 28 4c 89
[  452.006119] RSP: 0018:ffffc90002573e90 EFLAGS: 00010202
[  452.007230] RAX: 0000000000701060 RBX: 0000000000100000 RCX: 0000000000002f02
[  452.008792] RDX: 0000000000000000 RSI: ffff8881337f9fd4 RDI: 0000000000644fe0
[  452.010399] RBP: 0000000000601060 R08: 8b494e7401000000 R09: 00000b88858b4900
[  452.011916] R10: 0f02000000c980f6 R11: b4bb830000025d85 R12: ffff8881337b6054
[  452.013351] R13: 0000000000601060 R14: ffff8881337b5fa8 R15: 00000000000000ac
[  452.014807] FS:  00007fb3c9eec740(0000) GS:ffff88813b200000(0000) knlGS:0000000000000000
[  452.016931] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  452.018234] CR2: ffff8881337fa000 CR3: 0000000139153002 CR4: 00000000001606f0
[  452.020038] Call Trace:
[  452.020708]  _copy_to_user+0x56/0x70
[  452.021580]  do_syslog+0x47d/0x7b0
[  452.022359]  ? syscall_trace_enter+0x1c3/0x360
[  452.023395]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[  452.024438]  ? lockdep_hardirqs_off+0x73/0xe0
[  452.025433]  __x64_sys_syslog+0x13/0x20
[  452.026363]  do_syscall_64+0x4f/0x1f0
[  452.027235]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

>  			/* message fits into buffer, move forward */
>  			syslog_idx = log_next(syslog_idx);

But we need to make this "static bool time;" accessible from SYSLOG_ACTION_SIZE_UNREAD case
because syslog_partial subtraction assumes that msg_print_text() used printk_time value
as of syslog_print() updated syslog_partial.

	while (seq < log_next_seq) {
		struct printk_log *msg = log_from_idx(idx);

		error += msg_print_text(msg, true, NULL, 0);
		idx = log_next(idx);
		seq++;
	}
	error -= syslog_partial;

So, we would use a global variable like below?

/*
 * To keep partial line consistent, pass this value to msg_print_text()
 * when accessing syslog_partial.
 */
static bool syslog_partial_printk_time;


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] printk: don't unconditionally shortcut print_time()
  2018-11-30 14:33       ` Tetsuo Handa
@ 2018-12-01 11:30         ` Tetsuo Handa
  2018-12-02  5:02           ` Tetsuo Handa
  2018-12-03 14:14         ` Petr Mladek
  1 sibling, 1 reply; 18+ messages in thread
From: Tetsuo Handa @ 2018-12-01 11:30 UTC (permalink / raw)
  To: Petr Mladek, Sergey Senozhatsky, Steven Rostedt; +Cc: linux-kernel

From bc9ea8e7878a618fd4490ec287eb691fee144acb Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Sat, 1 Dec 2018 19:37:42 +0900
Subject: [PATCH] printk: fix printk_time race.

Since printk_time can be toggled via /sys/module/printk/parameters/time ,
it is not safe to assume that output length does not change across
multiple msg_print_text() calls. If we hit this race, we can observe
failures such as SYSLOG_ACTION_READ_ALL writes more bytes than userspace
has supplied, SYSLOG_ACTION_SIZE_UNREAD returns -EFAULT when succeeded,
SYSLOG_ACTION_READ reads garbage memory or even triggers an kernel oops
at _copy_to_user() due to integer overflow.

To close this race, get a snapshot value of printk_time and pass it to
SYSLOG_ACTION_READ, SYSLOG_ACTION_READ_ALL, SYSLOG_ACTION_SIZE_UNREAD and
kmsg_dump_get_buffer().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk.c | 68 +++++++++++++++++++++++++++-----------------------
 1 file changed, 37 insertions(+), 31 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 5c2079d..64ed21d 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -403,6 +403,7 @@ __packed __aligned(4)
 static u64 syslog_seq;
 static u32 syslog_idx;
 static size_t syslog_partial;
+static bool syslog_time;
 
 /* index and sequence number of the first record stored in the buffer */
 static u64 log_first_seq;
@@ -1228,12 +1229,7 @@ static inline void boot_delay_msec(int level)
 
 static size_t print_time(u64 ts, char *buf)
 {
-	unsigned long rem_nsec;
-
-	if (!printk_time)
-		return 0;
-
-	rem_nsec = do_div(ts, 1000000000);
+	unsigned long rem_nsec = do_div(ts, 1000000000);
 
 	if (!buf)
 		return snprintf(NULL, 0, "[%5lu.000000] ", (unsigned long)ts);
@@ -1242,7 +1238,8 @@ static size_t print_time(u64 ts, char *buf)
 		       (unsigned long)ts, rem_nsec / 1000);
 }
 
-static size_t print_prefix(const struct printk_log *msg, bool syslog, char *buf)
+static size_t print_prefix(const struct printk_log *msg, bool syslog,
+			   bool time, char *buf)
 {
 	size_t len = 0;
 	unsigned int prefix = (msg->facility << 3) | msg->level;
@@ -1261,11 +1258,13 @@ static size_t print_prefix(const struct printk_log *msg, bool syslog, char *buf)
 		}
 	}
 
-	len += print_time(msg->ts_nsec, buf ? buf + len : NULL);
+	if (time)
+		len += print_time(msg->ts_nsec, buf ? buf + len : NULL);
 	return len;
 }
 
-static size_t msg_print_text(const struct printk_log *msg, bool syslog, char *buf, size_t size)
+static size_t msg_print_text(const struct printk_log *msg, bool syslog,
+			     bool time, char *buf, size_t size)
 {
 	const char *text = log_text(msg);
 	size_t text_size = msg->text_len;
@@ -1284,17 +1283,17 @@ static size_t msg_print_text(const struct printk_log *msg, bool syslog, char *bu
 		}
 
 		if (buf) {
-			if (print_prefix(msg, syslog, NULL) +
+			if (print_prefix(msg, syslog, time, NULL) +
 			    text_len + 1 >= size - len)
 				break;
 
-			len += print_prefix(msg, syslog, buf + len);
+			len += print_prefix(msg, syslog, time, buf + len);
 			memcpy(buf + len, text, text_len);
 			len += text_len;
 			buf[len++] = '\n';
 		} else {
 			/* SYSLOG_ACTION_* buffer size only calculation */
-			len += print_prefix(msg, syslog, NULL);
+			len += print_prefix(msg, syslog, time, NULL);
 			len += text_len;
 			len++;
 		}
@@ -1331,9 +1330,17 @@ static int syslog_print(char __user *buf, int size)
 			break;
 		}
 
+		/*
+		 * To keep reading/counting partial line consistent,
+		 * use printk_time value as of the biginning of a line.
+		 */
+		if (!syslog_partial)
+			syslog_time = printk_time;
+
 		skip = syslog_partial;
 		msg = log_from_idx(syslog_idx);
-		n = msg_print_text(msg, true, text, LOG_LINE_MAX + PREFIX_MAX);
+		n = msg_print_text(msg, true, syslog_time, text,
+				   LOG_LINE_MAX + PREFIX_MAX);
 		if (n - syslog_partial <= size) {
 			/* message fits into buffer, move forward */
 			syslog_idx = log_next(syslog_idx);
@@ -1373,11 +1380,13 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	u64 next_seq;
 	u64 seq;
 	u32 idx;
+	bool time;
 
 	text = kmalloc(LOG_LINE_MAX + PREFIX_MAX, GFP_KERNEL);
 	if (!text)
 		return -ENOMEM;
 
+	time = printk_time;
 	logbuf_lock_irq();
 	/*
 	 * Find first record that fits, including all following records,
@@ -1388,7 +1397,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	while (seq < log_next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		len += msg_print_text(msg, true, NULL, 0);
+		len += msg_print_text(msg, true, time, NULL, 0);
 		idx = log_next(idx);
 		seq++;
 	}
@@ -1399,7 +1408,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	while (len > size && seq < log_next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		len -= msg_print_text(msg, true, NULL, 0);
+		len -= msg_print_text(msg, true, time, NULL, 0);
 		idx = log_next(idx);
 		seq++;
 	}
@@ -1410,14 +1419,9 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	len = 0;
 	while (len >= 0 && seq < next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
-		int textlen;
+		int textlen = msg_print_text(msg, true, time, text,
+					     LOG_LINE_MAX + PREFIX_MAX);
 
-		textlen = msg_print_text(msg, true, text,
-					 LOG_LINE_MAX + PREFIX_MAX);
-		if (textlen < 0) {
-			len = textlen;
-			break;
-		}
 		idx = log_next(idx);
 		seq++;
 
@@ -1541,11 +1545,13 @@ int do_syslog(int type, char __user *buf, int len, int source)
 		} else {
 			u64 seq = syslog_seq;
 			u32 idx = syslog_idx;
+			bool f = syslog_partial ? syslog_time : printk_time;
 
 			while (seq < log_next_seq) {
 				struct printk_log *msg = log_from_idx(idx);
 
-				error += msg_print_text(msg, true, NULL, 0);
+				error += msg_print_text(msg, true, f, NULL, 0);
+				f = printk_time;
 				idx = log_next(idx);
 				seq++;
 			}
@@ -2026,8 +2032,8 @@ static void console_lock_spinning_enable(void) { }
 static int console_lock_spinning_disable_and_check(void) { return 0; }
 static void call_console_drivers(const char *ext_text, size_t ext_len,
 				 const char *text, size_t len) {}
-static size_t msg_print_text(const struct printk_log *msg,
-			     bool syslog, char *buf, size_t size) { return 0; }
+static size_t msg_print_text(const struct printk_log *msg, bool syslog,
+			     bool time, char *buf, size_t size) { return 0; }
 static bool suppress_message_printing(int level) { return false; }
 
 #endif /* CONFIG_PRINTK */
@@ -2385,8 +2391,7 @@ void console_unlock(void)
 
 		len += msg_print_text(msg,
 				console_msg_format & MSG_FORMAT_SYSLOG,
-				text + len,
-				sizeof(text) - len);
+				printk_time, text + len, sizeof(text) - len);
 		if (nr_ext_console_drivers) {
 			ext_len = msg_print_ext_header(ext_text,
 						sizeof(ext_text),
@@ -3110,7 +3115,7 @@ bool kmsg_dump_get_line_nolock(struct kmsg_dumper *dumper, bool syslog,
 		goto out;
 
 	msg = log_from_idx(dumper->cur_idx);
-	l = msg_print_text(msg, syslog, line, size);
+	l = msg_print_text(msg, syslog, printk_time, line, size);
 
 	dumper->cur_idx = log_next(dumper->cur_idx);
 	dumper->cur_seq++;
@@ -3181,6 +3186,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 	u32 next_idx;
 	size_t l = 0;
 	bool ret = false;
+	bool time = printk_time;
 
 	if (!dumper->active)
 		goto out;
@@ -3204,7 +3210,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 	while (seq < dumper->next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		l += msg_print_text(msg, true, NULL, 0);
+		l += msg_print_text(msg, true, time, NULL, 0);
 		idx = log_next(idx);
 		seq++;
 	}
@@ -3215,7 +3221,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 	while (l > size && seq < dumper->next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		l -= msg_print_text(msg, true, NULL, 0);
+		l -= msg_print_text(msg, true, time, NULL, 0);
 		idx = log_next(idx);
 		seq++;
 	}
@@ -3228,7 +3234,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 	while (seq < dumper->next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		l += msg_print_text(msg, syslog, buf + l, size - l);
+		l += msg_print_text(msg, syslog, time, buf + l, size - l);
 		idx = log_next(idx);
 		seq++;
 	}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] printk: don't unconditionally shortcut print_time()
  2018-12-01 11:30         ` Tetsuo Handa
@ 2018-12-02  5:02           ` Tetsuo Handa
  2018-12-03 13:16             ` Petr Mladek
  2018-12-04  2:56             ` Sergey Senozhatsky
  0 siblings, 2 replies; 18+ messages in thread
From: Tetsuo Handa @ 2018-12-02  5:02 UTC (permalink / raw)
  To: Petr Mladek, Sergey Senozhatsky, Steven Rostedt; +Cc: linux-kernel

On 2018/12/02 8:49, kbuild test robot wrote:
>> kernel/printk/printk.c:2396:5: error: 'printk_time' undeclared (first use in this function)
        ^~~~~~~~~~~
Thanks. printk_time depends on CONFIG_PRINTK=y. Added a dummy definition.



From f903b9fa36159472a207f93e2405e45e3999f650 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Sun, 2 Dec 2018 13:57:01 +0900
Subject: [PATCH v2] printk: fix printk_time race.

Since printk_time can be toggled via /sys/module/printk/parameters/time ,
it is not safe to assume that output length does not change across
multiple msg_print_text() calls. If we hit this race, we can observe
failures such as SYSLOG_ACTION_READ_ALL writes more bytes than userspace
has supplied, SYSLOG_ACTION_SIZE_UNREAD returns -EFAULT when succeeded,
SYSLOG_ACTION_READ reads garbage memory or even triggers an kernel oops
at _copy_to_user() due to integer overflow.

To close this race, get a snapshot value of printk_time and pass it to
SYSLOG_ACTION_READ, SYSLOG_ACTION_READ_ALL, SYSLOG_ACTION_SIZE_UNREAD and
kmsg_dump_get_buffer().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk.c | 69 +++++++++++++++++++++++++++-----------------------
 1 file changed, 38 insertions(+), 31 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 5c2079d..0b47e34 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -403,6 +403,7 @@ __packed __aligned(4)
 static u64 syslog_seq;
 static u32 syslog_idx;
 static size_t syslog_partial;
+static bool syslog_time;
 
 /* index and sequence number of the first record stored in the buffer */
 static u64 log_first_seq;
@@ -1228,12 +1229,7 @@ static inline void boot_delay_msec(int level)
 
 static size_t print_time(u64 ts, char *buf)
 {
-	unsigned long rem_nsec;
-
-	if (!printk_time)
-		return 0;
-
-	rem_nsec = do_div(ts, 1000000000);
+	unsigned long rem_nsec = do_div(ts, 1000000000);
 
 	if (!buf)
 		return snprintf(NULL, 0, "[%5lu.000000] ", (unsigned long)ts);
@@ -1242,7 +1238,8 @@ static size_t print_time(u64 ts, char *buf)
 		       (unsigned long)ts, rem_nsec / 1000);
 }
 
-static size_t print_prefix(const struct printk_log *msg, bool syslog, char *buf)
+static size_t print_prefix(const struct printk_log *msg, bool syslog,
+			   bool time, char *buf)
 {
 	size_t len = 0;
 	unsigned int prefix = (msg->facility << 3) | msg->level;
@@ -1261,11 +1258,13 @@ static size_t print_prefix(const struct printk_log *msg, bool syslog, char *buf)
 		}
 	}
 
-	len += print_time(msg->ts_nsec, buf ? buf + len : NULL);
+	if (time)
+		len += print_time(msg->ts_nsec, buf ? buf + len : NULL);
 	return len;
 }
 
-static size_t msg_print_text(const struct printk_log *msg, bool syslog, char *buf, size_t size)
+static size_t msg_print_text(const struct printk_log *msg, bool syslog,
+			     bool time, char *buf, size_t size)
 {
 	const char *text = log_text(msg);
 	size_t text_size = msg->text_len;
@@ -1284,17 +1283,17 @@ static size_t msg_print_text(const struct printk_log *msg, bool syslog, char *bu
 		}
 
 		if (buf) {
-			if (print_prefix(msg, syslog, NULL) +
+			if (print_prefix(msg, syslog, time, NULL) +
 			    text_len + 1 >= size - len)
 				break;
 
-			len += print_prefix(msg, syslog, buf + len);
+			len += print_prefix(msg, syslog, time, buf + len);
 			memcpy(buf + len, text, text_len);
 			len += text_len;
 			buf[len++] = '\n';
 		} else {
 			/* SYSLOG_ACTION_* buffer size only calculation */
-			len += print_prefix(msg, syslog, NULL);
+			len += print_prefix(msg, syslog, time, NULL);
 			len += text_len;
 			len++;
 		}
@@ -1331,9 +1330,17 @@ static int syslog_print(char __user *buf, int size)
 			break;
 		}
 
+		/*
+		 * To keep reading/counting partial line consistent,
+		 * use printk_time value as of the biginning of a line.
+		 */
+		if (!syslog_partial)
+			syslog_time = printk_time;
+
 		skip = syslog_partial;
 		msg = log_from_idx(syslog_idx);
-		n = msg_print_text(msg, true, text, LOG_LINE_MAX + PREFIX_MAX);
+		n = msg_print_text(msg, true, syslog_time, text,
+				   LOG_LINE_MAX + PREFIX_MAX);
 		if (n - syslog_partial <= size) {
 			/* message fits into buffer, move forward */
 			syslog_idx = log_next(syslog_idx);
@@ -1373,11 +1380,13 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	u64 next_seq;
 	u64 seq;
 	u32 idx;
+	bool time;
 
 	text = kmalloc(LOG_LINE_MAX + PREFIX_MAX, GFP_KERNEL);
 	if (!text)
 		return -ENOMEM;
 
+	time = printk_time;
 	logbuf_lock_irq();
 	/*
 	 * Find first record that fits, including all following records,
@@ -1388,7 +1397,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	while (seq < log_next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		len += msg_print_text(msg, true, NULL, 0);
+		len += msg_print_text(msg, true, time, NULL, 0);
 		idx = log_next(idx);
 		seq++;
 	}
@@ -1399,7 +1408,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	while (len > size && seq < log_next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		len -= msg_print_text(msg, true, NULL, 0);
+		len -= msg_print_text(msg, true, time, NULL, 0);
 		idx = log_next(idx);
 		seq++;
 	}
@@ -1410,14 +1419,9 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	len = 0;
 	while (len >= 0 && seq < next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
-		int textlen;
+		int textlen = msg_print_text(msg, true, time, text,
+					     LOG_LINE_MAX + PREFIX_MAX);
 
-		textlen = msg_print_text(msg, true, text,
-					 LOG_LINE_MAX + PREFIX_MAX);
-		if (textlen < 0) {
-			len = textlen;
-			break;
-		}
 		idx = log_next(idx);
 		seq++;
 
@@ -1541,11 +1545,13 @@ int do_syslog(int type, char __user *buf, int len, int source)
 		} else {
 			u64 seq = syslog_seq;
 			u32 idx = syslog_idx;
+			bool f = syslog_partial ? syslog_time : printk_time;
 
 			while (seq < log_next_seq) {
 				struct printk_log *msg = log_from_idx(idx);
 
-				error += msg_print_text(msg, true, NULL, 0);
+				error += msg_print_text(msg, true, f, NULL, 0);
+				f = printk_time;
 				idx = log_next(idx);
 				seq++;
 			}
@@ -2003,6 +2009,7 @@ asmlinkage __visible int printk(const char *fmt, ...)
 
 #define LOG_LINE_MAX		0
 #define PREFIX_MAX		0
+#define printk_time		false
 
 static u64 syslog_seq;
 static u32 syslog_idx;
@@ -2026,8 +2033,8 @@ static void console_lock_spinning_enable(void) { }
 static int console_lock_spinning_disable_and_check(void) { return 0; }
 static void call_console_drivers(const char *ext_text, size_t ext_len,
 				 const char *text, size_t len) {}
-static size_t msg_print_text(const struct printk_log *msg,
-			     bool syslog, char *buf, size_t size) { return 0; }
+static size_t msg_print_text(const struct printk_log *msg, bool syslog,
+			     bool time, char *buf, size_t size) { return 0; }
 static bool suppress_message_printing(int level) { return false; }
 
 #endif /* CONFIG_PRINTK */
@@ -2385,8 +2392,7 @@ void console_unlock(void)
 
 		len += msg_print_text(msg,
 				console_msg_format & MSG_FORMAT_SYSLOG,
-				text + len,
-				sizeof(text) - len);
+				printk_time, text + len, sizeof(text) - len);
 		if (nr_ext_console_drivers) {
 			ext_len = msg_print_ext_header(ext_text,
 						sizeof(ext_text),
@@ -3110,7 +3116,7 @@ bool kmsg_dump_get_line_nolock(struct kmsg_dumper *dumper, bool syslog,
 		goto out;
 
 	msg = log_from_idx(dumper->cur_idx);
-	l = msg_print_text(msg, syslog, line, size);
+	l = msg_print_text(msg, syslog, printk_time, line, size);
 
 	dumper->cur_idx = log_next(dumper->cur_idx);
 	dumper->cur_seq++;
@@ -3181,6 +3187,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 	u32 next_idx;
 	size_t l = 0;
 	bool ret = false;
+	bool time = printk_time;
 
 	if (!dumper->active)
 		goto out;
@@ -3204,7 +3211,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 	while (seq < dumper->next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		l += msg_print_text(msg, true, NULL, 0);
+		l += msg_print_text(msg, true, time, NULL, 0);
 		idx = log_next(idx);
 		seq++;
 	}
@@ -3215,7 +3222,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 	while (l > size && seq < dumper->next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		l -= msg_print_text(msg, true, NULL, 0);
+		l -= msg_print_text(msg, true, time, NULL, 0);
 		idx = log_next(idx);
 		seq++;
 	}
@@ -3228,7 +3235,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 	while (seq < dumper->next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		l += msg_print_text(msg, syslog, buf + l, size - l);
+		l += msg_print_text(msg, syslog, time, buf + l, size - l);
 		idx = log_next(idx);
 		seq++;
 	}
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] printk: don't unconditionally shortcut print_time()
  2018-12-02  5:02           ` Tetsuo Handa
@ 2018-12-03 13:16             ` Petr Mladek
  2018-12-04  2:56             ` Sergey Senozhatsky
  1 sibling, 0 replies; 18+ messages in thread
From: Petr Mladek @ 2018-12-03 13:16 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Sergey Senozhatsky, Steven Rostedt, linux-kernel

On Sun 2018-12-02 14:02:28, Tetsuo Handa wrote:
> On 2018/12/02 8:49, kbuild test robot wrote:
> >> kernel/printk/printk.c:2396:5: error: 'printk_time' undeclared (first use in this function)
>         ^~~~~~~~~~~
> Thanks. printk_time depends on CONFIG_PRINTK=y. Added a dummy definition.
> 
> 
> 
> >From f903b9fa36159472a207f93e2405e45e3999f650 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Sun, 2 Dec 2018 13:57:01 +0900
> Subject: [PATCH v2] printk: fix printk_time race.
> 
> Since printk_time can be toggled via /sys/module/printk/parameters/time ,
> it is not safe to assume that output length does not change across
> multiple msg_print_text() calls. If we hit this race, we can observe
> failures such as SYSLOG_ACTION_READ_ALL writes more bytes than userspace
> has supplied, SYSLOG_ACTION_SIZE_UNREAD returns -EFAULT when succeeded,
> SYSLOG_ACTION_READ reads garbage memory or even triggers an kernel oops
> at _copy_to_user() due to integer overflow.
> 
> To close this race, get a snapshot value of printk_time and pass it to
> SYSLOG_ACTION_READ, SYSLOG_ACTION_READ_ALL, SYSLOG_ACTION_SIZE_UNREAD and
> kmsg_dump_get_buffer().
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Signed-off-by: Petr Mladek <pmladek@suse.com>

The patch looks good to me.

Acked-by: Petr Mladek <pmladek@suse.com>

I'll wait few more days for an eventual feedback before I push it
into printk.git.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] printk: don't unconditionally shortcut print_time()
  2018-11-30 14:33       ` Tetsuo Handa
  2018-12-01 11:30         ` Tetsuo Handa
@ 2018-12-03 14:14         ` Petr Mladek
  2018-12-03 15:00           ` Tetsuo Handa
  1 sibling, 1 reply; 18+ messages in thread
From: Petr Mladek @ 2018-12-03 14:14 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Sergey Senozhatsky, Steven Rostedt, linux-kernel

On Fri 2018-11-30 23:33:39, Tetsuo Handa wrote:
> and (I can't find the reproducer but) sometimes crashes like
> 
> [  451.988242] BUG: pagefault on kernel address 0xffff8881337fa000 in non-whitelisted uaccess
> [  451.990465] BUG: unable to handle kernel paging request at ffff8881337fa000
> [  451.992254] PGD 3602067 P4D 3602067 PUD 3605067 PMD 13ff65067 PTE 800ffffecc805060
> [  451.994066] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> [  451.995284] CPU: 0 PID: 8147 Comm: a.out Kdump: loaded Not tainted 4.20.0-rc4+ #537
> [  451.997268] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
> [  451.999857] RIP: 0010:copy_user_generic_unrolled+0x54/0xc0
> [  452.001312] Code: 83 e2 3f c1 e9 06 74 4a 4c 8b 06 4c 8b 4e 08 4c 8b 56 10 4c 8b 5e 18 4c 89 07 4c 89 4f 08 4c 89 57 10 4c 89 5f 18 4c 8b 46 20 <4c> 8b 4e 28 4c 8b 56 30 4c 8b 5e 38 4c 89 47 20 4c 89 4f 28 4c 89
> [  452.006119] RSP: 0018:ffffc90002573e90 EFLAGS: 00010202
> [  452.007230] RAX: 0000000000701060 RBX: 0000000000100000 RCX: 0000000000002f02
> [  452.008792] RDX: 0000000000000000 RSI: ffff8881337f9fd4 RDI: 0000000000644fe0
> [  452.010399] RBP: 0000000000601060 R08: 8b494e7401000000 R09: 00000b88858b4900
> [  452.011916] R10: 0f02000000c980f6 R11: b4bb830000025d85 R12: ffff8881337b6054
> [  452.013351] R13: 0000000000601060 R14: ffff8881337b5fa8 R15: 00000000000000ac
> [  452.014807] FS:  00007fb3c9eec740(0000) GS:ffff88813b200000(0000) knlGS:0000000000000000
> [  452.016931] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  452.018234] CR2: ffff8881337fa000 CR3: 0000000139153002 CR4: 00000000001606f0
> [  452.020038] Call Trace:
> [  452.020708]  _copy_to_user+0x56/0x70
> [  452.021580]  do_syslog+0x47d/0x7b0
> [  452.022359]  ? syscall_trace_enter+0x1c3/0x360
> [  452.023395]  ? trace_hardirqs_on_thunk+0x1a/0x1c
> [  452.024438]  ? lockdep_hardirqs_off+0x73/0xe0
> [  452.025433]  __x64_sys_syslog+0x13/0x20
> [  452.026363]  do_syscall_64+0x4f/0x1f0
> [  452.027235]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
> >  			/* message fits into buffer, move forward */
> >  			syslog_idx = log_next(syslog_idx);
> 
> But we need to make this "static bool time;" accessible from SYSLOG_ACTION_SIZE_UNREAD case
> because syslog_partial subtraction assumes that msg_print_text() used printk_time value
> as of syslog_print() updated syslog_partial.
> 
> 	while (seq < log_next_seq) {
> 		struct printk_log *msg = log_from_idx(idx);
> 
> 		error += msg_print_text(msg, true, NULL, 0);
> 		idx = log_next(idx);
> 		seq++;
> 	}
> 	error -= syslog_partial;
> 

Great catch.

Well, IMHO, it does not explain the pagefault above. The copy_to_user()
could come either from syslog_print() or from syslog_print_all(). They
both have their own checks that prevent the buf overflow.

The code is tricky but it looks safe now. Is it possible that
the crash was with some old, still broken, code? Or did you
add some extra changes that you used to trigger a testing
scenario?

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] printk: don't unconditionally shortcut print_time()
  2018-12-03 14:14         ` Petr Mladek
@ 2018-12-03 15:00           ` Tetsuo Handa
  0 siblings, 0 replies; 18+ messages in thread
From: Tetsuo Handa @ 2018-12-03 15:00 UTC (permalink / raw)
  To: Petr Mladek; +Cc: Sergey Senozhatsky, Steven Rostedt, linux-kernel

On 2018/12/03 23:14, Petr Mladek wrote:
> Well, IMHO, it does not explain the pagefault above. The copy_to_user()
> could come either from syslog_print() or from syslog_print_all(). They
> both have their own checks that prevent the buf overflow.
> 
> The code is tricky but it looks safe now. Is it possible that
> the crash was with some old, still broken, code? Or did you
> add some extra changes that you used to trigger a testing
> scenario?

I think I was using unmodified current linux.git, but I failed to save
the reproducer because I was editing under /tmp directory which does not
create a backup.

Anyway, after this patch, no problem is occurring.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] printk: don't unconditionally shortcut print_time()
  2018-12-02  5:02           ` Tetsuo Handa
  2018-12-03 13:16             ` Petr Mladek
@ 2018-12-04  2:56             ` Sergey Senozhatsky
  2018-12-04 10:06               ` Tetsuo Handa
  1 sibling, 1 reply; 18+ messages in thread
From: Sergey Senozhatsky @ 2018-12-04  2:56 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Petr Mladek, Sergey Senozhatsky, Steven Rostedt, linux-kernel

On (12/02/18 14:02), Tetsuo Handa wrote:
>  
> @@ -1541,11 +1545,13 @@ int do_syslog(int type, char __user *buf, int len, int source)
>  		} else {
>  			u64 seq = syslog_seq;
>  			u32 idx = syslog_idx;
> +			bool f = syslog_partial ? syslog_time : printk_time;
			^^^^^^

>  			while (seq < log_next_seq) {
>  				struct printk_log *msg = log_from_idx(idx);
>  
> -				error += msg_print_text(msg, true, NULL, 0);
> +				error += msg_print_text(msg, true, f, NULL, 0);
								^^^^

> +				f = printk_time;
				^^^

>  				idx = log_next(idx);
>  				seq++;

Can we please have something better than 'f'?

	-ss

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] printk: don't unconditionally shortcut print_time()
  2018-12-04  2:56             ` Sergey Senozhatsky
@ 2018-12-04 10:06               ` Tetsuo Handa
  2018-12-04 10:59                 ` Sergey Senozhatsky
  0 siblings, 1 reply; 18+ messages in thread
From: Tetsuo Handa @ 2018-12-04 10:06 UTC (permalink / raw)
  To: Sergey Senozhatsky, Petr Mladek, Steven Rostedt
  Cc: Sergey Senozhatsky, linux-kernel

On 2018/12/04 11:56, Sergey Senozhatsky wrote:
> Can we please have something better than 'f'?

OK. To pass 80 column limit, one line increased. ;-)

< +                     bool f = syslog_partial ? syslog_time : printk_time;
> +                     bool time = syslog_partial ? syslog_time : printk_time;

< +                             error += msg_print_text(msg, true, f, NULL, 0);
< +                             f = printk_time;
> +                             error += msg_print_text(msg, true, time, NULL,
> +                                                     0);
> +                             time = printk_time;

From d233699cbd1b43178ef6ce1f801e8a3b8e4326b9 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Tue, 4 Dec 2018 19:00:01 +0900
Subject: [PATCH v3] printk: fix printk_time race.

Since printk_time can be toggled via /sys/module/printk/parameters/time ,
it is not safe to assume that output length does not change across
multiple msg_print_text() calls. If we hit this race, we can observe
failures such as SYSLOG_ACTION_READ_ALL writes more bytes than userspace
has supplied, SYSLOG_ACTION_SIZE_UNREAD returns -EFAULT when succeeded,
SYSLOG_ACTION_READ reads garbage memory or even triggers an kernel oops
at _copy_to_user() due to integer overflow.

To close this race, get a snapshot value of printk_time and pass it to
SYSLOG_ACTION_READ, SYSLOG_ACTION_READ_ALL, SYSLOG_ACTION_SIZE_UNREAD and
kmsg_dump_get_buffer().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk.c | 70 ++++++++++++++++++++++++++++----------------------
 1 file changed, 39 insertions(+), 31 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 5c2079d..4f18472 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -403,6 +403,7 @@ __packed __aligned(4)
 static u64 syslog_seq;
 static u32 syslog_idx;
 static size_t syslog_partial;
+static bool syslog_time;
 
 /* index and sequence number of the first record stored in the buffer */
 static u64 log_first_seq;
@@ -1228,12 +1229,7 @@ static inline void boot_delay_msec(int level)
 
 static size_t print_time(u64 ts, char *buf)
 {
-	unsigned long rem_nsec;
-
-	if (!printk_time)
-		return 0;
-
-	rem_nsec = do_div(ts, 1000000000);
+	unsigned long rem_nsec = do_div(ts, 1000000000);
 
 	if (!buf)
 		return snprintf(NULL, 0, "[%5lu.000000] ", (unsigned long)ts);
@@ -1242,7 +1238,8 @@ static size_t print_time(u64 ts, char *buf)
 		       (unsigned long)ts, rem_nsec / 1000);
 }
 
-static size_t print_prefix(const struct printk_log *msg, bool syslog, char *buf)
+static size_t print_prefix(const struct printk_log *msg, bool syslog,
+			   bool time, char *buf)
 {
 	size_t len = 0;
 	unsigned int prefix = (msg->facility << 3) | msg->level;
@@ -1261,11 +1258,13 @@ static size_t print_prefix(const struct printk_log *msg, bool syslog, char *buf)
 		}
 	}
 
-	len += print_time(msg->ts_nsec, buf ? buf + len : NULL);
+	if (time)
+		len += print_time(msg->ts_nsec, buf ? buf + len : NULL);
 	return len;
 }
 
-static size_t msg_print_text(const struct printk_log *msg, bool syslog, char *buf, size_t size)
+static size_t msg_print_text(const struct printk_log *msg, bool syslog,
+			     bool time, char *buf, size_t size)
 {
 	const char *text = log_text(msg);
 	size_t text_size = msg->text_len;
@@ -1284,17 +1283,17 @@ static size_t msg_print_text(const struct printk_log *msg, bool syslog, char *bu
 		}
 
 		if (buf) {
-			if (print_prefix(msg, syslog, NULL) +
+			if (print_prefix(msg, syslog, time, NULL) +
 			    text_len + 1 >= size - len)
 				break;
 
-			len += print_prefix(msg, syslog, buf + len);
+			len += print_prefix(msg, syslog, time, buf + len);
 			memcpy(buf + len, text, text_len);
 			len += text_len;
 			buf[len++] = '\n';
 		} else {
 			/* SYSLOG_ACTION_* buffer size only calculation */
-			len += print_prefix(msg, syslog, NULL);
+			len += print_prefix(msg, syslog, time, NULL);
 			len += text_len;
 			len++;
 		}
@@ -1331,9 +1330,17 @@ static int syslog_print(char __user *buf, int size)
 			break;
 		}
 
+		/*
+		 * To keep reading/counting partial line consistent,
+		 * use printk_time value as of the biginning of a line.
+		 */
+		if (!syslog_partial)
+			syslog_time = printk_time;
+
 		skip = syslog_partial;
 		msg = log_from_idx(syslog_idx);
-		n = msg_print_text(msg, true, text, LOG_LINE_MAX + PREFIX_MAX);
+		n = msg_print_text(msg, true, syslog_time, text,
+				   LOG_LINE_MAX + PREFIX_MAX);
 		if (n - syslog_partial <= size) {
 			/* message fits into buffer, move forward */
 			syslog_idx = log_next(syslog_idx);
@@ -1373,11 +1380,13 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	u64 next_seq;
 	u64 seq;
 	u32 idx;
+	bool time;
 
 	text = kmalloc(LOG_LINE_MAX + PREFIX_MAX, GFP_KERNEL);
 	if (!text)
 		return -ENOMEM;
 
+	time = printk_time;
 	logbuf_lock_irq();
 	/*
 	 * Find first record that fits, including all following records,
@@ -1388,7 +1397,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	while (seq < log_next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		len += msg_print_text(msg, true, NULL, 0);
+		len += msg_print_text(msg, true, time, NULL, 0);
 		idx = log_next(idx);
 		seq++;
 	}
@@ -1399,7 +1408,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	while (len > size && seq < log_next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		len -= msg_print_text(msg, true, NULL, 0);
+		len -= msg_print_text(msg, true, time, NULL, 0);
 		idx = log_next(idx);
 		seq++;
 	}
@@ -1410,14 +1419,9 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
 	len = 0;
 	while (len >= 0 && seq < next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
-		int textlen;
+		int textlen = msg_print_text(msg, true, time, text,
+					     LOG_LINE_MAX + PREFIX_MAX);
 
-		textlen = msg_print_text(msg, true, text,
-					 LOG_LINE_MAX + PREFIX_MAX);
-		if (textlen < 0) {
-			len = textlen;
-			break;
-		}
 		idx = log_next(idx);
 		seq++;
 
@@ -1541,11 +1545,14 @@ int do_syslog(int type, char __user *buf, int len, int source)
 		} else {
 			u64 seq = syslog_seq;
 			u32 idx = syslog_idx;
+			bool time = syslog_partial ? syslog_time : printk_time;
 
 			while (seq < log_next_seq) {
 				struct printk_log *msg = log_from_idx(idx);
 
-				error += msg_print_text(msg, true, NULL, 0);
+				error += msg_print_text(msg, true, time, NULL,
+							0);
+				time = printk_time;
 				idx = log_next(idx);
 				seq++;
 			}
@@ -2003,6 +2010,7 @@ asmlinkage __visible int printk(const char *fmt, ...)
 
 #define LOG_LINE_MAX		0
 #define PREFIX_MAX		0
+#define printk_time		false
 
 static u64 syslog_seq;
 static u32 syslog_idx;
@@ -2026,8 +2034,8 @@ static void console_lock_spinning_enable(void) { }
 static int console_lock_spinning_disable_and_check(void) { return 0; }
 static void call_console_drivers(const char *ext_text, size_t ext_len,
 				 const char *text, size_t len) {}
-static size_t msg_print_text(const struct printk_log *msg,
-			     bool syslog, char *buf, size_t size) { return 0; }
+static size_t msg_print_text(const struct printk_log *msg, bool syslog,
+			     bool time, char *buf, size_t size) { return 0; }
 static bool suppress_message_printing(int level) { return false; }
 
 #endif /* CONFIG_PRINTK */
@@ -2385,8 +2393,7 @@ void console_unlock(void)
 
 		len += msg_print_text(msg,
 				console_msg_format & MSG_FORMAT_SYSLOG,
-				text + len,
-				sizeof(text) - len);
+				printk_time, text + len, sizeof(text) - len);
 		if (nr_ext_console_drivers) {
 			ext_len = msg_print_ext_header(ext_text,
 						sizeof(ext_text),
@@ -3110,7 +3117,7 @@ bool kmsg_dump_get_line_nolock(struct kmsg_dumper *dumper, bool syslog,
 		goto out;
 
 	msg = log_from_idx(dumper->cur_idx);
-	l = msg_print_text(msg, syslog, line, size);
+	l = msg_print_text(msg, syslog, printk_time, line, size);
 
 	dumper->cur_idx = log_next(dumper->cur_idx);
 	dumper->cur_seq++;
@@ -3181,6 +3188,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 	u32 next_idx;
 	size_t l = 0;
 	bool ret = false;
+	bool time = printk_time;
 
 	if (!dumper->active)
 		goto out;
@@ -3204,7 +3212,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 	while (seq < dumper->next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		l += msg_print_text(msg, true, NULL, 0);
+		l += msg_print_text(msg, true, time, NULL, 0);
 		idx = log_next(idx);
 		seq++;
 	}
@@ -3215,7 +3223,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 	while (l > size && seq < dumper->next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		l -= msg_print_text(msg, true, NULL, 0);
+		l -= msg_print_text(msg, true, time, NULL, 0);
 		idx = log_next(idx);
 		seq++;
 	}
@@ -3228,7 +3236,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
 	while (seq < dumper->next_seq) {
 		struct printk_log *msg = log_from_idx(idx);
 
-		l += msg_print_text(msg, syslog, buf + l, size - l);
+		l += msg_print_text(msg, syslog, time, buf + l, size - l);
 		idx = log_next(idx);
 		seq++;
 	}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] printk: don't unconditionally shortcut print_time()
  2018-12-04 10:06               ` Tetsuo Handa
@ 2018-12-04 10:59                 ` Sergey Senozhatsky
  0 siblings, 0 replies; 18+ messages in thread
From: Sergey Senozhatsky @ 2018-12-04 10:59 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt,
	Sergey Senozhatsky, linux-kernel

On (12/04/18 19:06), Tetsuo Handa wrote:
> On 2018/12/04 11:56, Sergey Senozhatsky wrote:
> > Can we please have something better than 'f'?
>
> OK. To pass 80 column limit, one line increased. ;-)

Ah, so that was the reason :)

[..]

> To close this race, get a snapshot value of printk_time and pass it to
> SYSLOG_ACTION_READ, SYSLOG_ACTION_READ_ALL, SYSLOG_ACTION_SIZE_UNREAD and
> kmsg_dump_get_buffer().
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Signed-off-by: Petr Mladek <pmladek@suse.com>
> Acked-by: Petr Mladek <pmladek@suse.com>

Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

	-ss

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2018-12-04 10:59 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-24 10:50 [PATCH] printk: don't unconditionally shortcut print_time() Tetsuo Handa
2018-11-24 14:28 ` [PATCH] printk: deduplicate print_prefix() calls Tetsuo Handa
2018-11-26  7:56   ` Sergey Senozhatsky
2018-11-26 10:21     ` Tetsuo Handa
2018-11-29 14:35   ` Petr Mladek
2018-11-29 14:19 ` [PATCH] printk: don't unconditionally shortcut print_time() Petr Mladek
2018-11-30  2:26   ` Tetsuo Handa
2018-11-30 12:30     ` Petr Mladek
2018-11-30 14:33       ` Tetsuo Handa
2018-12-01 11:30         ` Tetsuo Handa
2018-12-02  5:02           ` Tetsuo Handa
2018-12-03 13:16             ` Petr Mladek
2018-12-04  2:56             ` Sergey Senozhatsky
2018-12-04 10:06               ` Tetsuo Handa
2018-12-04 10:59                 ` Sergey Senozhatsky
2018-12-03 14:14         ` Petr Mladek
2018-12-03 15:00           ` Tetsuo Handa
2018-11-30  4:24   ` Sergey Senozhatsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).