linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH -v5 0/2] printk.devkmsg: Ratelimit it by default
@ 2016-07-16  6:17 Borislav Petkov
  2016-07-16  6:17 ` [PATCH -v5 1/2] ratelimit: Extend to print suppressed messages on release Borislav Petkov
  2016-07-16  6:17 ` [PATCH -v5 2/2] printk: Add kernel parameter to control writes to /dev/kmsg Borislav Petkov
  0 siblings, 2 replies; 3+ messages in thread
From: Borislav Petkov @ 2016-07-16  6:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: LKML, Borislav Petkov, Dave Young, Franck Bui,
	Greg Kroah-Hartman, Ingo Molnar, Linus Torvalds, Peter Zijlstra,
	Steven Rostedt, Uwe Kleine-König

From: Borislav Petkov <bp@suse.de>

Hi guys,

here's v5 with Andrew's review comments addressed (I hope all).

I've unified the setting names for both sysctl(2) and the kernel cmdline
to "on", "off" and "ratelimit" so that there's no confusion anymore as
to which sysctl number corresponds to what string. This should make
dealing with that option straight-forward.

Also, requested textual expansion regarding the reasoning for the
current design has been added in comments and commit messages.

Thanks.


Changelog:
----------

v4:

sorry for spamming so quickly again and not waiting for a week before
resubmitting but I believe the stuff is ready for 4.8.

So here's v4 with all the minor review comments addressed.

v3:

here's v3 integrating Ingo's comments. The thing is called
printk.devkmsg= or printk_devkmsg now, depending on cmdline option or
sysctl.


v2:

here's v2 with the requested sysctl option kernel.printk_kmsg and
locking of the setting when printk.kmsg= is supplied on the command
line.

Patch 1 is unchanged.

Patch 2 has grown the sysctl addition.

v1:

Rostedt is busy so I took Linus' old patch and Steven's last v2 and
split and extended them with the comments people had on the last thread:

https://lkml.kernel.org/r/20160425145606.598329f2@gandalf.local.home

I hope, at least.

So it is ratelimiting by default, with "on" and "off" cmdline options. I
called the option somewhat a bit shorter too: "printk.kmsg"

The current use cases of this and of which I'm aware are:

* debug the kernel and thus shut up all interfering input from
userspace, i.e. boot with "printk.kmsg=off"

* debug userspace (and by that I mean systemd) by booting with
"printk.kmsg=on" so that the ratelimiting is disabled and the kernel log
gets all the spew.

Thoughts?

Please queue,
thanks.


Borislav Petkov (2):
  ratelimit: Extend to print suppressed messages on release
  printk: Add kernel parameter to control writes to /dev/kmsg

 Documentation/kernel-parameters.txt |   7 ++
 Documentation/sysctl/kernel.txt     |  14 ++++
 include/linux/printk.h              |   9 +++
 include/linux/ratelimit.h           |  38 ++++++++--
 kernel/printk/printk.c              | 143 ++++++++++++++++++++++++++++++++++--
 kernel/sysctl.c                     |   7 ++
 lib/ratelimit.c                     |  10 ++-
 7 files changed, 211 insertions(+), 17 deletions(-)

-- 
2.8.4

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH -v5 1/2] ratelimit: Extend to print suppressed messages on release
  2016-07-16  6:17 [PATCH -v5 0/2] printk.devkmsg: Ratelimit it by default Borislav Petkov
@ 2016-07-16  6:17 ` Borislav Petkov
  2016-07-16  6:17 ` [PATCH -v5 2/2] printk: Add kernel parameter to control writes to /dev/kmsg Borislav Petkov
  1 sibling, 0 replies; 3+ messages in thread
From: Borislav Petkov @ 2016-07-16  6:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: LKML, Borislav Petkov, Dave Young, Franck Bui,
	Greg Kroah-Hartman, Ingo Molnar, Linus Torvalds, Peter Zijlstra,
	Steven Rostedt, Uwe Kleine-König

From: Borislav Petkov <bp@suse.de>

Extend the ratelimiting facility to print the amount of suppressed lines
when it is being released.

This use case is aimed at short-termed, burst-like users for which we
want to output the suppressed lines stats only once, after it has been
disposed of. For an example, see /dev/kmsg usage in a follow-on patch.

Also, change the printk() line we issue on release to not use "callbacks"
as it is misleading: we're not suppressing callbacks but printk() calls.

This has been separated from a previous patch by Linus.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Franck Bui <fbui@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
 include/linux/ratelimit.h | 38 +++++++++++++++++++++++++++++++++-----
 lib/ratelimit.c           | 10 ++++++----
 2 files changed, 39 insertions(+), 9 deletions(-)

diff --git a/include/linux/ratelimit.h b/include/linux/ratelimit.h
index 18102529254e..57c9e0622a38 100644
--- a/include/linux/ratelimit.h
+++ b/include/linux/ratelimit.h
@@ -2,11 +2,15 @@
 #define _LINUX_RATELIMIT_H
 
 #include <linux/param.h>
+#include <linux/sched.h>
 #include <linux/spinlock.h>
 
 #define DEFAULT_RATELIMIT_INTERVAL	(5 * HZ)
 #define DEFAULT_RATELIMIT_BURST		10
 
+/* issue num suppressed message on exit */
+#define RATELIMIT_MSG_ON_RELEASE	BIT(0)
+
 struct ratelimit_state {
 	raw_spinlock_t	lock;		/* protect the state */
 
@@ -15,6 +19,7 @@ struct ratelimit_state {
 	int		printed;
 	int		missed;
 	unsigned long	begin;
+	unsigned long	flags;
 };
 
 #define RATELIMIT_STATE_INIT(name, interval_init, burst_init) {		\
@@ -34,12 +39,35 @@ struct ratelimit_state {
 static inline void ratelimit_state_init(struct ratelimit_state *rs,
 					int interval, int burst)
 {
+	memset(rs, 0, sizeof(*rs));
+
 	raw_spin_lock_init(&rs->lock);
-	rs->interval = interval;
-	rs->burst = burst;
-	rs->printed = 0;
-	rs->missed = 0;
-	rs->begin = 0;
+	rs->interval	= interval;
+	rs->burst	= burst;
+}
+
+static inline void ratelimit_default_init(struct ratelimit_state *rs)
+{
+	return ratelimit_state_init(rs, DEFAULT_RATELIMIT_INTERVAL,
+					DEFAULT_RATELIMIT_BURST);
+}
+
+static inline void ratelimit_state_exit(struct ratelimit_state *rs)
+{
+	if (!(rs->flags & RATELIMIT_MSG_ON_RELEASE))
+		return;
+
+	if (rs->missed) {
+		pr_warn("%s: %d output lines suppressed due to ratelimiting\n",
+			current->comm, rs->missed);
+		rs->missed = 0;
+	}
+}
+
+static inline void
+ratelimit_set_flags(struct ratelimit_state *rs, unsigned long flags)
+{
+	rs->flags = flags;
 }
 
 extern struct ratelimit_state printk_ratelimit_state;
diff --git a/lib/ratelimit.c b/lib/ratelimit.c
index 2c5de86460c5..08f8043cac61 100644
--- a/lib/ratelimit.c
+++ b/lib/ratelimit.c
@@ -46,12 +46,14 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
 		rs->begin = jiffies;
 
 	if (time_is_before_jiffies(rs->begin + rs->interval)) {
-		if (rs->missed)
-			printk(KERN_WARNING "%s: %d callbacks suppressed\n",
-				func, rs->missed);
+		if (rs->missed) {
+			if (!(rs->flags & RATELIMIT_MSG_ON_RELEASE)) {
+				pr_warn("%s: %d callbacks suppressed\n", func, rs->missed);
+				rs->missed = 0;
+			}
+		}
 		rs->begin   = jiffies;
 		rs->printed = 0;
-		rs->missed  = 0;
 	}
 	if (rs->burst && rs->burst > rs->printed) {
 		rs->printed++;
-- 
2.8.4

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH -v5 2/2] printk: Add kernel parameter to control writes to /dev/kmsg
  2016-07-16  6:17 [PATCH -v5 0/2] printk.devkmsg: Ratelimit it by default Borislav Petkov
  2016-07-16  6:17 ` [PATCH -v5 1/2] ratelimit: Extend to print suppressed messages on release Borislav Petkov
@ 2016-07-16  6:17 ` Borislav Petkov
  1 sibling, 0 replies; 3+ messages in thread
From: Borislav Petkov @ 2016-07-16  6:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: LKML, Borislav Petkov, Dave Young, Franck Bui,
	Greg Kroah-Hartman, Ingo Molnar, Linus Torvalds, Peter Zijlstra,
	Steven Rostedt, Uwe Kleine-König

From: Borislav Petkov <bp@suse.de>

Add a "printk.devkmsg" kernel command line parameter which controls how
userspace writes into /dev/kmsg. It has three options:

* ratelimit - ratelimit logging from userspace.
* on  - unlimited logging from userspace
* off - logging from userspace gets ignored

The default setting is to ratelimit the messages written to it.

This changes the kernel default setting of "on" to "ratelimit" and we
do that because we want to keep userspace spamming /dev/kmsg to sane
levels. This is especially moot when a small kernel log buffer wraps
around and messages get lost. So the ratelimiting setting should be a
sane setting where kernel messages should have a bit higher chance of
survival from all the spamming.

It additionally does not limit logging to /dev/kmsg while the system is
booting if we haven't disabled it on the command line.

Furthermore, we can control the logging from a lower priority sysctl
interface - kernel.printk_devkmsg.

That interface will succeed only if printk.devkmsg *hasn't* been
supplied on the command line. If it has, then printk.devkmsg is a
one-time setting which remains for the duration of the system lifetime.
This "locking" of the setting is to prevent userspace from changing the
logging on us through sysctl(2).

This patch is based on previous patches from Linus and Steven.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Franck Bui <fbui@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
 Documentation/kernel-parameters.txt |   7 ++
 Documentation/sysctl/kernel.txt     |  14 ++++
 include/linux/printk.h              |   9 +++
 kernel/printk/printk.c              | 143 ++++++++++++++++++++++++++++++++++--
 kernel/sysctl.c                     |   7 ++
 5 files changed, 172 insertions(+), 8 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 82b42c958d1c..a8e9f0bd7801 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3150,6 +3150,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			Format: <bool>  (1/Y/y=enable, 0/N/n=disable)
 			default: disabled
 
+	printk.devkmsg={on,off,ratelimit}
+			Control writing to /dev/kmsg.
+			on - unlimited logging to /dev/kmsg from userspace
+			off - logging to /dev/kmsg disabled
+			ratelimit - ratelimit the logging
+			Default: ratelimit
+
 	printk.time=	Show timing data prefixed to each printk message line
 			Format: <bool>  (1/Y/y=enable, 0/N/n=disable)
 
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index a3683ce2a2f3..378421d423fa 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -752,6 +752,20 @@ send before ratelimiting kicks in.
 
 ==============================================================
 
+printk_devkmsg:
+
+Control the logging to /dev/kmsg from userspace:
+
+ratelimit: default, ratelimited
+on: unlimited logging to /dev/kmsg from userspace
+off: logging to /dev/kmsg disabled
+
+The kernel command line parameter printk.devkmsg= overrides this and is
+a one-time setting until next reboot: once set, it cannot be changed by
+this sysctl interface anymore.
+
+==============================================================
+
 randomize_va_space:
 
 This option can be used to select the type of process address
diff --git a/include/linux/printk.h b/include/linux/printk.h
index f4da695fd615..9c21535e5714 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -61,6 +61,11 @@ static inline void console_verbose(void)
 		console_loglevel = CONSOLE_LOGLEVEL_MOTORMOUTH;
 }
 
+/* strlen("ratelimit") + 1 */
+#define DEVKMSG_STR_MAX_SIZE 10
+extern char devkmsg_log_str[];
+struct ctl_table;
+
 struct va_format {
 	const char *fmt;
 	va_list *va;
@@ -172,6 +177,10 @@ extern int printk_delay_msec;
 extern int dmesg_restrict;
 extern int kptr_restrict;
 
+extern int
+devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write, void __user *buf,
+			  size_t *lenp, loff_t *ppos);
+
 extern void wake_up_klogd(void);
 
 char *log_buf_addr_get(void);
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 60cdf6386763..ea5944aa71d9 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -86,6 +86,112 @@ static struct lockdep_map console_lock_dep_map = {
 };
 #endif
 
+enum devkmsg_log_bits {
+	__DEVKMSG_LOG_BIT_ON = 0,
+	__DEVKMSG_LOG_BIT_OFF,
+	__DEVKMSG_LOG_BIT_LOCK,
+};
+
+enum devkmsg_log_masks {
+	DEVKMSG_LOG_MASK_ON             = BIT(__DEVKMSG_LOG_BIT_ON),
+	DEVKMSG_LOG_MASK_OFF            = BIT(__DEVKMSG_LOG_BIT_OFF),
+	DEVKMSG_LOG_MASK_LOCK           = BIT(__DEVKMSG_LOG_BIT_LOCK),
+};
+
+/* Keep both the 'on' and 'off' bits clear, i.e. ratelimit by default: */
+#define DEVKMSG_LOG_MASK_DEFAULT	0
+
+static unsigned int __read_mostly devkmsg_log = DEVKMSG_LOG_MASK_DEFAULT;
+
+static int __control_devkmsg(char *str)
+{
+	if (!str)
+		return -EINVAL;
+
+	if (!strncmp(str, "on", 2)) {
+		devkmsg_log = DEVKMSG_LOG_MASK_ON;
+		return 2;
+	} else if (!strncmp(str, "off", 3)) {
+		devkmsg_log = DEVKMSG_LOG_MASK_OFF;
+		return 3;
+	} else if (!strncmp(str, "ratelimit", 9)) {
+		devkmsg_log = DEVKMSG_LOG_MASK_DEFAULT;
+		return 9;
+	}
+	else
+		return -EINVAL;
+}
+
+static int __init control_devkmsg(char *str)
+{
+	if (__control_devkmsg(str) < 0)
+		return 1;
+
+	/*
+	 * Set sysctl string accordingly:
+	 */
+	if (devkmsg_log == DEVKMSG_LOG_MASK_ON) {
+		memset(devkmsg_log_str, 0, DEVKMSG_STR_MAX_SIZE);
+		strncpy(devkmsg_log_str, "on", 2);
+	} else if (devkmsg_log == DEVKMSG_LOG_MASK_OFF) {
+		memset(devkmsg_log_str, 0, DEVKMSG_STR_MAX_SIZE);
+		strncpy(devkmsg_log_str, "off", 3);
+	}
+	/* else "ratelimit" which is set by default. */
+
+	/*
+	 * Sysctl cannot change it anymore. The kernel command line setting of
+	 * this parameter is to force the setting to be permanent throughout the
+	 * runtime of the system. This is a precation measure against userspace
+	 * trying to be a smarta** and attempting to change it up on us.
+	 */
+	devkmsg_log |= DEVKMSG_LOG_MASK_LOCK;
+
+	return 0;
+}
+__setup("printk.devkmsg=", control_devkmsg);
+
+char devkmsg_log_str[DEVKMSG_STR_MAX_SIZE] = "ratelimit";
+
+int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
+			      void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	char old_str[DEVKMSG_STR_MAX_SIZE];
+	unsigned int old;
+	int err;
+
+	if (write) {
+		if (devkmsg_log & DEVKMSG_LOG_MASK_LOCK)
+			return -EINVAL;
+
+		old = devkmsg_log;
+		strncpy(old_str, devkmsg_log_str, DEVKMSG_STR_MAX_SIZE);
+	}
+
+	err = proc_dostring(table, write, buffer, lenp, ppos);
+	if (err)
+		return err;
+
+	if (write) {
+		err = __control_devkmsg(devkmsg_log_str);
+
+		/*
+		 * Do not accept an unknown string OR a known string with
+		 * trailing crap...
+		 */
+		if (err < 0 || (err + 1 != *lenp)) {
+
+			/* ... and restore old setting. */
+			devkmsg_log = old;
+			strncpy(devkmsg_log_str, old_str, DEVKMSG_STR_MAX_SIZE);
+
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
 /*
  * Number of registered extended console drivers.
  *
@@ -614,6 +720,7 @@ struct devkmsg_user {
 	u64 seq;
 	u32 idx;
 	enum log_flags prev;
+	struct ratelimit_state rs;
 	struct mutex lock;
 	char buf[CONSOLE_EXT_LOG_MAX];
 };
@@ -623,11 +730,24 @@ static ssize_t devkmsg_write(struct kiocb *iocb, struct iov_iter *from)
 	char *buf, *line;
 	int level = default_message_loglevel;
 	int facility = 1;	/* LOG_USER */
+	struct file *file = iocb->ki_filp;
+	struct devkmsg_user *user = file->private_data;
 	size_t len = iov_iter_count(from);
 	ssize_t ret = len;
 
-	if (len > LOG_LINE_MAX)
+	if (!user || len > LOG_LINE_MAX)
 		return -EINVAL;
+
+	/* Ignore when user logging is disabled. */
+	if (devkmsg_log & DEVKMSG_LOG_MASK_OFF)
+		return len;
+
+	/* Ratelimit when not explicitly enabled or when we're not booting. */
+	if ((system_state != SYSTEM_BOOTING) && !(devkmsg_log & DEVKMSG_LOG_MASK_ON)) {
+		if (!___ratelimit(&user->rs, current->comm))
+			return ret;
+	}
+
 	buf = kmalloc(len+1, GFP_KERNEL);
 	if (buf == NULL)
 		return -ENOMEM;
@@ -800,19 +920,24 @@ static int devkmsg_open(struct inode *inode, struct file *file)
 	struct devkmsg_user *user;
 	int err;
 
-	/* write-only does not need any file context */
-	if ((file->f_flags & O_ACCMODE) == O_WRONLY)
-		return 0;
+	if (devkmsg_log & DEVKMSG_LOG_MASK_OFF)
+		return -EPERM;
 
-	err = check_syslog_permissions(SYSLOG_ACTION_READ_ALL,
-				       SYSLOG_FROM_READER);
-	if (err)
-		return err;
+	/* write-only does not need any file context */
+	if ((file->f_flags & O_ACCMODE) != O_WRONLY) {
+		err = check_syslog_permissions(SYSLOG_ACTION_READ_ALL,
+					       SYSLOG_FROM_READER);
+		if (err)
+			return err;
+	}
 
 	user = kmalloc(sizeof(struct devkmsg_user), GFP_KERNEL);
 	if (!user)
 		return -ENOMEM;
 
+	ratelimit_default_init(&user->rs);
+	ratelimit_set_flags(&user->rs, RATELIMIT_MSG_ON_RELEASE);
+
 	mutex_init(&user->lock);
 
 	raw_spin_lock_irq(&logbuf_lock);
@@ -831,6 +956,8 @@ static int devkmsg_release(struct inode *inode, struct file *file)
 	if (!user)
 		return 0;
 
+	ratelimit_state_exit(&user->rs);
+
 	mutex_destroy(&user->lock);
 	kfree(user);
 	return 0;
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 87b2fc38398b..0d9fe53682bb 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -814,6 +814,13 @@ static struct ctl_table kern_table[] = {
 		.extra2		= &ten_thousand,
 	},
 	{
+		.procname	= "printk_devkmsg",
+		.data		= devkmsg_log_str,
+		.maxlen		= DEVKMSG_STR_MAX_SIZE,
+		.mode		= 0644,
+		.proc_handler	= devkmsg_sysctl_set_loglvl,
+	},
+	{
 		.procname	= "dmesg_restrict",
 		.data		= &dmesg_restrict,
 		.maxlen		= sizeof(int),
-- 
2.8.4

^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-07-16  6:18 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-16  6:17 [PATCH -v5 0/2] printk.devkmsg: Ratelimit it by default Borislav Petkov
2016-07-16  6:17 ` [PATCH -v5 1/2] ratelimit: Extend to print suppressed messages on release Borislav Petkov
2016-07-16  6:17 ` [PATCH -v5 2/2] printk: Add kernel parameter to control writes to /dev/kmsg Borislav Petkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).