All of lore.kernel.org
 help / color / mirror / Atom feed
From: Petr Mladek <pmladek@suse.com>
To: Chris Down <chris@chrisdown.name>
Cc: linux-kernel@vger.kernel.org, Jessica Yu <jeyu@kernel.org>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	John Ogness <john.ogness@linutronix.de>,
	Steven Rostedt <rostedt@goodmis.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Kees Cook <keescook@chromium.org>,
	Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	kernel-team@fb.com
Subject: Re: [PATCH v6 3/4] printk: Userspace format indexing support
Date: Tue, 25 May 2021 17:06:46 +0200	[thread overview]
Message-ID: <YK0ShqdTSVTYXGPF@alley> (raw)
In-Reply-To: <05d25c65d3f5149c1e8537f74041a7a46bd489d6.1621338324.git.chris@chrisdown.name>

On Tue 2021-05-18 13:00:43, Chris Down wrote:
> We have a number of systems industry-wide that have a subset of their
> functionality that works as follows:
> 
> 1. Receive a message from local kmsg, serial console, or netconsole;
> 2. Apply a set of rules to classify the message;
> 3. Do something based on this classification (like scheduling a
>    remediation for the machine), rinse, and repeat.
> 
> This provides a solution to the issue of silently changed or deleted
> printks: we record pointers to all printk format strings known at
> compile time into a new .printk_index section, both in vmlinux and
> modules. At runtime, this can then be iterated by looking at
> <debugfs>/printk/index/<module>, which emits the following format, both
> readable by humans and able to be parsed by machines:
> 
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index f589b8b60806..40b97d538e12 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> +#ifdef CONFIG_PRINTK_INDEX
> +struct pi_entry {
> +	const char *fmt;
> +	const char *func;
> +	const char *file;
> +	unsigned int line;
> +
> +	/*
> +	 * While printk and pr_* have the level stored in the string at compile
> +	 * time, some subsystems dynamically add it at runtime through the
> +	 * format string. For these dynamic cases, we allow the subsystem to
> +	 * tell us the level at compile time.
> +	 *
> +	 * NULL indicates that the level, if any, is stored in fmt.
> +	 */
> +	const char *level;
> +
> +	/*
> +	 * The format string before and after the fmt, or at least something
> +	 * that would match whatever it could be.
> +	 */
> +	const char *pre_fmt;
> +	const char *post_fmt;

Are you aware of any subsystem having some extra post_fmt, please?

I think about calling this "subsys_fmt" or "subsys_prefix".

Well, the main question is whether we need to store this
at all. Does it bring any useful information?

Note that we do not store the value defined by pr_fmt() which
is yet another automatically added prefix.

I would personally omit these prefixes. The most important
information is:

  + fmt: is the pattern that the system monitors would most likely
	look for.

  + level: says whether the string will appear on console that
      shows only messages below a certain console_loglevel

  + func, file, line: help to find the string in the kernel sources.
      It is useful when investigating what has changed.

For example, pre_fmt="%s %s:" used by dev_printk (4th patch)
is not much useful.


> +} __packed;
> +
> +#define __printk_index_emit(_fmt, _level, _pre_fmt, _post_fmt)		       \
> +	({								       \
> +		if (__builtin_constant_p(_fmt) && __builtin_constant_p(_level)) { \
> +			/*
> +			 * The compiler may not be able to eliminate the
> +			 * non-constant variants of _fmt and _level, so we need
> +			 * to make sure that it doesn't see any hypothetical
> +			 * assignment for non-constants even though this is
> +			 * already inside the __builtin_constant_p guard.
> +			 */						       \
> +			static const struct pi_entry _entry		       \
> +			__used = {		       \
> +				.fmt = __builtin_constant_p(_fmt) ? (_fmt) : NULL, \
> +				.func = __func__,			       \
> +				.file = __FILE__,			       \
> +				.line = __LINE__,			       \
> +				.level = __builtin_constant_p(_level) ? (_level) : NULL, \
> +				.pre_fmt = _pre_fmt,			       \

Should this also be?

	.pre_fmt = __builtin_constant_p(_pre_fmt) ? _pre_fmt : NULL

> +				.post_fmt = _post_fmt,				\

Similar here.

Well, I would remove .pre_fmt and .post_fmt completely.

> +			};						       \
> +			static const struct pi_entry *_entry_ptr	       \
> +			__used __section(".printk_index") = &_entry;	       \
> +		}							       \
> +	})
> +
> +#else /* !CONFIG_PRINTK_INDEX */
> +#define __printk_index_emit(...) do {} while (0)
> +#endif /* CONFIG_PRINTK_INDEX */
> +
> +/*
> + * Some subsystems have their own custom printk that applies a va_format to a
> + * generic format, for example, to include a device number or other metadata
> + * alongside the format supplied by the caller.
> + *
> + * In order to store these in the way they would be emitted by the printk
> + * infrastructure, the subsystem provides us with the start, fixed string, and
> + * any subsequent text in the format string.
> + *
> + * We take a variable argument list as pr_fmt/dev_fmt/etc are sometimes passed
> + * as multiple arguments (eg: `"%s: ", "blah"`), and we must only take the
> + * first one.
> + *
> + * pre and post must be known at compile time, or compilation will fail (since
> + * this is a mistake). If fmt or level is not known at compile time, no index
> + * entry will be made (since this can legitimately happen).

How is this achieved, please? 

__printk_index_emit() creates the entry when the following check is true:

       if (__builtin_constant_p(_fmt) && __builtin_constant_p(_level))

It checks neither _pre_fmt nor _post_fmt.


> + */
> +#define printk_index_subsys_emit(pre, post, level, fmt, ...) \
> +	__printk_index_emit(fmt, level, pre, post)
> +
> +#define printk_index_wrap(_p_func, _fmt, ...)				       \
> +	({								       \
> +		__printk_index_emit(_fmt, NULL, NULL, NULL);		       \
> +		_p_func(_fmt, ##__VA_ARGS__);				       \
> +	})
> +
> +
> --- /dev/null
> +++ b/kernel/printk/index.c
> @@ -0,0 +1,198 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * printk/index.c - Userspace indexing of printk formats
> + */
> +
> +#include <linux/debugfs.h>
> +#include <linux/module.h>
> +#include <linux/printk.h>
> +#include <linux/slab.h>
> +#include <linux/string_helpers.h>
> +
> +#include "internal.h"
> +
> +extern struct pi_entry *__start_printk_index[];
> +extern struct pi_entry *__stop_printk_index[];
> +
> +/* The base dir for module formats, typically debugfs/printk/index/ */
> +static struct dentry *dfs_index;
> +
> +#ifdef CONFIG_MODULES
> +static const char *pi_get_module_name(struct module *mod)
> +{
> +	return mod ? mod->name : "vmlinux";
> +}
> +
> +void pi_create_file(struct module *mod);
> +void pi_remove_file(struct module *mod);

I would personally move all this module-notify-related code right
above pi_init() definition and avoid any forward declarations.


> +
> +static int pi_module_notify(struct notifier_block *nb, unsigned long op,
> +			    void *data)
> +{
> +	struct module *mod = data;
> +
> +	switch (op) {
> +	case MODULE_STATE_COMING:
> +		pi_create_file(mod);
> +		break;
> +	case MODULE_STATE_GOING:
> +		pi_remove_file(mod);
> +		break;
> +	}
> +
> +	return NOTIFY_OK;
> +}

[...]

> +static int pi_show(struct seq_file *s, void *v)
> +{
> +	const struct pi_entry *entry = v;
> +	int level = LOGLEVEL_DEFAULT;
> +	enum printk_info_flags flags = 0;
> +	u16 prefix_len = 0;
> +
> +	if (v == SEQ_START_TOKEN) {
> +		seq_puts(s,
> +			 "# <level[,flags]> filename:line function \"format\"\n");
> +		return 0;
> +	}
> +
> +	if (!entry->fmt)
> +		return 0;

Is this just a paranoid check or is it a valid case?

Is is possible to update __printk_index_emit() to do not create
entries with fmt = NULL ?

We should either remove the above check or add a comment
explaining why it is necessary.


> +
> +	if (entry->level)
> +		printk_parse_prefix(entry->level, &level, &flags);
> +	else
> +		prefix_len = printk_parse_prefix(entry->fmt, &level, &flags);

This is missing:

	if (level == LOGLEVEL_DEFAULT)
		level = default_message_loglevel;

Without it, it produces lines with loglevel <-1>, for example:

<-1> init/do_mounts.c:457 mount_block_root "\n"
<-1> init/do_mounts.c:456 mount_block_root " %s"
<-1> init/do_mounts.c:454 mount_block_root "No filesystem could mount root, tried: "



> +
> +	seq_printf(s, "<%d%s> %s:%d %s \"",
> +			level, flags & LOG_CONT ? ",c" : "", entry->file,
> +			entry->line, entry->func);

It produces the following for continuous lines:

<-1,c> arch/x86/events/core.c:2101 init_hw_perf_events "%s PMU driver.\n"
<-1,c> arch/x86/events/core.c:2091 init_hw_perf_events "no PMU driver, software events only.\n"

But we should print the loglevel only when explicitly defined.
So I would do it the following way:

	if (flags & LOG_CONT) {
		if (level == LOGLEVEL_DEFAULT)
			seq_printf(s, "<c>");
		else
			seq_printf(s, "<%d,c>", level);
	} else {
		if (level == LOGLEVEL_DEFAULT)
			level = default_message_loglevel;
		seq_printf(s, "<%d>", level);
	}

	seq_printf(s, " %s:%d %s \"", entry->file, entry->line, entry->func);

> +	if (entry->pre_fmt)
> +		seq_escape_printf_format(s, entry->pre_fmt);
> +	seq_escape_printf_format(s, entry->fmt + prefix_len);
> +	if (entry->post_fmt)
> +		seq_escape_printf_format(s, entry->post_fmt);
> +	seq_puts(s, "\"\n");
> +
> +	return 0;
> +}
> +
> diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
> index 51615c909b2f..dbeeb2fa74ab 100644
> --- a/kernel/printk/internal.h
> +++ b/kernel/printk/internal.h
> @@ -4,6 +4,8 @@
>   */
>  #include <linux/percpu.h>
>  
> +#include "printk_ringbuffer.h"
> +
>  #ifdef CONFIG_PRINTK
>  
>  #define PRINTK_SAFE_CONTEXT_MASK	0x007ffffff
> @@ -51,6 +53,8 @@ bool printk_percpu_data_ready(void);
>  
>  void defer_console_output(void);
>  
> +u16 printk_parse_prefix(const char *text, int *level,
> +			enum printk_info_flags *flags);

It would be nice to do all this parse_prefix() shuffling and
renaming in another preparatoty patch. I mean to have
two preparaty patches:

    + patch renaming enum log_flags -> printk_info_flags
    + patch renaming parse_prefix -> printk_parse_prefix

That said, I do not resit on it. It is just nice to have ;-)

>  #else
>  
>  /*

I am really happy with the progress. The remaining problems are mostly
with the support for the subsystem-specific printk() callers that was
added in this revision.

Best Regards,
Petr

  parent reply	other threads:[~2021-05-25 15:06 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-18 12:00 [PATCH v6 0/4] printk: Userspace format indexing support Chris Down
2021-05-18 12:00 ` [PATCH v6 1/4] string_helpers: Escape double quotes in escape_special Chris Down
2021-05-18 13:10   ` Andy Shevchenko
2021-05-18 14:10     ` Chris Down
2021-05-25 10:17   ` Petr Mladek
2021-05-18 12:00 ` [PATCH v6 2/4] printk: Straighten out log_flags into printk_info_flags Chris Down
2021-05-25 10:33   ` Petr Mladek
2021-05-25 11:35     ` John Ogness
2021-05-26  7:31       ` Petr Mladek
2021-05-26  8:39         ` John Ogness
2021-05-26  9:25           ` Petr Mladek
2021-06-01 15:16             ` Chris Down
2021-05-18 12:00 ` [PATCH v6 3/4] printk: Userspace format indexing support Chris Down
2021-05-18 13:30   ` Andy Shevchenko
2021-05-18 14:07     ` Chris Down
2021-05-18 16:00       ` Andy Shevchenko
2021-05-18 16:28         ` Chris Down
2021-05-18 16:59           ` Chris Down
2021-05-19  6:59         ` Rasmus Villemoes
2021-05-20  9:25           ` Andy Shevchenko
2021-05-25 15:19             ` Petr Mladek
2021-05-25 15:06   ` Petr Mladek [this message]
2021-06-01 15:15     ` Chris Down
2021-06-04 10:19       ` Petr Mladek
2021-06-04 11:50         ` Chris Down
2021-05-18 12:00 ` [PATCH v6 4/4] printk: index: Add indexing support to dev_printk Chris Down
2021-05-26  8:57   ` Petr Mladek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YK0ShqdTSVTYXGPF@alley \
    --to=pmladek@suse.com \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=chris@chrisdown.name \
    --cc=gregkh@linuxfoundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=jeyu@kernel.org \
    --cc=john.ogness@linutronix.de \
    --cc=keescook@chromium.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@rasmusvillemoes.dk \
    --cc=rostedt@goodmis.org \
    --cc=sergey.senozhatsky@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.