linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joe Perches <joe@perches.com>
To: Chris Down <chris@chrisdown.name>
Cc: Petr Mladek <pmladek@suse.com>,
	linux-kernel@vger.kernel.org,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	John Ogness <john.ogness@linutronix.de>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	kernel-team@fb.com, Steven Rostedt <rostedt@goodmis.org>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Jason Baron <jbaron@akamai.com>,
	Kees Cook <keescook@chromium.org>,
	linux-api@vger.kernel.org
Subject: Re: [PATCH] printk: Userspace format enumeration support
Date: Sun, 07 Feb 2021 06:58:28 -0800	[thread overview]
Message-ID: <bd53d894b7bb0fcaa520282a04a6487828282695.camel@perches.com> (raw)
In-Reply-To: <YB/1iHwwTi9dOv38@chrisdown.name>

On Sun, 2021-02-07 at 14:13 +0000, Chris Down wrote:
> Joe Perches writes:
> > > There are certainly printks which can't be trivially monitored using the printk
> > > format alone, but the vast majority of the ones that are monitored _do_ have
> > > meaningful formats and can be monitored over time. No solution to this is going
> > > to catch every single case, especially when so much of the information can be
> > > generated dyamically, but this patchset still goes a long way to making printk
> > > monitoring more tractable for use cases like the one described in the
> > > changelog.
> > 
> > For the _vast_ majority of printk strings, this can easily be found
> > and compared using a trivial modification to strings.
> 
> There are several issues with your proposed approach that make it unsuitable 
> for use as part of a reliable production environment:
> 
> 1. It misses printk() formats without KERN_SOH
> 
> printk() formats without KERN_SOH are legal and use MESSAGE_LOGLEVEL_DEFAULT.  
> On my test kernel, your proposed patch loses >5% of printk formats -- over 200 
> messages -- due to this, including critical ones like those about hardware or 
> other errors.

There are _very_ few of those printks without KERN_<level> and those
very few are not generally being changed.

> 2. Users don't always have the kernel image available
> 
> Many of our machines and many of the machines of others like us do not boot 
> using local storage, but instead use PXE or other technologies where the kernel 
> may not be stored during runtime.
> 
> As is described in the changelog, it is necessary to be able to vary 
> remediations not only based on what is already in /dev/kmsg, but also to be 
> able to make decisions about our methodology based on what's _supported_ in the 
> running kernel at runtime, and your proposed approach makes this not viable.

Indirection would alway work.

You could load a separate file with output strings along with your
kernel image.

> 3. `KERN_SOH + level' can appear in other places than just printk strings
> 
> KERN_SOH is just ASCII '\001' -- it's not distinctive or unique, even when 
> paired with a check for something that looks like a level after it. For this 
> reason, your proposed patch results in a non-trivial amount of non-printk 
> related garbage in its output. For example:
> 
>      % binutils/strings -k /tmp/vmlinux | head -5
>      3L)s
>      3L)s
>      c,[]A\
>      c(L)c
>      d$pL)d$`u4
> 
> Fundamentally, one cannot use a tool which just determines whether something is 
> printable to determine semantic intent.

$ kernel_strings --kernel --section ".rodata" vmlinux

I got exactly 0.

> 4. strings(1) output cannot differentiate embedded newlines and new formats
> 
> The following has exactly the same output from strings(1), but will manifest 
> completely differently at printk() time:
> 
>      printk(KERN_ERR "line one\nline two\nline three\n");
>      printk("line four\n");

This is not the preferred output style and is only done in old and
unchanging code.

Your use case in your commit log is looking for _changed_ formats.

On Thu, 2021-02-04 at 15:37 +0000, Chris Down wrote:
> This patch provides a solution to the issue of silently changed or
> deleted printks:

Exactly _how_ many of these use cases do you think exist?

The generally preferred style for the example above would be:

	pr_err("line one\n");
	pr_err("line two\n");
	pr_err("line three\n");
	pr_err("line four\n");

> The originally posted patch _does_ differentiate between these cases, using \0 
> as a reliable separator. Its outputs are, respectively:
> 
>      \0013line one\nline two\nline three\0\nline four\n\0
>      \0013line one\nline two\n\0line three\nline four\n\0
> 
> This isn't just a theoretical concern -- there are plenty of places which use 
> multiline printks, and we must be able to distinguish between that and 
> _multiple_ printks.

Just like there are many places that use buffered printks as the
example I gave earlier.  None of which your proposed solution would find.

> 5. strings(1) is not contextually aware, and cannot be made to act as if it is
> 
> strings has no idea about what it is reading, which is why it is more than 
> happy to output the kind of meaningless output shown in #3. There are plenty of 
> places across the kernel where there might be a sequence of bytes which the 
> strings utility happens to interpret as being semantically meaningful, but in 
> reality just happens to be an unrelated sequence of coincidentally printable 
> bytes that just happens to contain a \001.
> 
> I appreciate your willingness to propose other solutions, but for these 
> reasons, the proposed strings(1) patch would not suffice as an interface for 
> printk enumeration.

I think you are on a path to try to make printk output immutable.
I think that's a _very_ bad path.

I also think this is adding needless complexity.

A possible complexity I would like to support would be optionally
compressing printk format strings at compile time and uncompressing
them at use time.



  reply	other threads:[~2021-02-07 14:59 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-04 15:37 [PATCH] printk: Userspace format enumeration support Chris Down
2021-02-05  0:47 ` Chris Down
2021-02-05 14:26 ` Chris Down
2021-02-05 16:42 ` Petr Mladek
2021-02-05 17:47   ` Steven Rostedt
2021-02-05 22:45     ` Chris Down
2021-02-05 22:49       ` Steven Rostedt
2021-02-06  7:13       ` Greg Kroah-Hartman
2021-02-06 12:44         ` Chris Down
2021-02-05 22:25   ` Chris Down
2021-02-06 17:57     ` Joe Perches
2021-02-06 21:21       ` Chris Down
2021-02-07  4:41         ` Joe Perches
2021-02-07 14:13           ` Chris Down
2021-02-07 14:58             ` Joe Perches [this message]
2021-02-07 16:13               ` Chris Down
2021-02-07 16:53                 ` Chris Down
2021-02-07 22:21 ` kernel test robot
2021-02-08  1:13   ` Chris Down

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bd53d894b7bb0fcaa520282a04a6487828282695.camel@perches.com \
    --to=joe@perches.com \
    --cc=adobriyan@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=chris@chrisdown.name \
    --cc=gregkh@linuxfoundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=jbaron@akamai.com \
    --cc=john.ogness@linutronix.de \
    --cc=keescook@chromium.org \
    --cc=kernel-team@fb.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=sergey.senozhatsky@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).