All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kees Cook <keescook@chromium.org>
To: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>,
	Richard Weinberger <richard@nod.at>,
	Anton Ivanov <anton.ivanov@cambridgegreys.com>,
	Johannes Berg <johannes@sipsolutions.net>,
	linux-hardening@vger.kernel.org, linux-um@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] um: refactor deprecated strncpy to strtomem
Date: Tue, 8 Aug 2023 17:41:55 -0700	[thread overview]
Message-ID: <202308081708.D5ADC80F@keescook> (raw)
In-Reply-To: <CAFhGd8oSGJ5=fk58wOSgbuXX_VaP14q0Re=Xfom=rdOR6fT1rQ@mail.gmail.com>

On Tue, Aug 08, 2023 at 10:28:57AM -0700, Justin Stitt wrote:
> On Mon, Aug 7, 2023 at 4:40 PM Kees Cook <keescook@chromium.org> wrote:
> >
> > On Mon, Aug 07, 2023 at 03:36:55PM -0700, Bill Wendling wrote:
> > > On Mon, Aug 7, 2023 at 2:18 PM Justin Stitt <justinstitt@google.com> wrote:
> > > >
> > > > Use `strtomem` here since `console_buf` is not expected to be
> > > > NUL-terminated. We should probably also just use `MCONSOLE_MAX_DATA`
> >
> > How is it known that console_buf is not a C-string?
> There are a few clues that led me to believe console_buf was not a C-string:
> 1)  `n = min((size_t) len, ARRAY_SIZE(console_buf));` means that `n`
> can be equal to size of buffer which is then subsequently filled
> entirely by the `strncpy` invocation.
> 2) console_buf looks to be a circular buffer wherein once it's filled
> it starts again from the beginning.
> 3) ARRAY_SIZE is used on it as opposed to strlen or something like
> that (but not sure if ARRAY_SIZE is also used on C-strings to be fair)
> 4) It has `buf` in its name which I loosely associate with non
> C-strings char buffers.
> 
> All in all, it looks to be a non C-string but honestly it's hard to
> tell, especially since if it IS a C-string the previous implementation
> had some bugs with strncpy I believe.

I took a look at this. It's only used in that one function, and it's
always working from the start, and uses at max ARRAY_SIZE(console_buf),
as you mentioned. Then it's passed to mconsole_reply_len() with "len",
so we can examine that:

	do {
		...
                len = MIN(total, MCONSOLE_MAX_DATA - 1);
		...
                memcpy(reply.data, str, len);
                reply.data[len] = '\0';
                total -= len;
		...
	} while (total > 0);

It's sending it as NUL-terminated, possibly breaking it up. If this used
the whole MCONSOLE_MAX_DATA, it would send MCONSOLE_MAX_DATA - 1 bytes
followed by NUL, and then 1 byte, followed by NUL. :P

Anyway, point being, yes, it appears that console_buf is a nonstring.
But it's a weird one...

In your v2 patch, you use strtomem(), which is probably close, but in
looking at the implementation details here, I'm starting to think that
strtomem() needs to return the number of bytes copied. Initially I was
thinking it could actually just be replaced with memcpy(), but there
are some side-effects going on in the original code.

First:

static void console_write(..., const char *string, unsigned int len)
	...
                n = min((size_t) len, ARRAY_SIZE(console_buf));
                strncpy(console_buf, string, n);

The 'n' is being passed in, so it's possible that "string" has NUL-bytes
in it. (Though I would assume, unlikely -- I haven't tracked down the
callers of console_write() here...)

That means that strncpy() will stop the copy at the first NUL and
then NUL pad the rest to 'n', and that's what gets passed down to
mconsole_reply_len() (which is specifically using memcpy and will carry
those NUL bytes forward). So we should not lose the NUL-padding behavior.

Additionally, when "len" is smaller than MCONSOLE_MAX_DATA, the padding
will only go up to "len", not MCONSOLE_MAX_DATA. So there's a weird
behavioral difference here where the new code is doing more work, but,
okay fine.

I find this original code to be rather confused about whether it is
dealing with C-strings or byte arrays. :|

So now I wanted to find the callers of struct console::write. My
Coccinelle script:

@found@
struct console *CON;
@@

*       CON->write(...)

show hits in kernel/printk/printk.c, which is dealing with a
NUL-terminated string constructed by vsnprintf():

	console_emit_next_record

Technically there could be NUL bytes in such a string, but almost
everything else expects to be dealing with C-strings here. This looks
really fragile.

So, I'm back around to thinking this should just be memcpy():

-		strncpy(console_buf, string, n);
+		memcpy(console_buf, string, n);


-- 
Kees Cook

      reply	other threads:[~2023-08-09  0:42 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-07 21:17 [PATCH] um: refactor deprecated strncpy to strtomem Justin Stitt
2023-08-07 21:17 ` Justin Stitt
2023-08-07 22:36 ` Bill Wendling
2023-08-07 22:36   ` Bill Wendling
2023-08-07 23:39   ` Kees Cook
2023-08-08 17:28     ` Justin Stitt
2023-08-08 17:28       ` Justin Stitt
2023-08-09  0:41       ` Kees Cook [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=202308081708.D5ADC80F@keescook \
    --to=keescook@chromium.org \
    --cc=anton.ivanov@cambridgegreys.com \
    --cc=johannes@sipsolutions.net \
    --cc=justinstitt@google.com \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-um@lists.infradead.org \
    --cc=morbo@google.com \
    --cc=richard@nod.at \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.