From: Mark Rutland <mark.rutland@arm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Kees Cook <keescook@chromium.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	the arch/x86 maintainers <x86@kernel.org>
Subject: Re: [GIT pull] sched/core for v5.16-rc1
Date: Wed, 3 Nov 2021 13:52:49 +0000
Message-ID: <20211103135249.GA38767@C02TD0UTHF1T.local>
In-Reply-To: <YYD5ti23DQUjdQdz@hirez.programming.kicks-ass.net>

On Tue, Nov 02, 2021 at 09:41:26AM +0100, Peter Zijlstra wrote:
> On Mon, Nov 01, 2021 at 02:27:49PM -0700, Linus Torvalds wrote:
> > On Mon, Nov 1, 2021 at 2:01 PM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> > >
> > > Unwinders that need locks because they can do bad things if they are
> > > working on unstable data are EVIL and WRONG.
> > 
> > Note that this is fundamental: if you can fool an unwinder to do
> > something bad just because the data isn't stable, then the unwinder is
> > truly horrendously buggy, and not usable.
> 
> From what I've been led to believe, quite a few of our arch unwinders
> seem to fall in that category. They're mostly only happy when unwinding
> self and don't have many guardrails on otherwise.
> 
> > It could be a user process doing bad things to the user stack frame
> > from another thread when profiling is enabled.
> 
> Most of the unwinders seem to only care about the kernel stack. Not the
> user stack.

Yup; there are usually separate unwinders for user/kernel, since there
are different constraints (and potentially different ABIs for unwinding).

> > It could be debug code unwinding without locks for random reasons.
> > 
> > So I really don't like "take a lock for unwinding". It's a pretty bad
> > bug if the lock is required.
> 
> Fair enough; the x86 unwinder is pretty robust in this regard, but it
> seems to be one of few :/

FWIW, the arm64 kernel unwinder also shouldn't blow up (so long as the
target stack is pinned via try_get_stack() or similar).

However, depending on how the task reuses the stack, the results can be
entirely bogus rather than just stale, since data on the stack can look
like a kernel pointer (even if that's fairly unlikely). I'm happy to
believe that we don't care about that for wchan, but it's not something
I'd like to see spread.
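
To make "shouldn't blow up" concrete, here is a minimal user-space sketch
of a frame-pointer walk that only reads from within a pinned stack range.
This is not the arm64 implementation: struct frame_record, struct
stack_info, on_stack() and unwind_sketch() are hypothetical names made up
for illustration, and the {fp, pc} record layout is only an assumption
modelled on the usual frame-record convention.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical frame record: saved frame pointer followed by return address. */
struct frame_record {
	uintptr_t fp;
	uintptr_t pc;
};

/* Hypothetical bounds of the (already pinned) stack being unwound. */
struct stack_info {
	uintptr_t low;	/* lowest valid address */
	uintptr_t high;	/* one past the highest valid address */
};

/* Return 1 only if [addr, addr + size) lies entirely within the stack. */
static int on_stack(const struct stack_info *s, uintptr_t addr, size_t size)
{
	if (addr < s->low || addr > s->high)
		return 0;
	return s->high - addr >= size;
}

/*
 * Walk frame records starting at fp, storing up to max return addresses
 * in pcs. Every read is bounds-checked, and the frame pointer must move
 * strictly upwards, so the walk terminates even on corrupt data.
 * Returns the number of entries written.
 */
size_t unwind_sketch(const struct stack_info *s, uintptr_t fp,
		     uintptr_t *pcs, size_t max)
{
	size_t n = 0;

	while (n < max) {
		struct frame_record rec;

		/* Reject misaligned or out-of-range frame pointers. */
		if (fp % sizeof(uintptr_t) != 0 ||
		    !on_stack(s, fp, sizeof(rec)))
			break;

		/* Copy the record out; the stack may change under us. */
		memcpy(&rec, (const void *)fp, sizeof(rec));
		pcs[n++] = rec.pc;

		/* The next record must be strictly higher: no loops. */
		if (rec.fp <= fp)
			break;
		fp = rec.fp;
	}

	return n;
}
```

Note what the checks do and do not buy: the walk cannot read outside the
stack or loop forever, but any stale slot that passes the checks is still
reported as a return address, so the output can be junk.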

> > The "Link" in the commit also is entirely useless, pointing back to
> > the emailed submission of the patch, rather than any useful discussion
> > about why the patch happened.
> 
> So the initial discussion started here:
> 
>   https://lkml.kernel.org/r/20210923233105.4045080-1-keescook@chromium.org
> 
> A later thread that might also be of interest is:
> 
>   https://lkml.kernel.org/r/YWgyy+KvNLQ7eMIV@shell.armlinux.org.uk
> 
> Also, an even later thread proposes to push that lock into more stack
> unwinding functions (anything doing remote unwinds):
> 
>   https://lkml.kernel.org/r/20211022150933.883959987@infradead.org
> 
> But it seems to me you're thinking that's fundamentally buggered and
> people should instead invest in fixing their unwinders already.
> 
> Now, as is, this stuff is user exposed through /proc/$pid/{wchan,stack}
> and as such I think it *can* do with a few extra guardrails in generic
> code. OTOH, /proc/$pid/stack is root only.
> 
> Also, the remote stack-trace code is hooked into bpf (because
> kitchen-sink) and while I didn't look too hard, I can imagine it could
> be used to trigger crashes on our less robust architectures if prodded
> just right.

I do worry that remote unwinds from BPF are just silently generating
junk, but it's not clear to me what they're actually used for and how
much that matters. I don't understand why a remote unwind is necessary
at all.

> Should I care about all this from a generic code PoV, or simply let the
> architectures that got it 'wrong' deal with it?

FWIW I'm happy either way. There are some upcoming improvements to the
arm64 unwinder that currently conflict and I need to know whether to
wait and rebase or assume that we take those first.

Thanks,
Mark.
