From: Ingo Molnar <mingo@kernel.org>
To: Len Brown <lenb@kernel.org>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>,
Borislav Petkov <bp@suse.de>,
Thomas Gleixner <tglx@linutronix.de>,
Andy Lutomirski <luto@kernel.org>, X86 ML <x86@kernel.org>,
"Brown, Len" <len.brown@intel.com>,
Dave Hansen <dave.hansen@intel.com>,
hjl.tools@gmail.com, Dave Martin <Dave.Martin@arm.com>,
jannh@google.com, mpe@ellerman.id.au, carlos@redhat.com,
tony.luck@intel.com,
"Ravi V. Shankar" <ravi.v.shankar@intel.com>,
libc-alpha@sourceware.org, linux-arch@vger.kernel.org,
linux-api@vger.kernel.org,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size
Date: Sat, 20 Mar 2021 18:32:00 +0100
Message-ID: <20210320173200.GA4153106@gmail.com>
In-Reply-To: <CAJvTdKnpWL8y4N_BrCiK7fU0UXERwuuM8o84LUpp7Watxd8STw@mail.gmail.com>

* Len Brown <lenb@kernel.org> wrote:

> On Wed, Mar 17, 2021 at 6:45 AM Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > * Ingo Molnar <mingo@kernel.org> wrote:
> >
> > >
> > > * Chang S. Bae <chang.seok.bae@intel.com> wrote:
> > >
> > > > During signal entry, the kernel pushes data onto the normal userspace
> > > > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > > > which has grown over time as new features and larger registers have been
> > > > added to the architecture.
> > > >
> > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > > > constant indicates to userspace how much data the kernel expects to push on
> > > > the user stack, [2][3].
> > > >
> > > > However, this constant is much too small and does not reflect recent
> > > > additions to the architecture. For instance, when AVX-512 states are in
> > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> > > >
> > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > > > cause user stack overflow when delivering a signal.
> > >
> > > > uapi: Define the aux vector AT_MINSIGSTKSZ
> > > > x86/signal: Introduce helpers to get the maximum signal frame size
> > > > x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ
> > > > selftest/sigaltstack: Use the AT_MINSIGSTKSZ aux vector if available
> > > > x86/signal: Detect and prevent an alternate signal stack overflow
> > > > selftest/x86/signal: Include test cases for validating sigaltstack
> > >
> > > So this looks really complicated, is this justified?
> > >
> > > Why not just internally round up sigaltstack size if it's too small?
> > > This would be more robust, as it would fix applications that use
> > > MINSIGSTKSZ but don't use the new AT_MINSIGSTKSZ facility.
> > >
> > > I.e. does AT_MINSIGSTKSZ have any other uses than avoiding the
> > > segfault if MINSIGSTKSZ is used to create a small signal stack?
> >
> > I.e. if the kernel sees a too small ->ss_size in sigaltstack() it
> > would ignore ->ss_sp and mmap() a new sigaltstack instead and use that
> > for the signal handler stack.
> >
> > This would automatically make MINSIGSTKSZ - and other too small sizes
> > work today, and in the future.
> >
> > But the question is, is there user-space usage of sigaltstacks that
> > relies on controlling or reading the contents of the stack?
> >
> > longjmp using programs perhaps?
>
> For the legacy binary that requests a too-small sigaltstack, there are
> several choices:
>
> We could detect the too-small stack at sigaltstack(2) invocation and
> return an error.
> This results in two deal-killing problems:
> First, some applications don't check the return value, so the check
> would be fruitless.
> Second, those that check and error-out may be programs that never
> actually take the signal, and so we'd be causing a dusty binary to
> exit, when it didn't exit on another system, or another kernel.
>
> Or we could detect the too small stack at signal registration time.
> This has the same two deal-killers as above.
>
> Then there is the approach in this patch-set, which detects an
> imminent stack overflow at run time.
> It has neither of the two problems above, and has the added benefit
> that we now prevent data corruption that could already have been
> happening on some systems today. The downside is that the dusty
> binary that does request the too-small stack can now die at run time.
>
> So your idea of recognizing the problem and conjuring up a
> sufficient stack is compelling, since it would likely "just work",
> no matter how dumb the program. But where would the sufficient
> stack come from -- is this a new kernel buffer, or is there a way to
> abscond some user memory? I would expect a signal handler to look
> at the data on its stack and nobody else will look at that stack.
> But this is already an unreasonable program for allocating a special
> signal stack in the first place :-/ So yes, one could imagine the
> signal handler could longjmp() instead of gracefully completing, and
> if this specially allocated signal stack isn't where the user
> planned, that could be trouble.

We could mmap() (implicitly) new anonymous memory - but I can see why
this is probably more trouble than it's worth...

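For comparison, doing the equivalent explicitly from userspace is
short. A minimal sketch, not code from the series: it assumes glibc's
getauxval() and an AT_MINSIGSTKSZ value of 51 (what the mainline uapi
headers use, an assumption as far as this thread goes), and
min_sigstack_size() / install_altstack() are made-up helper names:

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/auxv.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef AT_MINSIGSTKSZ
#define AT_MINSIGSTKSZ 51	/* assumed value; older libc headers lack it */
#endif

/* Hypothetical helper: the larger of the aux vector value and MINSIGSTKSZ. */
size_t min_sigstack_size(void)
{
	unsigned long at = getauxval(AT_MINSIGSTKSZ);	/* 0 if the kernel gave none */
	size_t legacy = MINSIGSTKSZ;

	return at > legacy ? (size_t)at : legacy;
}

/*
 * Hypothetical helper: map anonymous memory and register it as the
 * alternate signal stack. 'extra' is headroom for the handler itself.
 * Returns 0 on success, -1 on failure.
 */
int install_altstack(size_t extra)
{
	stack_t ss;

	ss.ss_size = min_sigstack_size() + extra;
	ss.ss_flags = 0;
	ss.ss_sp = mmap(NULL, ss.ss_size, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
	if (ss.ss_sp == MAP_FAILED)
		return -1;

	return sigaltstack(&ss, NULL);
}
```

A program would call install_altstack() once per thread before
installing SA_ONSTACK handlers. This only helps binaries that get
rebuilt, of course, not the dusty ones.
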
> Another idea we discussed was to detect the potential overflow at
> run-time, and instead of killing the process, just push the signal
> onto the regular user stack. This might actually work, but it is
> sort of devious; and it would not work in the case where the user
> overflowed their regular stack already, which may be the most
> (only?) compelling reason that they allocated and declared a special
> sigaltstack in the first place...

Yeah, this doesn't sound deterministic enough.

Ok, thanks for the detailed answers - I withdraw my objections, let's
proceed with the approach you are proposing?
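
To spell out what the run-time detection amounts to, here is a rough
userspace model of the check. It is a sketch under my own assumptions,
not the kernel code from patch 5/6: frame_fits_on_altstack() is a
hypothetical name, and it assumes a downward-growing stack with the
frame pushed below sp:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: would a signal frame of frame_size bytes, pushed
 * below sp, stay inside the alternate stack [ss_sp, ss_sp + ss_size)?
 */
bool frame_fits_on_altstack(uintptr_t ss_sp, size_t ss_size,
			    uintptr_t sp, size_t frame_size)
{
	/* sp must lie within the alternate stack, top address inclusive. */
	if (sp < ss_sp || sp - ss_sp > ss_size)
		return false;

	/* The room below sp is sp - ss_sp; the frame must not exceed it. */
	return frame_size <= sp - ss_sp;
}
```

With the numbers from the cover letter, a 3.5KB AVX-512 frame on a
fresh 2KB MINSIGSTKSZ stack fails this check, which is the case where
the signal would not be delivered onto that stack.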

Thanks,

	Ingo