linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Len Brown <lenb@kernel.org>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>,
	Borislav Petkov <bp@suse.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andy Lutomirski <luto@kernel.org>, X86 ML <x86@kernel.org>,
	"Brown, Len" <len.brown@intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	hjl.tools@gmail.com, Dave Martin <Dave.Martin@arm.com>,
	jannh@google.com, mpe@ellerman.id.au, carlos@redhat.com,
	"bothersome-borer for tony.luck@intel.com" <tony.luck@intel.com>,
	"Ravi V. Shankar" <ravi.v.shankar@intel.com>,
	libc-alpha@sourceware.org, linux-arch@vger.kernel.org,
	linux-api@vger.kernel.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size
Date: Sat, 20 Mar 2021 18:32:00 +0100	[thread overview]
Message-ID: <20210320173200.GA4153106@gmail.com> (raw)
In-Reply-To: <CAJvTdKnpWL8y4N_BrCiK7fU0UXERwuuM8o84LUpp7Watxd8STw@mail.gmail.com>


* Len Brown <lenb@kernel.org> wrote:

> On Wed, Mar 17, 2021 at 6:45 AM Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > * Ingo Molnar <mingo@kernel.org> wrote:
> >
> > >
> > > * Chang S. Bae <chang.seok.bae@intel.com> wrote:
> > >
> > > > During signal entry, the kernel pushes data onto the normal userspace
> > > > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > > > which has grown over time as new features and larger registers have been
> > > > added to the architecture.
> > > >
> > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > > > constant indicates to userspace how much data the kernel expects to push on
> > > > the user stack, [2][3].
> > > >
> > > > However, this constant is much too small and does not reflect recent
> > > > additions to the architecture. For instance, when AVX-512 states are in
> > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> > > >
> > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > > > cause user stack overflow when delivering a signal.
> > >
> > > >   uapi: Define the aux vector AT_MINSIGSTKSZ
> > > >   x86/signal: Introduce helpers to get the maximum signal frame size
> > > >   x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ
> > > >   selftest/sigaltstack: Use the AT_MINSIGSTKSZ aux vector if available
> > > >   x86/signal: Detect and prevent an alternate signal stack overflow
> > > >   selftest/x86/signal: Include test cases for validating sigaltstack
> > >
> > > So this looks really complicated, is this justified?
> > >
> > > Why not just internally round up sigaltstack size if it's too small?
> > > This would be more robust, as it would fix applications that use
> > > MINSIGSTKSZ but don't use the new AT_MINSIGSTKSZ facility.
> > >
> > > I.e. does AT_MINSIGSTKSZ have any other uses than avoiding the
> > > segfault if MINSIGSTKSZ is used to create a small signal stack?
> >
> > I.e. if the kernel sees a too small ->ss_size in sigaltstack() it
> > would ignore ->ss_sp and mmap() a new sigaltstack instead and use that
> > for the signal handler stack.
> >
> > This would automatically make MINSIGSTKSZ - and other too small sizes
> > work today, and in the future.
> >
> > But the question is, is there user-space usage of sigaltstacks that
> > relies on controlling or reading the contents of the stack?
> >
> > longjmp using programs perhaps?
> 
> For the legacy binary that requests a too-small sigaltstack, there are
> several choices:
> 
> We could detect the too-small stack at sigaltstack(2) invocation and
> return an error.
> This results in two deal-killing problems:
> First, some applications don't check the return value, so the check
> would be fruitless.
> Second, those that check and error-out may be programs that never
> actually take the signal, and so we'd be causing a dusty binary to
> exit, when it didn't exit on another system, or another kernel.
> 
> Or we could detect the too small stack at signal registration time.
> This has the same two deal-killers as above.
> 
> Then there is the approach in this patch-set, which detects an
> imminent stack overflow at run time.
> It has neither of the two problems above, and the benefit that we now
> prevent data corruption
> that could have been happening on some systems already today.  The
> down side is that the dusty binary
> that does request the too-small stack can now die at run time.
> 
> So your idea of recognizing the problem and conjuring up a 
> sufficient stack is compelling, since it would likely "just work", 
> no matter how dumb the program. But where would the the sufficient 
> stack come from -- is this a new kernel buffer, or is there a way to 
> abscond some user memory?  I would expect a signal handler to look 
> at the data on its stack and nobody else will look at that stack.  
> But this is already an unreasonable program for allocating a special 
> signal stack in the first place :-/ So yes, one could imagine the 
> signal handler could longjump instead of gracefully completing, and 
> if this specially allocated signal stack isn't where the user 
> planned, that could be trouble.

We could mmap() (implicitly) new anonymous memory - but I can see why 
this is probably more trouble than worth...

> Another idea we discussed was to detect the potential overflow at 
> run-time, and instead of killing the process, just push the signal 
> onto the regular user stack. this might actually work, but it is 
> sort of devious; and it would not work in the case where the user 
> overflowed their regular stack already, which may be the most 
> (only?) compelling reason that they allocated and declared a special 
> sigaltstack in the first place...

Yeah, this doesn't sound deterministic enough.

Ok, thanks for the detailed answers - I withdraw my objections, let's 
proceed with the approach you are proposing?

Thanks,

	Ingo

      reply	other threads:[~2021-03-20 17:33 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-16  6:52 [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size Chang S. Bae
2021-03-16  6:52 ` [PATCH v7 1/6] uapi: Define the aux vector AT_MINSIGSTKSZ Chang S. Bae
2021-03-16  6:52 ` [PATCH v7 2/6] x86/signal: Introduce helpers to get the maximum signal frame size Chang S. Bae
2021-03-16  6:52 ` [PATCH v7 3/6] x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ Chang S. Bae
2021-03-16  6:52 ` [PATCH v7 4/6] selftest/sigaltstack: Use the AT_MINSIGSTKSZ aux vector if available Chang S. Bae
2021-03-16  6:52 ` [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow Chang S. Bae
2021-03-16 11:52   ` Borislav Petkov
2021-03-16 18:26     ` Bae, Chang Seok
2021-03-25 16:20       ` Borislav Petkov
2021-03-25 17:21         ` Bae, Chang Seok
2021-03-25 20:14           ` Florian Weimer
2021-03-25 18:13   ` Andy Lutomirski
2021-03-25 18:54     ` Borislav Petkov
2021-03-25 21:11       ` Bae, Chang Seok
2021-03-25 21:27         ` Borislav Petkov
2021-03-26  4:56       ` Andy Lutomirski
2021-03-26 10:30         ` Borislav Petkov
2021-04-12 22:30           ` Bae, Chang Seok
2021-04-14 10:12             ` Borislav Petkov
2021-04-14 11:30               ` Florian Weimer
2021-04-14 12:06                 ` Borislav Petkov
2021-05-03  5:30                   ` Florian Weimer
2021-05-03 11:17                     ` Borislav Petkov
2021-03-26  4:58     ` Andy Lutomirski
2021-03-16  6:52 ` [PATCH v7 6/6] selftest/x86/signal: Include test cases for validating sigaltstack Chang S. Bae
2021-03-17 10:06 ` [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size Ingo Molnar
2021-03-17 10:44   ` Ingo Molnar
2021-03-19 18:12     ` Len Brown
2021-03-20 17:32       ` Ingo Molnar [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210320173200.GA4153106@gmail.com \
    --to=mingo@kernel.org \
    --cc=Dave.Martin@arm.com \
    --cc=bp@suse.de \
    --cc=carlos@redhat.com \
    --cc=chang.seok.bae@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=hjl.tools@gmail.com \
    --cc=jannh@google.com \
    --cc=len.brown@intel.com \
    --cc=lenb@kernel.org \
    --cc=libc-alpha@sourceware.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=ravi.v.shankar@intel.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).