linux-man.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: enh <enh@google.com>
To: Florian Weimer <fw@deneb.enyo.de>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>,
	linux-man@vger.kernel.org
Subject: Re: [PATCH] pthread_kill.3: Update to match POSIX.
Date: Thu, 11 Nov 2021 16:01:41 -0800	[thread overview]
Message-ID: <CAJgzZooymW7fHnpCeVmhrAe-uue9zdssdP-QHeRtPN3MkVsnNA@mail.gmail.com> (raw)
In-Reply-To: <87lf1wjxcu.fsf@mid.deneb.enyo.de>

that was never the issue though ... well, okay, that's *an* issue, but
not the one i'm most concerned about :-)

the issue i'm trying to fix (and so maybe need to find even clearer
wording for) is basically this:

  * lots of people don't realize that pthread_t != pid_t
  * they think that "the worst that can happen" when passing a
no-longer valid pthread_t to these functions is ESRCH
  * they don't realize that using pthread_kill(3) like this is just a
use-after-free bug

i think one reason this persists is glibc's thread cache makes it
harder to hit there. i don't actually know whether glibc's thread
cache has an eviction policy at all? if it doesn't, that would indeed
turn this use-after-free into "just" a question of whether you have
the right pid_t or not. but assuming glibc's thread cache _does_ have
an eviction policy, glibc's in the same boat as more svelte libcs
(such as bionic and musl, plus the BSDs, and also Apple's anonymous
libc) --- it just needs more threads.

this confusion causes bugs (and crashes) today, and it's only going to
get worse as we get better tools for detecting UAF, such as Arm MTE,
and it's really hard to get people to understand the problem when the
man page is worded as it currently is (with a weak "can, for example"
hidden in the NOTES section).

if it really is the case that glibc has no eviction policy, then my
suggestion will probably be that we come up with wording along the
lines of "libc implementers face a choice here between the memory cost
of never freeing [whatever man7 calls the TCB] or not being able to
detect this temporal error; of all the libcs, only glibc chose the
former".

if nothing else that should at least answer the question "why can't
you just be like glibc?" :-)

the current text kind of sounds like "glibc has a great
implementation, but POSIX doesn't require that, and everyone else
sucks", but that's pretty misleading. (even if the glibc
implementation is safe, which i'm not sure it is. it also seems like
the "that's not a cache, that's a memory leak" implementation would
preclude memory tagging for thread stacks, which would be another
infelicity?)

-*-

this page is a bit weird in general... ESRCH isn't mentioned in
ERRORS, but the sig == 0 case is called out in DESCRIPTION, but you
need to read NOTES to find out that that's basically broken. and
no-where on the page do we try to describe alternatives that _do_
work. (happy to volunteer text along the lines of "you need to stash
your thread's tid at a time when you *know* the pthread_t is valid,
such as when the thread starts, and then you can use that with kill(2)
and sig == 0 to do what you _thought_ pthread_kill(3) with sig == 0
did, which still isn't 100% safe in light of pid wrapping, but is the
best you can get if you refuse to actually keep track of your threads'
lifetimes properly :-P ".)

actually, even this would be quite a good improvement:

        If sig is 0, then no signal is sent, but error checking is still
-       performed.
+       performed. See NOTES for why this can't be used to detect
whether another thread is still running.


On Tue, Nov 9, 2021 at 11:16 PM Florian Weimer <fw@deneb.enyo.de> wrote:
>
> > any comment from either of the maintainers?
> >
> > i think what we currently have on this page is factually incorrect,
> > and this patch better matches reality.
>
> One more data point:
>
> As of glibc 2.34, pthread_kill in glibc cannot fail with ESRCH anymore
> (unless the kernel thread is terminated by a direct system call).  And
> the race that the signal could be sent to the wrong thread is gone.
>
> > On Tue, Nov 12, 2019 at 10:10 PM Florian Weimer <fw@deneb.enyo.de> wrote:
> >>
> >> * enh:
> >>
> >> > On Tue, Nov 12, 2019 at 9:51 PM Florian Weimer <fw@deneb.enyo.de> wrote:
> >> >>
> >> >> * enh:
> >> >>
> >> >> > no, because the C library has two choices when a thread exits:
> >> >> >
> >> >> > 1. unmap the thread.
> >> >> >
> >> >> > 2. keep the thread around for recycling.
> >> >> >
> >> >> > if you choose 1 (optimizing for space, like Android), your dereference
> >> >> > is illegal.
> >> >>
> >> >> This choice is only available for threads in a detached state.  For
> >> >> joinable threads, a conforming implementation cannot immediately
> >> >> deallocate all data structures on thread termination.  Among other
> >> >> things, it has to store the future return value of pthread_join
> >> >> somewhere.
> >> >
> >> > ah, you're trying to say "signal 0 is potentially usable for a
> >> > joinable thread that's waiting to be joined"? that's true, but i'm not
> >> > sure how that's relevant to this patch. that wouldn't be an "invalid
> >> > thread ID" until it's joined.
> >>
> >> Correct.  That's POSIX's argument why ESRCH wouldn't be valid to
> >> return here.  It's still a forceful loss of information, and
> >> particularly annoying since POSIX doesn't specify pthread_tryjoin.
> >>
> >> But I'm glad we've brought our discussion to a conclusion. 8-)

  reply	other threads:[~2021-11-12  0:01 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-12 20:36 [PATCH] pthread_kill.3: Update to match POSIX enh
2019-11-12 21:38 ` Florian Weimer
2019-11-12 21:40   ` enh
2019-11-12 21:52     ` Florian Weimer
2019-11-12 22:06       ` enh
2019-11-12 22:11         ` Florian Weimer
2019-11-12 22:22           ` enh
2019-11-12 22:28             ` Florian Weimer
2019-11-13  5:27               ` enh
2019-11-13  5:51                 ` Florian Weimer
2019-11-13  5:59                   ` enh
2019-11-13  6:10                     ` Florian Weimer
2021-11-09 23:00                       ` enh
2021-11-10  7:14                         ` Florian Weimer
2021-11-12  0:01                           ` enh [this message]
2021-11-12 13:02                             ` Florian Weimer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJgzZooymW7fHnpCeVmhrAe-uue9zdssdP-QHeRtPN3MkVsnNA@mail.gmail.com \
    --to=enh@google.com \
    --cc=fw@deneb.enyo.de \
    --cc=linux-man@vger.kernel.org \
    --cc=mtk.manpages@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).