All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Emily Shaffer" <emilyshaffer@google.com>,
	"Taylor Blau" <me@ttaylorr.com>,
	"Eric Sunshine" <sunshine@sunshineco.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [PATCH v2 0/6] tr2: plug memory leaks + logic errors + Win32 & Linux feature parity
Date: Thu, 26 Aug 2021 14:22:18 +0200	[thread overview]
Message-ID: <cover-v2-0.6-00000000000-20210826T121820Z-avarab@gmail.com> (raw)
In-Reply-To: <cover-0.6-00000000000-20210825T231400Z-avarab@gmail.com>

An early re-roll due to some very good & early review by Taylor Blau,
this also fixes the grammar/typo pointed out by Eric Sunshine. Thanks
both:

This should address all the comments Taylor brought up, and hopefully
a bit more. The only thing I left outstanding was the inclusion of the
"while we're at it" enum refactoring in 5/6. It's not necessary for
the end-state here, but I think since we're already reviewing most of
this file it makes sense to change it while we're at it to
future-proof the code vis-as-vis getting stricted checks from the
compiler.

Ævar Arnfjörð Bjarmason (6):
  tr2: remove NEEDSWORK comment for "non-procfs" implementations
  tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux
  tr2: stop leaking "thread_name" memory
  tr2: fix memory leak & logic error in 2f732bf15e6
  tr2: do compiler enum check in trace2_collect_process_info()
  tr2: log N parent process names on Linux

 compat/linux/procinfo.c | 169 ++++++++++++++++++++++++++++++++++------
 trace2/tr2_tls.c        |   1 +
 2 files changed, 146 insertions(+), 24 deletions(-)

Range-diff against v1:
1:  8c649ce3b49 = 1:  8c649ce3b49 tr2: remove NEEDSWORK comment for "non-procfs" implementations
2:  0150e3402a7 = 2:  0150e3402a7 tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux
3:  1fa1bbb6743 ! 3:  1d835d6767e tr2: stop leaking "thread_name" memory
    @@ Commit message
         Fix a memory leak introduced in ee4512ed481 (trace2: create new
         combined trace facility, 2019-02-22), we were doing a free() of other
         memory allocated in tr2tls_create_self(), but not the "thread_name"
    -    "strbuf strbuf".
    +    "struct strbuf".
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
4:  73e7d4eb6ac ! 4:  1aa0dbc394e tr2: fix memory leak & logic error in 2f732bf15e6
    @@ Commit message
         It was also using the strvec_push() API the wrong way. That API always
         does an xstrdup(), so by detaching the strbuf here we'd leak
         memory. Let's instead pass in our pointer for strvec_push() to
    -    xstrdup(), and then free our own strbuf.
    +    xstrdup(), and then free our own strbuf. I do have some WIP changes to
    +    make strvec_push_nodup() non-static, which makes this and some other
    +    callsites nicer, but let's just follow the prevailing pattern of using
    +    strvec_push() for now.
     
    -    Furthermore we need to free that "procfs_path" strbuf whether or not
    +    We'll also need to free that "procfs_path" strbuf whether or not
         strbuf_read_file() succeeds, which was another source of memory leaks
         in 2f732bf15e6, i.e. we'd leak that memory as well if we weren't on a
         system where we could read the file from procfs.
     
    +    Let's move all the freeing of the memory to the end of the
    +    function. If we're still at STRBUF_INIT with "name" due to not haven
    +    taken the branch where the strbuf_read_file() succeeds freeing it is
    +    redundant, so we could move it into the body of the "if", but just
    +    handling freeing the same way for all branches of the function makes
    +    it more readable.
    +
         In combination with the preceding commit this makes all of
         t[0-9]*trace2*.sh pass under SANITIZE=leak on Linux.
     
    @@ compat/linux/procinfo.c: static void get_ancestry_names(struct strvec *names)
      		strbuf_trim_trailing_newline(&name);
     -		strvec_push(names, strbuf_detach(&name, NULL));
     +		strvec_push(names, name.buf);
    -+		strbuf_release(&name);
      	}
      
     +	strbuf_release(&procfs_path);
    ++	strbuf_release(&name);
    ++
      	return;
      }
      
5:  4e378da2cce = 5:  70fef093d8d tr2: do compiler enum check in trace2_collect_process_info()
6:  da003330800 ! 6:  f6aac902484 tr2: log N parent process names on Linux
    @@ Commit message
         on Linux only the name of the immediate parent process was logged.
     
         Extend the functionality added there to also log full parent chain on
    -    Linux. In 2f732bf15e6 it was claimed that "further ancestry info can
    -    be gathered with procfs, but it's unwieldy to do so.".
    -
    -    I don't know what the author meant by that, but I think it probably
    -    referred to needing to slurp this up from the FS, as opposed to having
    -    an API. The underlying semantics on Linux are easier to deal with than
    -    on Windows though, at least as far as finding the parent PIDs
    -    goes. See the get_processes() function used on Windows. As shown in
    -    353d3d77f4f (trace2: collect Windows-specific process information,
    -    2019-02-22) it needs to deal with cycles.
    -
    -    What is more complex on Linux is getting at the process name, a
    -    simpler approach is to use fscanf(), see [1] for an implementation of
    -    that, but as noted in the comment being added here it would fail in
    -    the face of some weird process names, so we need our own
    -    parse_proc_stat() to parse it out.
    +    Linux.
    +
    +    This requires us to lookup "/proc/<getppid()>/stat" instead of
    +    "/proc/<getppid()>/comm". The "comm" file just contains the name of the
    +    process, but the "stat" file has both that information, and the parent
    +    PID of that process, see procfs(5). We parse out the parent PID of our
    +    own parent, and recursively walk the chain of "/proc/*/stat" files all
    +    the way up the chain. A parent PID of 0 indicates the end of the
    +    chain.
    +
    +    It's possible given the semantics of Linux's PID files that we end up
    +    getting an entirely nonsensical chain of processes. It could happen if
    +    e.g. we have a chain of processes like:
    +
    +        1 (init) => 321 (bash) => 123 (git)
    +
    +    Let's assume that "bash" was started a while ago, and that as shown
    +    the OS has already cycled back to using a lower PID for us than our
    +    parent process. In the time it takes us to start up and get to
    +    trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP) our parent
    +    process might exit, and be replaced by an entirely different process!
    +
    +    We'd racily look up our own getppid(), but in the meantime our parent
    +    would exit, and Linux would have cycled all the way back to starting
    +    an entirely unrelated process as PID 321.
    +
    +    If that happens we'll just silently log incorrect data in our ancestry
    +    chain. Luckily we don't need to worry about this except in this
    +    specific cycling scenario, as Linux does not have PID
    +    randomization. It appears it once did through a third-party feature,
    +    but that it was removed around 2006[1]. For anyone worried about this
    +    edge case raising PID_MAX via "/proc/sys/kernel/pid_max" will mitigate
    +    it, but not eliminate it.
    +
    +    One thing we don't need to worry about is getting into an infinite
    +    loop when walking "/proc/*/stat". See 353d3d77f4f (trace2: collect
    +    Windows-specific process information, 2019-02-22) for the related
    +    Windows code that needs to deal with that, and [2] for an explanation
    +    of that edge case.
    +
    +    Aside from potential race conditions it's also a bit painful to
    +    correctly parse the process name out of "/proc/*/stat". A simpler
    +    approach is to use fscanf(), see [3] for an implementation of that,
    +    but as noted in the comment being added here it would fail in the face
    +    of some weird process names, so we need our own parse_proc_stat() to
    +    parse it out.
     
         With this patch the "ancestry" chain for a trace2 event might look
         like this:
    @@ Commit message
               "systemd"
             ]
     
    -    And in the case of naughty process names. This uses perl's ability to
    -    use prctl(PR_SET_NAME, ...). See Perl/perl5@7636ea95c5 (Set the legacy
    -    process name with prctl() on assignment to $0 on Linux, 2010-04-15)[2]:
    +    And in the case of naughty process names like the following. This uses
    +    perl's ability to use prctl(PR_SET_NAME, ...). See
    +    Perl/perl5@7636ea95c5 (Set the legacy process name with prctl() on
    +    assignment to $0 on Linux, 2010-04-15)[4]:
     
             $ perl -e '$0 = "(naughty\nname)"; system "GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version"' | grep ancestry | jq -r .ancestry
             [
    @@ Commit message
               "systemd"
             ]
     
    -    1. https://lore.kernel.org/git/87o8agp29o.fsf@evledraar.gmail.com/
    -    2. https://github.com/Perl/perl5/commit/7636ea95c57762930accf4358f7c0c2dec086b5e
    +    1. https://grsecurity.net/news#grsec2110
    +    2. https://lore.kernel.org/git/48a62d5e-28e2-7103-a5bb-5db7e197a4b9@jeffhostetler.com/
    +    3. https://lore.kernel.org/git/87o8agp29o.fsf@evledraar.gmail.com/
    +    4. https://github.com/Perl/perl5/commit/7636ea95c57762930accf4358f7c0c2dec086b5e
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    @@ compat/linux/procinfo.c
      
     -static void get_ancestry_names(struct strvec *names)
     +/*
    -+ * We need more complex parsing instat_parent_pid() and
    ++ * We need more complex parsing in stat_parent_pid() and
     + * parse_proc_stat() below than a dumb fscanf(). That's because while
     + * the statcomm field is surrounded by parentheses, the process itself
     + * is free to insert any arbitrary byte sequence its its name. That
    -+ * can include newlines, spaces, closing parentheses etc. See
    -+ * do_task_stat() in fs/proc/array.c in linux.git, this is in contrast
    -+ * with the escaped version of the name found in /proc/%d/status.
    ++ * can include newlines, spaces, closing parentheses etc.
    ++ *
    ++ * See do_task_stat() in fs/proc/array.c in linux.git, this is in
    ++ * contrast with the escaped version of the name found in
    ++ * /proc/%d/status.
     + *
     + * So instead of using fscanf() we'll read N bytes from it, look for
     + * the first "(", and then the last ")", anything in-between is our
    @@ compat/linux/procinfo.c
     + * that's 7 digits for a PID. We have 2 PIDs in the first four fields
     + * we're interested in, so 2 * 7 = 14.
     + *
    -+ * We then have 4 spaces between those four values, which brings us up
    -+ * to 18. Add the two parentheses and it's 20. The "state" is then one
    -+ * character (now at 21).
    ++ * We then have 3 spaces between those four values, and we'd like to
    ++ * get to the space between the 4th and the 5th (the "pgrp" field) to
    ++ * make sure we read the entire "ppid" field. So that brings us up to
    ++ * 14 + 3 + 1 = 18. Add the two parentheses around the "comm" value
    ++ * and it's 20. The "state" value itself is then one character (now at
    ++ * 21).
     + *
     + * Finally the maximum length of the "comm" name itself is 15
     + * characters, e.g. a setting of "123456789abcdefg" will be truncated
    @@ compat/linux/procinfo.c
     +static int parse_proc_stat(struct strbuf *sb, struct strbuf *name,
     +			    int *statppid)
      {
    -+	const char *lhs = strchr(sb->buf, '(');
    -+	const char *rhs = strrchr(sb->buf, ')');
    ++	const char *comm_lhs = strchr(sb->buf, '(');
    ++	const char *comm_rhs = strrchr(sb->buf, ')');
     +	const char *ppid_lhs, *ppid_rhs;
     +	char *p;
     +	pid_t ppid;
     +
    -+	if (!lhs || !rhs)
    ++	if (!comm_lhs || !comm_rhs)
     +		goto bad_kernel;
     +
      	/*
    @@ compat/linux/procinfo.c
     +	 * We're at the ")", that's followed by " X ", where X is a
     +	 * single "state" character. So advance by 4 bytes.
      	 */
    -+	ppid_lhs = rhs + 4;
    ++	ppid_lhs = comm_rhs + 4;
     +
    ++	/*
    ++	 * Read until the space between the "ppid" and "pgrp" fields
    ++	 * to make sure we're anchored after the untruncated "ppid"
    ++	 * field..
    ++	 */
     +	ppid_rhs = strchr(ppid_lhs, ' ');
     +	if (!ppid_rhs)
     +		goto bad_kernel;
     +
     +	ppid = strtol(ppid_lhs, &p, 10);
     +	if (ppid_rhs == p) {
    -+		const char *comm = lhs + 1;
    -+		int commlen = rhs - lhs - 1;
    ++		const char *comm = comm_lhs + 1;
    ++		size_t commlen = comm_rhs - comm;
     +
    -+		strbuf_addf(name, "%.*s", commlen, comm);
    ++		strbuf_add(name, comm, commlen);
     +		*statppid = ppid;
     +
     +		return 0;
    @@ compat/linux/procinfo.c
      	struct strbuf procfs_path = STRBUF_INIT;
     -	struct strbuf name = STRBUF_INIT;
     +	struct strbuf sb = STRBUF_INIT;
    -+	size_t n;
    -+	FILE *fp = NULL;
    ++	FILE *fp;
     +	int ret = -1;
      
      	/* try to use procfs if it's present. */
    @@ compat/linux/procinfo.c
     -	if (strbuf_read_file(&name, procfs_path.buf, 0) > 0) {
     -		strbuf_trim_trailing_newline(&name);
     -		strvec_push(names, name.buf);
    --		strbuf_release(&name);
     -	}
     +	strbuf_addf(&procfs_path, "/proc/%d/stat", pid);
     +	fp = fopen(procfs_path.buf, "r");
     +	if (!fp)
     +		goto cleanup;
     +
    -+	n = strbuf_fread(&sb, STAT_PARENT_PID_READ_N, fp);
    -+	if (n != STAT_PARENT_PID_READ_N)
    ++	/*
    ++	 * We could be more strict here and assert that we read at
    ++	 * least STAT_PARENT_PID_READ_N. My reading of procfs(5) is
    ++	 * that on any modern kernel (at least since 2.6.0 released in
    ++	 * 2003) even if all the mandatory numeric fields were zero'd
    ++	 * out we'd get at least 100 bytes, but let's just check that
    ++	 * we got anything at all and trust the parse_proc_stat()
    ++	 * function to handle its "Bad Kernel?" error checking.
    ++	 */
    ++	if (!strbuf_fread(&sb, STAT_PARENT_PID_READ_N, fp))
     +		goto cleanup;
     +	if (parse_proc_stat(&sb, name, statppid) < 0)
     +		goto cleanup;
    @@ compat/linux/procinfo.c
     +	if (ppid)
     +		push_ancestry_name(names, ppid);
     +cleanup:
    -+	strbuf_release(&name);
    - 	return;
    - }
    + 	strbuf_release(&name);
      
    + 	return;
     @@ compat/linux/procinfo.c: void trace2_collect_process_info(enum trace2_process_info_reason reason)
      		 */
      		break;
-- 
2.33.0.733.ga72a4f1c2e1


  parent reply	other threads:[~2021-08-26 12:22 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-07  0:29 [PATCH] tr2: log parent process name Emily Shaffer
2021-05-07  3:25 ` Bagas Sanjaya
2021-05-07 17:09 ` Emily Shaffer
2021-05-10 12:29 ` Ævar Arnfjörð Bjarmason
2021-05-11 21:31   ` Junio C Hamano
2021-05-14 22:06   ` Emily Shaffer
2021-05-16  3:48     ` Junio C Hamano
2021-05-17 20:17       ` Emily Shaffer
2021-05-11 17:28 ` Jeff Hostetler
2021-05-14 22:07   ` Emily Shaffer
2021-05-20 21:05 ` [PATCH v2] " Emily Shaffer
2021-05-20 21:36   ` Randall S. Becker
2021-05-20 23:23     ` Emily Shaffer
2021-05-21 13:20       ` Randall S. Becker
2021-05-21 16:24         ` Randall S. Becker
2021-05-21  2:09   ` Junio C Hamano
2021-05-21 19:02     ` Emily Shaffer
2021-05-21 23:22       ` Junio C Hamano
2021-05-24 18:37         ` Emily Shaffer
2021-05-21 19:15   ` Jeff Hostetler
2021-05-21 20:05     ` Emily Shaffer
2021-05-21 20:23       ` Randall S. Becker
2021-05-22 11:18       ` Jeff Hostetler
2021-05-24 23:33       ` Ævar Arnfjörð Bjarmason
2021-05-24 20:10   ` [PATCH v3] " Emily Shaffer
2021-05-24 20:49     ` Emily Shaffer
2021-05-25  3:54     ` Junio C Hamano
2021-05-25 13:33       ` Randall S. Becker
2021-06-08 18:58     ` [PATCH v4] " Emily Shaffer
2021-06-08 20:56       ` Emily Shaffer
2021-06-08 22:10       ` [PATCH v5] " Emily Shaffer
2021-06-08 22:16         ` Randall S. Becker
2021-06-08 22:24           ` Emily Shaffer
2021-06-08 22:39             ` Randall S. Becker
2021-06-09 20:17               ` Emily Shaffer
2021-06-16  8:42         ` Junio C Hamano
2021-06-28 16:45         ` Jeff Hostetler
2021-06-29 23:51           ` Emily Shaffer
2021-06-30  6:10             ` Ævar Arnfjörð Bjarmason
2021-07-22  0:21               ` Emily Shaffer
2021-07-22  1:27         ` [PATCH v6 0/2] " Emily Shaffer
2021-07-22  1:27           ` [PATCH v6 1/2] tr2: make process info collection platform-generic Emily Shaffer
2021-08-02  9:34             ` Ævar Arnfjörð Bjarmason
2021-07-22  1:27           ` [PATCH v6 2/2] tr2: log parent process name Emily Shaffer
2021-07-22 21:02             ` Junio C Hamano
2021-08-02  9:38             ` Ævar Arnfjörð Bjarmason
2021-08-02 12:45               ` Ævar Arnfjörð Bjarmason
2021-08-02 10:22             ` Ævar Arnfjörð Bjarmason
2021-08-02 12:47               ` Ævar Arnfjörð Bjarmason
2021-08-02 15:23               ` Jeff Hostetler
2021-08-02 16:10               ` Randall S. Becker
2021-08-02 18:41                 ` Ævar Arnfjörð Bjarmason
2021-08-25 23:19               ` [PATCH 0/6] tr2: plug memory leaks + logic errors + Win32 & Linux feature parity Ævar Arnfjörð Bjarmason
2021-08-25 23:19                 ` [PATCH 1/6] tr2: remove NEEDSWORK comment for "non-procfs" implementations Ævar Arnfjörð Bjarmason
2021-08-25 23:19                 ` [PATCH 2/6] tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux Ævar Arnfjörð Bjarmason
2021-08-25 23:19                 ` [PATCH 3/6] tr2: stop leaking "thread_name" memory Ævar Arnfjörð Bjarmason
2021-08-26  3:09                   ` Taylor Blau
2021-08-25 23:19                 ` [PATCH 4/6] tr2: fix memory leak & logic error in 2f732bf15e6 Ævar Arnfjörð Bjarmason
2021-08-26  3:21                   ` Taylor Blau
2021-08-25 23:19                 ` [PATCH 5/6] tr2: do compiler enum check in trace2_collect_process_info() Ævar Arnfjörð Bjarmason
2021-08-26  3:23                   ` Taylor Blau
2021-08-25 23:19                 ` [PATCH 6/6] tr2: log N parent process names on Linux Ævar Arnfjörð Bjarmason
2021-08-25 23:49                   ` Eric Sunshine
2021-08-26  4:07                   ` Taylor Blau
2021-08-26 12:24                     ` "I don't know what the author meant by that..." (was "Re: [PATCH 6/6] tr2: log N parent process names on Linux") Ævar Arnfjörð Bjarmason
2021-08-26 12:22                 ` Ævar Arnfjörð Bjarmason [this message]
2021-08-26 12:22                   ` [PATCH v2 1/6] tr2: remove NEEDSWORK comment for "non-procfs" implementations Ævar Arnfjörð Bjarmason
2021-08-26 12:22                   ` [PATCH v2 2/6] tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux Ævar Arnfjörð Bjarmason
2021-08-26 12:22                   ` [PATCH v2 3/6] tr2: stop leaking "thread_name" memory Ævar Arnfjörð Bjarmason
2021-08-26 12:22                   ` [PATCH v2 4/6] tr2: fix memory leak & logic error in 2f732bf15e6 Ævar Arnfjörð Bjarmason
2021-08-26 15:58                     ` Eric Sunshine
2021-08-26 16:42                     ` Junio C Hamano
2021-08-26 12:22                   ` [PATCH v2 5/6] tr2: do compiler enum check in trace2_collect_process_info() Ævar Arnfjörð Bjarmason
2021-08-26 12:22                   ` [PATCH v2 6/6] tr2: log N parent process names on Linux Ævar Arnfjörð Bjarmason
2021-08-26 22:38                   ` [PATCH v2 0/6] tr2: plug memory leaks + logic errors + Win32 & Linux feature parity Taylor Blau
2021-08-27  8:02                   ` [PATCH v3 " Ævar Arnfjörð Bjarmason
2021-08-27  8:02                     ` [PATCH v3 1/6] tr2: remove NEEDSWORK comment for "non-procfs" implementations Ævar Arnfjörð Bjarmason
2021-08-27  8:02                     ` [PATCH v3 2/6] tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux Ævar Arnfjörð Bjarmason
2021-08-27  8:02                     ` [PATCH v3 3/6] tr2: stop leaking "thread_name" memory Ævar Arnfjörð Bjarmason
2021-08-27  8:02                     ` [PATCH v3 4/6] tr2: leave the parent list empty upon failure & don't leak memory Ævar Arnfjörð Bjarmason
2021-08-27  8:02                     ` [PATCH v3 5/6] tr2: do compiler enum check in trace2_collect_process_info() Ævar Arnfjörð Bjarmason
2021-08-27  8:02                     ` [PATCH v3 6/6] tr2: log N parent process names on Linux Ævar Arnfjörð Bjarmason
2021-08-31  0:17                     ` [PATCH v3 0/6] tr2: plug memory leaks + logic errors + Win32 & Linux feature parity Taylor Blau
2021-08-02 10:30             ` [PATCH v6 2/2] tr2: log parent process name Ævar Arnfjörð Bjarmason
2021-08-02 16:24               ` Junio C Hamano
2021-08-02 18:42                 ` Ævar Arnfjörð Bjarmason
2021-07-22 16:59           ` [PATCH v6 0/2] " Jeff Hostetler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover-v2-0.6-00000000000-20210826T121820Z-avarab@gmail.com \
    --to=avarab@gmail.com \
    --cc=emilyshaffer@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.