It can be useful to tell who invoked Git - was it invoked manually by a user via CLI or script? By an IDE? Knowing where the Git invocation came from can help with debugging to isolate where the problem came from. Unfortunately, there's no cross-platform reliable way to gather the name of the parent process. If procfs is present, we can use that; otherwise we will need to discover the name another way. However, the process ID should be sufficient regardless of platform. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> --- We briefly discussed hiding this behind a config, internally. However, I wanted to include the parent name alongside the cmd_start event, which happens very early (maybe before config gathering?). Maybe it's better to log the parent_name as its own event, since it shouldn't change over the lifetime of the process? procfs is very non-portable, though - I think this won't even work on MacOS. So I'm curious if anybody has better suggestions for how to do this. - Emily Makefile | 1 + compat/procinfo.c | 19 +++++++++++++++++++ git-compat-util.h | 6 ++++++ t/t0212-trace2-event.sh | 8 ++++++++ t/t0212/parse_events.perl | 1 + trace2/tr2_tgt_event.c | 3 +++ 6 files changed, 38 insertions(+) create mode 100644 compat/procinfo.c diff --git a/Makefile b/Makefile index 93664d6714..19f5189c6f 100644 --- a/Makefile +++ b/Makefile @@ -855,6 +855,7 @@ LIB_OBJS += commit-graph.o LIB_OBJS += commit-reach.o LIB_OBJS += commit.o LIB_OBJS += compat/obstack.o +LIB_OBJS += compat/procinfo.o LIB_OBJS += compat/terminal.o LIB_OBJS += config.o LIB_OBJS += connect.o diff --git a/compat/procinfo.c b/compat/procinfo.c new file mode 100644 index 0000000000..7f4c8dd284 --- /dev/null +++ b/compat/procinfo.c @@ -0,0 +1,19 @@ +#include "git-compat-util.h" + +#include "strbuf.h" + +char *get_process_name(int pid) +{ + struct strbuf procfs_path = STRBUF_INIT; + struct strbuf out = STRBUF_INIT; + /* try to use procfs if it's present. */ + strbuf_addf(&procfs_path, "/proc/%d/cmdline", pid); + if (strbuf_read_file(&out, procfs_path.buf, 0) > 0) + { + strbuf_release(&procfs_path); + return strbuf_detach(&out, NULL); + } + + /* NEEDSWORK: add non-procfs implementations here. */ + return NULL; +} diff --git a/git-compat-util.h b/git-compat-util.h index a508dbe5a3..cc7d5d8a2a 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1382,4 +1382,10 @@ static inline void *container_of_or_null_offset(void *ptr, size_t offset) void sleep_millisec(int millisec); +/* + * Convert PID to process name (as would show in top/task manager). Returns + * NULL if unimplemented - be sure to check for NULL at callsite. + */ +char *get_process_name(int pid); + #endif diff --git a/t/t0212-trace2-event.sh b/t/t0212-trace2-event.sh index 1529155cf0..3a2a8a5b5f 100755 --- a/t/t0212-trace2-event.sh +++ b/t/t0212-trace2-event.sh @@ -61,6 +61,7 @@ test_expect_success JSON_PP 'event stream, error event' ' | "exit_code":0, | "hierarchy":"trace2", | "name":"trace2", + | "parent_name":"/bin/sh", | "version":"$V" | } |}; @@ -115,6 +116,7 @@ test_expect_success JSON_PP 'event stream, return code 0' ' | "exit_code":0, | "hierarchy":"trace2", | "name":"trace2", + | "parent_name":"/bin/sh", | "version":"$V" | }, | "_SID0_/_SID1_":{ @@ -143,6 +145,7 @@ test_expect_success JSON_PP 'event stream, return code 0' ' | "exit_code":0, | "hierarchy":"trace2/trace2", | "name":"trace2", + | "parent_name":"test-tool", | "version":"$V" | }, | "_SID0_/_SID1_/_SID2_":{ @@ -155,6 +158,7 @@ test_expect_success JSON_PP 'event stream, return code 0' ' | "exit_code":0, | "hierarchy":"trace2/trace2/trace2", | "name":"trace2", + | "parent_name":"$TEST_DIRECTORY/../t/helper//test-tool", | "version":"$V" | } |}; @@ -192,6 +196,7 @@ test_expect_success JSON_PP 'event stream, list config' ' | "value":"hello world" | } | ], + | "parent_name":"/bin/sh", | "version":"$V" | } |}; @@ -229,6 +234,7 @@ test_expect_success JSON_PP 'event stream, list env vars' ' | "value":"hello world" | } | ], + | "parent_name":"/bin/sh", | "version":"$V" | } |}; @@ -263,6 +269,7 @@ test_expect_success JSON_PP 'basic trace2_data' ' | "exit_code":0, | "hierarchy":"trace2", | "name":"trace2", + | "parent_name":"/bin/sh", | "version":"$V" | } |}; @@ -295,6 +302,7 @@ test_expect_success JSON_PP 'using global config, event stream, error event' ' | "exit_code":0, | "hierarchy":"trace2", | "name":"trace2", + | "parent_name":"/bin/sh", | "version":"$V" | } |}; diff --git a/t/t0212/parse_events.perl b/t/t0212/parse_events.perl index 6584bb5634..dd8b1be844 100644 --- a/t/t0212/parse_events.perl +++ b/t/t0212/parse_events.perl @@ -99,6 +99,7 @@ } elsif ($event eq 'start') { + $processes->{$sid}->{'parent_name'} = $line->{'parent_name'}; $processes->{$sid}->{'argv'} = $line->{'argv'}; $processes->{$sid}->{'argv'}[0] = "_EXE_"; } diff --git a/trace2/tr2_tgt_event.c b/trace2/tr2_tgt_event.c index 6353e8ad91..d258d5807c 100644 --- a/trace2/tr2_tgt_event.c +++ b/trace2/tr2_tgt_event.c @@ -145,10 +145,12 @@ static void fn_start_fl(const char *file, int line, const char *event_name = "start"; struct json_writer jw = JSON_WRITER_INIT; double t_abs = (double)us_elapsed_absolute / 1000000.0; + char *parent_name = get_process_name(getppid()); jw_object_begin(&jw, 0); event_fmt_prepare(event_name, file, line, NULL, &jw); jw_object_double(&jw, "t_abs", 6, t_abs); + jw_object_string(&jw, "parent_name", parent_name ? parent_name : NULL ); jw_object_inline_begin_array(&jw, "argv"); jw_array_argv(&jw, argv); jw_end(&jw); @@ -156,6 +158,7 @@ static void fn_start_fl(const char *file, int line, tr2_dst_write_line(&tr2dst_event, &jw.json); jw_release(&jw); + free(parent_name); } static void fn_exit_fl(const char *file, int line, uint64_t us_elapsed_absolute, -- 2.31.1.607.g51e8a6a459-goog
On 07/05/21 07.29, Emily Shaffer wrote: > It can be useful to tell who invoked Git - was it invoked manually by a > user via CLI or script? By an IDE? Knowing where the Git invocation came > from can help with debugging to isolate where the problem came from. > > Unfortunately, there's no cross-platform reliable way to gather the name > of the parent process. If procfs is present, we can use that; otherwise > we will need to discover the name another way. However, the process ID > should be sufficient regardless of platform. What about on Windows? > Signed-off-by: Emily Shaffer <emilyshaffer@google.com> > --- > We briefly discussed hiding this behind a config, internally. However, I > wanted to include the parent name alongside the cmd_start event, which > happens very early (maybe before config gathering?). > > Maybe it's better to log the parent_name as its own event, since it > shouldn't change over the lifetime of the process? > > procfs is very non-portable, though - I think this won't even work on > MacOS. So I'm curious if anybody has better suggestions for how to do > this. Maybe we can say that "currently the method of gathering parent process name that Git uses only work on Linux". -- An old man doll... just what I always wanted! - Clara
On Thu, May 06, 2021 at 05:29:08PM -0700, Emily Shaffer wrote: > > It can be useful to tell who invoked Git - was it invoked manually by a > user via CLI or script? By an IDE? Knowing where the Git invocation came > from can help with debugging to isolate where the problem came from. > > Unfortunately, there's no cross-platform reliable way to gather the name > of the parent process. If procfs is present, we can use that; otherwise > we will need to discover the name another way. However, the process ID > should be sufficient regardless of platform. > > Signed-off-by: Emily Shaffer <emilyshaffer@google.com> > --- > We briefly discussed hiding this behind a config, internally. However, I > wanted to include the parent name alongside the cmd_start event, which > happens very early (maybe before config gathering?). > > Maybe it's better to log the parent_name as its own event, since it > shouldn't change over the lifetime of the process? > > procfs is very non-portable, though - I think this won't even work on > MacOS. So I'm curious if anybody has better suggestions for how to do > this. I wrote this and then I wrote nonportable tests anyways, bah. Working on a fix now - the tests break for MacOS and Windows. The parent_name needs to be set conditionally below. > diff --git a/t/t0212-trace2-event.sh b/t/t0212-trace2-event.sh > index 1529155cf0..3a2a8a5b5f 100755 > --- a/t/t0212-trace2-event.sh > +++ b/t/t0212-trace2-event.sh > @@ -61,6 +61,7 @@ test_expect_success JSON_PP 'event stream, error event' ' > | "exit_code":0, > | "hierarchy":"trace2", > | "name":"trace2", > + | "parent_name":"/bin/sh", > | "version":"$V" > | } > |}; > @@ -115,6 +116,7 @@ test_expect_success JSON_PP 'event stream, return code 0' ' > | "exit_code":0, > | "hierarchy":"trace2", > | "name":"trace2", > + | "parent_name":"/bin/sh", > | "version":"$V" > | }, > | "_SID0_/_SID1_":{ > @@ -143,6 +145,7 @@ test_expect_success JSON_PP 'event stream, return code 0' ' > | "exit_code":0, > | "hierarchy":"trace2/trace2", > | "name":"trace2", > + | "parent_name":"test-tool", > | "version":"$V" > | }, > | "_SID0_/_SID1_/_SID2_":{ > @@ -155,6 +158,7 @@ test_expect_success JSON_PP 'event stream, return code 0' ' > | "exit_code":0, > | "hierarchy":"trace2/trace2/trace2", > | "name":"trace2", > + | "parent_name":"$TEST_DIRECTORY/../t/helper//test-tool", > | "version":"$V" > | } > |}; > @@ -192,6 +196,7 @@ test_expect_success JSON_PP 'event stream, list config' ' > | "value":"hello world" > | } > | ], > + | "parent_name":"/bin/sh", > | "version":"$V" > | } > |}; > @@ -229,6 +234,7 @@ test_expect_success JSON_PP 'event stream, list env vars' ' > | "value":"hello world" > | } > | ], > + | "parent_name":"/bin/sh", > | "version":"$V" > | } > |}; > @@ -263,6 +269,7 @@ test_expect_success JSON_PP 'basic trace2_data' ' > | "exit_code":0, > | "hierarchy":"trace2", > | "name":"trace2", > + | "parent_name":"/bin/sh", > | "version":"$V" > | } > |}; > @@ -295,6 +302,7 @@ test_expect_success JSON_PP 'using global config, event stream, error event' ' > | "exit_code":0, > | "hierarchy":"trace2", > | "name":"trace2", > + | "parent_name":"/bin/sh", > | "version":"$V" > | } > |}; - Emily
On Thu, May 06 2021, Emily Shaffer wrote:
> It can be useful to tell who invoked Git - was it invoked manually by a
> user via CLI or script? By an IDE? Knowing where the Git invocation came
> from can help with debugging to isolate where the problem came from.
Aside from the portability concerns others have raised, I don't really
see why you'd need this.
We already have the nest-level as part of the SID, so isn't it
sufficient (and portable) at the top-level to log what isatty says + set
the initial SID "root" in the IDE (which presumably knows about git).
Wouldn't this log passwords in cases of e.g.:
some-script --git-password secret # invokes "git"
In older versions of linux reading e.g. smaps from /proc/self would
stall the kernel while the read was happening, I haven't checked whether
cmdline is such a thing (probably not), but it's a subtle thing to have
in mind for this / follow-ups if it's made portable (is that an issue on
other OS's?).
All that being said I've got nothing fundamentally against this.
On 5/6/21 8:29 PM, Emily Shaffer wrote:
> It can be useful to tell who invoked Git - was it invoked manually by a
> user via CLI or script? By an IDE? Knowing where the Git invocation came
> from can help with debugging to isolate where the problem came from.
>
> Unfortunately, there's no cross-platform reliable way to gather the name
> of the parent process. If procfs is present, we can use that; otherwise
> we will need to discover the name another way. However, the process ID
> should be sufficient regardless of platform.
>
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
> We briefly discussed hiding this behind a config, internally. However, I
> wanted to include the parent name alongside the cmd_start event, which
> happens very early (maybe before config gathering?).
>
> Maybe it's better to log the parent_name as its own event, since it
> shouldn't change over the lifetime of the process?
>
> procfs is very non-portable, though - I think this won't even work on
> MacOS. So I'm curious if anybody has better suggestions for how to do
> this.
>
> - Emily
Look at `trace2_collect_process_info()` in `trace2.h` and
`compat/win32/trace2_win32_process_info.c`.
That function is designed to let the process call out to
platform-specific code to do things like getting the name
of the invoking process.
It is called with an enum to indicate when/why it is being
called.
On Windows, I have platform code to get the process ancestry,
the peak VM usage, and whether the process is being debugged.
Two of those were easy.... And all are very platform-specific.
They all generate events of the form:
trace2_data_*("process", "windows/<something>", ...)
Some of my `t021[012]/` helper scripts exclude "process" category
messages from the output to make the t*.sh tests portable.
To implement a /proc lookup as you have suggested, I would
fixup the Makefile or config.mak.uname to include something
like a "HAVE_..." feature and then use that at the
bottom of trace2.h to declare a trace2_collect_process_info()
function and then have your code in compat/procinfo.c hide
behind my interface.
Your code should emits events of the form:
trace2_data_*("process", "linux/<something>", ...)
I notice there are several `PROCFS_*` symbols currently defined
in `config.mak.uname` so maybe this should be:
trace2_data_*("process", "procfs/<something>", ...)
(I'm not sure how similar code for Linux and the various BSD
versions would be.)
Jeff
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> On Thu, May 06 2021, Emily Shaffer wrote:
>
>> It can be useful to tell who invoked Git - was it invoked manually by a
>> user via CLI or script? By an IDE? Knowing where the Git invocation came
>> from can help with debugging to isolate where the problem came from.
>
> Aside from the portability concerns others have raised, I don't really
> see why you'd need this.
>
> We already have the nest-level as part of the SID, so isn't it
> sufficient (and portable) at the top-level to log what isatty says + set
> the initial SID "root" in the IDE (which presumably knows about git).
>
> Wouldn't this log passwords in cases of e.g.:
>
> some-script --git-password secret # invokes "git"
Both valid and excellent points, I would think.
On Mon, May 10, 2021 at 02:29:15PM +0200, Ævar Arnfjörð Bjarmason wrote: > > > On Thu, May 06 2021, Emily Shaffer wrote: > > > It can be useful to tell who invoked Git - was it invoked manually by a > > user via CLI or script? By an IDE? Knowing where the Git invocation came > > from can help with debugging to isolate where the problem came from. > > Aside from the portability concerns others have raised, I don't really > see why you'd need this. > > We already have the nest-level as part of the SID, so isn't it > sufficient (and portable) at the top-level to log what isatty says + set > the initial SID "root" in the IDE (which presumably knows about git). If you already know all the IDEs and scripts which invoke Git, sure, you could set the SID root - we do this with 'repo' tool today. But actually we want this because we aren't sure exactly who at Google is invoking Git in their tooling, how, why, etc. - this logline was supposed to help with that. Chicken, egg, etc. Or else I'm misunderstanding your suggestion; to me it sounded like "go fix your VSCode plugin so that it sets an additional envvar" and I responded accordingly. If you mean something else I didn't see, implemented in Git land, then I'm curious to hear more. > > Wouldn't this log passwords in cases of e.g.: > > some-script --git-password secret # invokes "git" I have nothing but nasty things to say about scripts which would do that, but your point is valid enough to make me think this logline should be gated behind a config and posted much later (when config is available - I think it's not yet, with the patch as-is). > > In older versions of linux reading e.g. smaps from /proc/self would > stall the kernel while the read was happening, I haven't checked whether > cmdline is such a thing (probably not), but it's a subtle thing to have > in mind for this / follow-ups if it's made portable (is that an issue on > other OS's?). That's an interesting point and I wonder how I can validate if it does/doesn't stall in this way - I guess I can go manpage diving? - Emily
On Tue, May 11, 2021 at 01:28:39PM -0400, Jeff Hostetler wrote:
>
>
>
> On 5/6/21 8:29 PM, Emily Shaffer wrote:
> > It can be useful to tell who invoked Git - was it invoked manually by a
> > user via CLI or script? By an IDE? Knowing where the Git invocation came
> > from can help with debugging to isolate where the problem came from.
> >
> > Unfortunately, there's no cross-platform reliable way to gather the name
> > of the parent process. If procfs is present, we can use that; otherwise
> > we will need to discover the name another way. However, the process ID
> > should be sufficient regardless of platform.
> >
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> > We briefly discussed hiding this behind a config, internally. However, I
> > wanted to include the parent name alongside the cmd_start event, which
> > happens very early (maybe before config gathering?).
> >
> > Maybe it's better to log the parent_name as its own event, since it
> > shouldn't change over the lifetime of the process?
> >
> > procfs is very non-portable, though - I think this won't even work on
> > MacOS. So I'm curious if anybody has better suggestions for how to do
> > this.
> >
> > - Emily
>
>
> Look at `trace2_collect_process_info()` in `trace2.h` and
> `compat/win32/trace2_win32_process_info.c`.
>
> That function is designed to let the process call out to
> platform-specific code to do things like getting the name
> of the invoking process.
>
> It is called with an enum to indicate when/why it is being
> called.
>
> On Windows, I have platform code to get the process ancestry,
> the peak VM usage, and whether the process is being debugged.
> Two of those were easy.... And all are very platform-specific.
>
> They all generate events of the form:
>
> trace2_data_*("process", "windows/<something>", ...)
>
> Some of my `t021[012]/` helper scripts exclude "process" category
> messages from the output to make the t*.sh tests portable.
>
>
> To implement a /proc lookup as you have suggested, I would
> fixup the Makefile or config.mak.uname to include something
> like a "HAVE_..." feature and then use that at the
> bottom of trace2.h to declare a trace2_collect_process_info()
> function and then have your code in compat/procinfo.c hide
> behind my interface.
>
> Your code should emits events of the form:
>
> trace2_data_*("process", "linux/<something>", ...)
>
> I notice there are several `PROCFS_*` symbols currently defined
> in `config.mak.uname` so maybe this should be:
>
> trace2_data_*("process", "procfs/<something>", ...)
>
> (I'm not sure how similar code for Linux and the various BSD
> versions would be.)
Thanks a bunch for the pointers - this is great.
- Emily
Emily Shaffer <emilyshaffer@google.com> writes:
> On Mon, May 10, 2021 at 02:29:15PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> We already have the nest-level as part of the SID, so isn't it
>> sufficient (and portable) at the top-level to log what isatty says + set
>> the initial SID "root" in the IDE (which presumably knows about git).
>
> If you already know all the IDEs and scripts which invoke Git, sure, you
> could set the SID root - we do this with 'repo' tool today. But actually
> we want this because we aren't sure exactly who at Google is invoking
> Git in their tooling, how, why, etc. - this logline was supposed to help
> with that. Chicken, egg, etc.
I agreed with Æver's suggestion exectly because I failed to read the
above motivation from the patch. If you are trying to find out who
called, then you'd need to do the "find the parent process" dance
when your parent did not give you their SID to append your ident to.
Perhaps it was obvious to the author of the patch, but it was
unclear from the point of view of readers. Perhaps the first
paragraph of the proposed message wants a bit of rephrasing. "we
aren't sure exactly ... how, why, etc." part of the above really
helped.
Thanks.
On Sun, May 16, 2021 at 12:48:37PM +0900, Junio C Hamano wrote:
>
> Emily Shaffer <emilyshaffer@google.com> writes:
>
> > On Mon, May 10, 2021 at 02:29:15PM +0200, Ævar Arnfjörð Bjarmason wrote:
> >
> >> We already have the nest-level as part of the SID, so isn't it
> >> sufficient (and portable) at the top-level to log what isatty says + set
> >> the initial SID "root" in the IDE (which presumably knows about git).
> >
> > If you already know all the IDEs and scripts which invoke Git, sure, you
> > could set the SID root - we do this with 'repo' tool today. But actually
> > we want this because we aren't sure exactly who at Google is invoking
> > Git in their tooling, how, why, etc. - this logline was supposed to help
> > with that. Chicken, egg, etc.
>
> I agreed with Æver's suggestion exectly because I failed to read the
> above motivation from the patch. If you are trying to find out who
> called, then you'd need to do the "find the parent process" dance
> when your parent did not give you their SID to append your ident to.
>
> Perhaps it was obvious to the author of the patch, but it was
> unclear from the point of view of readers. Perhaps the first
> paragraph of the proposed message wants a bit of rephrasing. "we
> aren't sure exactly ... how, why, etc." part of the above really
> helped.
>
> Thanks.
Thanks for the feedback. I was vacationing last week but should be
sending a rework today or tomorrow, will see if I can shore up the
commit-msg as well as the portability stuff Jeff H raised.
- Emily
It can be useful to tell who invoked Git - was it invoked manually by a user via CLI or script? By an IDE? In some cases - like 'repo' tool - we can influence the source code and set the GIT_TRACE2_PARENT_SID environment variable from the caller process. In 'repo''s case, that parent SID is manipulated to include the string "repo", which means we can positively identify when Git was invoked by 'repo' tool. However, identifying parents that way requires both that we know which tools invoke Git and that we have the ability to modify the source code of those tools. It cannot scale to keep up with the various IDEs and wrappers which use Git, most of which we don't know about. Learning which tools and wrappers invoke Git, and how, would give us insight to decide where to improve Git's usability and performance. Unfortunately, there's no cross-platform reliable way to gather the name of the parent process. If procfs is present, we can use that; otherwise we will need to discover the name another way. However, the process ID should be sufficient regardless of platform. Git for Windows gathers similar information and logs it as a "data_json" event. However, since "data_json" has a variable format, it is difficult to parse effectively in some languages; instead, let's pursue a dedicated "cmd_ancestry" event to record information about the ancestry of the current process and a consistent, parseable way. Git for Windows also gathers information about more than one parent. In Linux further ancestry info can be gathered with procfs, but it's unwieldy to do so. In the interest of later moving Git for Windows ancestry logging to the 'cmd_ancestry' event, and in the interest of later adding more ancestry to the Linux implementation - or of adding this functionality to other platforms which have an easier time walking the process tree - let's make 'cmd_ancestry' accept an array of parentage. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> --- Hi folks, the comments I received in v1 were of two varieties: 1) "There are better ways to make this platform-safe", and 2) "Your commit message doesn't convince me". Since I sent v1, though, I also learned a little more about procfs, and about the trace2 structure overall, so there are some pretty significant differences from v1: - I took a look at Jeff H's advice on using a "data_json" event to log this and decided it would be a little more flexible to add a new event instead. If we want, it'd be feasible to then shoehorn the GfW parent tree stuff into this new event too. Doing it this way is definitely easier to parse for Google's trace analysis system (which for now completely skips "data_json" as it's polymorphic), and also - I think - means that we can add more fields later on if we need to (thread info, different fields than just /proc/n/comm like exec path, argv, whatever). - Jonathan N also pointed out to me that /proc/n/comm exists, and logs the "command name" - excluding argv, excluding path, etc. It seems like this is a little more safe about excluding personal information from the traces which take the form of "myscript.sh --password=hunter2", but would still be worrisome for something like "mysupersecretproject.sh". I'm not sure whether that means we still want to guard it with a config flag, though. - I also added a lot to the commit message; hopefully it's not too rambly, but I hoped to explain why just setting GIT_TRACE2_PARENT_SID wasn't going to cut it. - As for testing, I followed the lead of GfW's parentage info - "this isn't portable so writing tests for it will suck, just scrub it from the tests". Maybe it makes sense to do some more platform-specific-ness in the test suite instead? I wasn't sure. Thanks, all. - Emily Makefile | 5 ++++ compat/procinfo.c | 53 +++++++++++++++++++++++++++++++++++++++ config.mak.uname | 1 + git-compat-util.h | 6 +++++ t/t0210/scrub_normal.perl | 6 +++++ t/t0211/scrub_perf.perl | 5 ++++ t/t0212/parse_events.perl | 5 +++- trace2.c | 13 ++++++++++ trace2.h | 12 ++++++++- trace2/tr2_tgt.h | 3 +++ trace2/tr2_tgt_event.c | 21 ++++++++++++++++ trace2/tr2_tgt_normal.c | 19 ++++++++++++++ trace2/tr2_tgt_perf.c | 16 ++++++++++++ 13 files changed, 163 insertions(+), 2 deletions(-) create mode 100644 compat/procinfo.c diff --git a/Makefile b/Makefile index 93664d6714..330e4fa011 100644 --- a/Makefile +++ b/Makefile @@ -1889,6 +1889,11 @@ ifneq ($(PROCFS_EXECUTABLE_PATH),) BASIC_CFLAGS += '-DPROCFS_EXECUTABLE_PATH="$(procfs_executable_path_SQ)"' endif +ifdef HAVE_PROCFS_LINUX + BASIC_CFLAGS += -DHAVE_PROCFS_LINUX + COMPAT_OBJS += compat/procinfo.o +endif + ifdef HAVE_NS_GET_EXECUTABLE_PATH BASIC_CFLAGS += -DHAVE_NS_GET_EXECUTABLE_PATH endif diff --git a/compat/procinfo.c b/compat/procinfo.c new file mode 100644 index 0000000000..523600673f --- /dev/null +++ b/compat/procinfo.c @@ -0,0 +1,53 @@ +#include "cache.h" + +#include "strbuf.h" +#include "trace2.h" + +char *get_process_name(int pid) +{ +#ifdef HAVE_PROCFS_LINUX + struct strbuf procfs_path = STRBUF_INIT; + struct strbuf out = STRBUF_INIT; + /* try to use procfs if it's present. */ + strbuf_addf(&procfs_path, "/proc/%d/comm", pid); + if (!strbuf_read_file(&out, procfs_path.buf, 0)) + { + /* All done with file reads, clean up early */ + strbuf_release(&procfs_path); + return strbuf_detach(&out, NULL); + } +#endif + + /* NEEDSWORK: add non-procfs implementations here. */ + return NULL; +} + +void trace2_collect_process_info(enum trace2_process_info_reason reason) +{ + if (!trace2_is_enabled()) + return; + + /* someday we may want to write something extra here, but not today */ + if (reason == TRACE2_PROCESS_INFO_EXIT) + return; + + if (reason == TRACE2_PROCESS_INFO_STARTUP) + { + /* + * NEEDSWORK: we could do the entire ptree in an array instead, + * see compat/win32/trace2_win32_process_info.c. + */ + char *names[2]; + names[0] = get_process_name(getppid()); + names[1] = NULL; + + if (!names[0]) + return; + + trace2_cmd_ancestry((const char**)names); + + free(names[0]); + } + + return; +} diff --git a/config.mak.uname b/config.mak.uname index cb443b4e02..7ad110a1d2 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -58,6 +58,7 @@ ifeq ($(uname_S),Linux) FREAD_READS_DIRECTORIES = UnfortunatelyYes BASIC_CFLAGS += -DHAVE_SYSINFO PROCFS_EXECUTABLE_PATH = /proc/self/exe + HAVE_PROCFS_LINUX = YesPlease endif ifeq ($(uname_S),GNU/kFreeBSD) HAVE_ALLOCA_H = YesPlease diff --git a/git-compat-util.h b/git-compat-util.h index a508dbe5a3..cc7d5d8a2a 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1382,4 +1382,10 @@ static inline void *container_of_or_null_offset(void *ptr, size_t offset) void sleep_millisec(int millisec); +/* + * Convert PID to process name (as would show in top/task manager). Returns + * NULL if unimplemented - be sure to check for NULL at callsite. + */ +char *get_process_name(int pid); + #endif diff --git a/t/t0210/scrub_normal.perl b/t/t0210/scrub_normal.perl index c65d1a815e..7cc4de392a 100644 --- a/t/t0210/scrub_normal.perl +++ b/t/t0210/scrub_normal.perl @@ -42,6 +42,12 @@ # so just omit it for testing purposes. # print "cmd_path _EXE_\n"; } + elsif ($line =~ m/^cmd_ancestry/) { + # 'cmd_ancestry' is not implemented everywhere, so for portability's + # sake, skip it when parsing normal. + # + # print "$line"; + } else { print "$line"; } diff --git a/t/t0211/scrub_perf.perl b/t/t0211/scrub_perf.perl index 351af7844e..d164b750ff 100644 --- a/t/t0211/scrub_perf.perl +++ b/t/t0211/scrub_perf.perl @@ -44,6 +44,11 @@ # $tokens[$col_rest] = "_EXE_"; goto SKIP_LINE; } + elsif ($tokens[$col_event] =~ m/cmd_ancestry/) { + # 'cmd_ancestry' is platform-specific and not implemented everywhere, + # so skip it. + goto SKIP_LINE; + } elsif ($tokens[$col_event] =~ m/child_exit/) { $tokens[$col_rest] =~ s/ pid:\d* / pid:_PID_ /; } diff --git a/t/t0212/parse_events.perl b/t/t0212/parse_events.perl index 6584bb5634..b6408560c0 100644 --- a/t/t0212/parse_events.perl +++ b/t/t0212/parse_events.perl @@ -132,7 +132,10 @@ # just omit it for testing purposes. # $processes->{$sid}->{'path'} = "_EXE_"; } - + elsif ($event eq 'cmd_ancestry') { + # 'cmd_ancestry' is platform-specific and not implemented everywhere, so + # just skip it for testing purposes. + } elsif ($event eq 'cmd_name') { $processes->{$sid}->{'name'} = $line->{'name'}; $processes->{$sid}->{'hierarchy'} = $line->{'hierarchy'}; diff --git a/trace2.c b/trace2.c index 256120c7fd..b9b154ac44 100644 --- a/trace2.c +++ b/trace2.c @@ -260,6 +260,19 @@ void trace2_cmd_path_fl(const char *file, int line, const char *pathname) tgt_j->pfn_command_path_fl(file, line, pathname); } +void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names) +{ + struct tr2_tgt *tgt_j; + int j; + + if (!trace2_enabled) + return; + + for_each_wanted_builtin (j, tgt_j) + if (tgt_j->pfn_command_ancestry_fl) + tgt_j->pfn_command_ancestry_fl(file, line, parent_names); +} + void trace2_cmd_name_fl(const char *file, int line, const char *name) { struct tr2_tgt *tgt_j; diff --git a/trace2.h b/trace2.h index ede18c2e06..23743ac62b 100644 --- a/trace2.h +++ b/trace2.h @@ -133,6 +133,16 @@ void trace2_cmd_path_fl(const char *file, int line, const char *pathname); #define trace2_cmd_path(p) trace2_cmd_path_fl(__FILE__, __LINE__, (p)) +/* + * Emit an 'ancestry' event with the process name of the current process's + * parent process. + * This gives post-processors a way to determine what invoked the command and + * learn more about usage patterns. + */ +void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names); + +#define trace2_cmd_ancestry(v) trace2_cmd_ancestry_fl(__FILE__, __LINE__, (v)) + /* * Emit a 'cmd_name' event with the canonical name of the command. * This gives post-processors a simple field to identify the command @@ -492,7 +502,7 @@ enum trace2_process_info_reason { TRACE2_PROCESS_INFO_EXIT, }; -#if defined(GIT_WINDOWS_NATIVE) +#if ( defined(GIT_WINDOWS_NATIVE) || defined(HAVE_PROCFS_LINUX) ) void trace2_collect_process_info(enum trace2_process_info_reason reason); #else #define trace2_collect_process_info(reason) \ diff --git a/trace2/tr2_tgt.h b/trace2/tr2_tgt.h index 7b90469212..1f66fd6573 100644 --- a/trace2/tr2_tgt.h +++ b/trace2/tr2_tgt.h @@ -27,6 +27,8 @@ typedef void(tr2_tgt_evt_error_va_fl_t)(const char *file, int line, typedef void(tr2_tgt_evt_command_path_fl_t)(const char *file, int line, const char *command_path); +typedef void(tr2_tgt_evt_command_ancestry_fl_t)(const char *file, int line, + const char **parent_names); typedef void(tr2_tgt_evt_command_name_fl_t)(const char *file, int line, const char *name, const char *hierarchy); @@ -108,6 +110,7 @@ struct tr2_tgt { tr2_tgt_evt_atexit_t *pfn_atexit; tr2_tgt_evt_error_va_fl_t *pfn_error_va_fl; tr2_tgt_evt_command_path_fl_t *pfn_command_path_fl; + tr2_tgt_evt_command_ancestry_fl_t *pfn_command_ancestry_fl; tr2_tgt_evt_command_name_fl_t *pfn_command_name_fl; tr2_tgt_evt_command_mode_fl_t *pfn_command_mode_fl; tr2_tgt_evt_alias_fl_t *pfn_alias_fl; diff --git a/trace2/tr2_tgt_event.c b/trace2/tr2_tgt_event.c index 6353e8ad91..578a9a5287 100644 --- a/trace2/tr2_tgt_event.c +++ b/trace2/tr2_tgt_event.c @@ -261,6 +261,26 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) jw_release(&jw); } +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) +{ + const char *event_name = "cmd_ancestry"; + const char *parent_name = NULL; + struct json_writer jw = JSON_WRITER_INIT; + + jw_object_begin(&jw, 0); + event_fmt_prepare(event_name, file, line, NULL, &jw); + jw_object_inline_begin_array(&jw, "ancestry"); + + while ((parent_name = *parent_names++)) + jw_array_string(&jw, parent_name); + + jw_end(&jw); /* 'ancestry' array */ + jw_end(&jw); /* event object */ + + tr2_dst_write_line(&tr2dst_event, &jw.json); + jw_release(&jw); +} + static void fn_command_name_fl(const char *file, int line, const char *name, const char *hierarchy) { @@ -584,6 +604,7 @@ struct tr2_tgt tr2_tgt_event = { fn_atexit, fn_error_va_fl, fn_command_path_fl, + fn_command_ancestry_fl, fn_command_name_fl, fn_command_mode_fl, fn_alias_fl, diff --git a/trace2/tr2_tgt_normal.c b/trace2/tr2_tgt_normal.c index 31b602c171..a5751c8864 100644 --- a/trace2/tr2_tgt_normal.c +++ b/trace2/tr2_tgt_normal.c @@ -160,6 +160,24 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) strbuf_release(&buf_payload); } +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) +{ + const char *parent_name = NULL; + struct strbuf buf_payload = STRBUF_INIT; + + /* cmd_ancestry parent <- grandparent <- great-grandparent */ + strbuf_addstr(&buf_payload, "cmd_ancestry "); + while ((parent_name = *parent_names++)) { + strbuf_addstr(&buf_payload, parent_name); + /* if we'll write another one after this, add a delimiter */ + if (parent_names && *parent_names) + strbuf_addstr(&buf_payload, " <- "); + } + + normal_io_write_fl(file, line, &buf_payload); + strbuf_release(&buf_payload); +} + static void fn_command_name_fl(const char *file, int line, const char *name, const char *hierarchy) { @@ -306,6 +324,7 @@ struct tr2_tgt tr2_tgt_normal = { fn_atexit, fn_error_va_fl, fn_command_path_fl, + fn_command_ancestry_fl, fn_command_name_fl, fn_command_mode_fl, fn_alias_fl, diff --git a/trace2/tr2_tgt_perf.c b/trace2/tr2_tgt_perf.c index a8018f18cc..af4d65a0a5 100644 --- a/trace2/tr2_tgt_perf.c +++ b/trace2/tr2_tgt_perf.c @@ -253,6 +253,21 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) strbuf_release(&buf_payload); } +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) +{ + const char *event_name = "cmd_ancestry"; + struct strbuf buf_payload = STRBUF_INIT; + + strbuf_addstr(&buf_payload, "ancestry:["); + /* It's not an argv but the rules are basically the same. */ + sq_append_quote_argv_pretty(&buf_payload, parent_names); + strbuf_addch(&buf_payload, ']'); + + perf_io_write_fl(file, line, event_name, NULL, NULL, NULL, NULL, + &buf_payload); + strbuf_release(&buf_payload); +} + static void fn_command_name_fl(const char *file, int line, const char *name, const char *hierarchy) { @@ -532,6 +547,7 @@ struct tr2_tgt tr2_tgt_perf = { fn_atexit, fn_error_va_fl, fn_command_path_fl, + fn_command_ancestry_fl, fn_command_name_fl, fn_command_mode_fl, fn_alias_fl, -- 2.31.1.818.g46aad6cb9e-goog
On May 20, 2021 5:06 PM, Emily Shaffer wrote:
>To: git@vger.kernel.org
>Cc: Emily Shaffer <emilyshaffer@google.com>; Ævar Arnfjörð Bjarmason <avarab@gmail.com>; Junio C Hamano <gitster@pobox.com>;
>Jeff Hostetler <git@jeffhostetler.com>; Bagas Sanjaya <bagasdotme@gmail.com>
>Subject: [PATCH v2] tr2: log parent process name
>
>It can be useful to tell who invoked Git - was it invoked manually by a user via CLI or script? By an IDE? In some cases - like 'repo' tool -
>we can influence the source code and set the GIT_TRACE2_PARENT_SID environment variable from the caller process. In 'repo''s case,
>that parent SID is manipulated to include the string "repo", which means we can positively identify when Git was invoked by 'repo' tool.
>However, identifying parents that way requires both that we know which tools invoke Git and that we have the ability to modify the source
>code of those tools. It cannot scale to keep up with the various IDEs and wrappers which use Git, most of which we don't know about.
>Learning which tools and wrappers invoke Git, and how, would give us insight to decide where to improve Git's usability and performance.
>
>Unfortunately, there's no cross-platform reliable way to gather the name of the parent process. If procfs is present, we can use that;
>otherwise we will need to discover the name another way. However, the process ID should be sufficient regardless of platform.
I like this idea, but there are some platforms where this is unlikely to work. NonStop, in particular, can initiate git - and I frequently do - from a non-POSIX environment where process name is entirely different. In fact, it is something like $ABC (always beginning with a $, which makes life very difficult for shell scripts and screws up GIT_SSH_COMMAND, but I digress). I'm going to need to plug in something very platform-specific to make this work. getppid() always returns 1 in this situation, which is extraordinarily meaningless on the platform and does not represent the actual parent.
I will try to put the appropriate compat hooks in once this moves into master but I can't promise it will be particularly efficient at this stage.
Regards,
Randall
On Thu, May 20, 2021 at 05:36:25PM -0400, Randall S. Becker wrote:
>
> On May 20, 2021 5:06 PM, Emily Shaffer wrote:
> >To: git@vger.kernel.org
> >Cc: Emily Shaffer <emilyshaffer@google.com>; Ævar Arnfjörð Bjarmason <avarab@gmail.com>; Junio C Hamano <gitster@pobox.com>;
> >Jeff Hostetler <git@jeffhostetler.com>; Bagas Sanjaya <bagasdotme@gmail.com>
> >Subject: [PATCH v2] tr2: log parent process name
> >
> >It can be useful to tell who invoked Git - was it invoked manually by a user via CLI or script? By an IDE? In some cases - like 'repo' tool -
> >we can influence the source code and set the GIT_TRACE2_PARENT_SID environment variable from the caller process. In 'repo''s case,
> >that parent SID is manipulated to include the string "repo", which means we can positively identify when Git was invoked by 'repo' tool.
> >However, identifying parents that way requires both that we know which tools invoke Git and that we have the ability to modify the source
> >code of those tools. It cannot scale to keep up with the various IDEs and wrappers which use Git, most of which we don't know about.
> >Learning which tools and wrappers invoke Git, and how, would give us insight to decide where to improve Git's usability and performance.
> >
> >Unfortunately, there's no cross-platform reliable way to gather the name of the parent process. If procfs is present, we can use that;
> >otherwise we will need to discover the name another way. However, the process ID should be sufficient regardless of platform.
>
> I like this idea, but there are some platforms where this is unlikely to work. NonStop, in particular, can initiate git - and I frequently do - from a non-POSIX environment where process name is entirely different. In fact, it is something like $ABC (always beginning with a $, which makes life very difficult for shell scripts and screws up GIT_SSH_COMMAND, but I digress). I'm going to need to plug in something very platform-specific to make this work. getppid() always returns 1 in this situation, which is extraordinarily meaningless on the platform and does not represent the actual parent.
Ok. It sounds like you're saying I should be more conservative in the
commit message as well as in the #ifdef scope? Do you think this needs a
reroll to made the #ifdef more aggressive, or would you rather get to it
when you get to it?
It looks like the change in config.mak.uname won't affect NonStop; I
think also the compat/procinfo.c is probably indicative enough of "this
stuff is for procfs" that it won't look like it *should* work for
NonStop, which means that you should still get the stub for
'trace2_collect_process_info()'. But if you think the guards aren't
readable enough I can try to move them around a little more.
- Emily
Emily Shaffer <emilyshaffer@google.com> writes: > Unfortunately, there's no cross-platform reliable way to gather the name > of the parent process. If procfs is present, we can use that; otherwise > we will need to discover the name another way. However, the process ID > should be sufficient regardless of platform. Not a strong objection, but I wonder if seeing random integer(s) is better than not having cmd_ancestry info at all. The latter better signals that the platform does not yet have the "parent process name" feature, I would think. > Git for Windows also gathers information about more than one parent. In > Linux further ancestry info can be gathered with procfs, but it's > unwieldy to do so. In the interest of later moving Git for Windows > ancestry logging to the 'cmd_ancestry' event, and in the interest of > later adding more ancestry to the Linux implementation - or of adding > this functionality to other platforms which have an easier time walking > the process tree - let's make 'cmd_ancestry' accept an array of > parentage. Could we rephrase "more than one parent" at the beginning to clarify? I initially had to wonder what "an array of parentage" contains (father and mother, or a sole parent and its sole parent, which is a sole grandparent). Since there is no "multiple processes meet and spawn a single process", I take it is the latter. Perhaps "more than one generation of" or something? > +ifdef HAVE_PROCFS_LINUX > + BASIC_CFLAGS += -DHAVE_PROCFS_LINUX > + COMPAT_OBJS += compat/procinfo.o > +endif > + > ifdef HAVE_NS_GET_EXECUTABLE_PATH > BASIC_CFLAGS += -DHAVE_NS_GET_EXECUTABLE_PATH > endif > diff --git a/compat/procinfo.c b/compat/procinfo.c > new file mode 100644 > index 0000000000..523600673f > --- /dev/null > +++ b/compat/procinfo.c > @@ -0,0 +1,53 @@ > +#include "cache.h" > + > +#include "strbuf.h" > +#include "trace2.h" > + > +char *get_process_name(int pid) > +{ > +#ifdef HAVE_PROCFS_LINUX > + struct strbuf procfs_path = STRBUF_INIT; > + struct strbuf out = STRBUF_INIT; > + /* try to use procfs if it's present. */ > + strbuf_addf(&procfs_path, "/proc/%d/comm", pid); > + if (!strbuf_read_file(&out, procfs_path.buf, 0)) > + { Place this opening brace at the end of the previous line. > + /* All done with file reads, clean up early */ > + strbuf_release(&procfs_path); > + return strbuf_detach(&out, NULL); > + } > +#endif > + > + /* NEEDSWORK: add non-procfs implementations here. */ > + return NULL; > +} > +void trace2_collect_process_info(enum trace2_process_info_reason reason) > +{ > + if (!trace2_is_enabled()) > + return; > + > + /* someday we may want to write something extra here, but not today */ > + if (reason == TRACE2_PROCESS_INFO_EXIT) > + return; > + > + if (reason == TRACE2_PROCESS_INFO_STARTUP) > + { Ditto. > + /* > + * NEEDSWORK: we could do the entire ptree in an array instead, > + * see compat/win32/trace2_win32_process_info.c. > + */ > + char *names[2]; > + names[0] = get_process_name(getppid()); > + names[1] = NULL; > + > + if (!names[0]) > + return; OK, so if there is no name given, we do not show pid as a placeholder. > + trace2_cmd_ancestry((const char**)names); > + > + free(names[0]); > + } > + > + return; > +} > diff --git a/config.mak.uname b/config.mak.uname > index cb443b4e02..7ad110a1d2 100644 > --- a/config.mak.uname > +++ b/config.mak.uname > @@ -58,6 +58,7 @@ ifeq ($(uname_S),Linux) > FREAD_READS_DIRECTORIES = UnfortunatelyYes > BASIC_CFLAGS += -DHAVE_SYSINFO > PROCFS_EXECUTABLE_PATH = /proc/self/exe > + HAVE_PROCFS_LINUX = YesPlease Have all Linux instances procfs enabled and mounted? It might be that we need to detect this at runtime anyway? ... goes and thinks ... Ah, OK, that "try reading from proc/%d/comm" is the runtime detection, so it is only this Makefile variable is slightly misnamed (it is not "HAVE" but "is worth checking for it"). Makes sense.
On May 20, 2021 7:24 PM, Emily Shaffer wrote: > >On Thu, May 20, 2021 at 05:36:25PM -0400, Randall S. Becker wrote: >> >> On May 20, 2021 5:06 PM, Emily Shaffer wrote: >> >To: git@vger.kernel.org >> >Cc: Emily Shaffer <emilyshaffer@google.com>; Ævar Arnfjörð Bjarmason >> ><avarab@gmail.com>; Junio C Hamano <gitster@pobox.com>; Jeff >> >Hostetler <git@jeffhostetler.com>; Bagas Sanjaya >> ><bagasdotme@gmail.com> >> >Subject: [PATCH v2] tr2: log parent process name >> > >> >It can be useful to tell who invoked Git - was it invoked manually by >> >a user via CLI or script? By an IDE? In some cases - like 'repo' >> >tool - we can influence the source code and set the GIT_TRACE2_PARENT_SID environment variable from the caller process. In 'repo''s >case, that parent SID is manipulated to include the string "repo", which means we can positively identify when Git was invoked by 'repo' >tool. >> >However, identifying parents that way requires both that we know >> >which tools invoke Git and that we have the ability to modify the source code of those tools. It cannot scale to keep up with the various >IDEs and wrappers which use Git, most of which we don't know about. >> >Learning which tools and wrappers invoke Git, and how, would give us insight to decide where to improve Git's usability and >performance. >> > >> >Unfortunately, there's no cross-platform reliable way to gather the >> >name of the parent process. If procfs is present, we can use that; otherwise we will need to discover the name another way. However, >the process ID should be sufficient regardless of platform. >> >> I like this idea, but there are some platforms where this is unlikely to work. NonStop, in particular, can initiate git - and I frequently do - >from a non-POSIX environment where process name is entirely different. In fact, it is something like $ABC (always beginning with a $, >which makes life very difficult for shell scripts and screws up GIT_SSH_COMMAND, but I digress). I'm going to need to plug in something >very platform-specific to make this work. getppid() always returns 1 in this situation, which is extraordinarily meaningless on the platform >and does not represent the actual parent. > >Ok. It sounds like you're saying I should be more conservative in the commit message as well as in the #ifdef scope? Do you think this >needs a reroll to made the #ifdef more aggressive, or would you rather get to it when you get to it? I'll get to it pretty quickly once it's rolled in. >It looks like the change in config.mak.uname won't affect NonStop; I think also the compat/procinfo.c is probably indicative enough of "this >stuff is for procfs" that it won't look like it *should* work for NonStop, which means that you should still get the stub for >'trace2_collect_process_info()'. But if you think the guards aren't readable enough I can try to move them around a little more. Guards are fine. There's just a lot more work to do for me. We need to make sure that the rendering of ancestor processes are generic enough not to be just pid_t through any interfaces where this is queried. In NonStop's case, char[25], should be sufficient for the short term, but I would prefer something longer, say char[128] to be safe for the future in which to stick the ancestor. To be completely unique, the ancestor is going to look like \node.$proc:sequence (where node is a 7 character name, proc is a 5 character name, and sequence is currently a long. -Randall
On May 21, 2021 9:21 AM, I wrote: >To: 'Emily Shaffer' <emilyshaffer@google.com> >Cc: git@vger.kernel.org; 'Ævar Arnfjörð Bjarmason' <avarab@gmail.com>; 'Junio C Hamano' <gitster@pobox.com>; 'Jeff Hostetler' ><git@jeffhostetler.com>; 'Bagas Sanjaya' <bagasdotme@gmail.com> >Subject: RE: [PATCH v2] tr2: log parent process name > >On May 20, 2021 7:24 PM, Emily Shaffer wrote: >> >>On Thu, May 20, 2021 at 05:36:25PM -0400, Randall S. Becker wrote: >>> >>> On May 20, 2021 5:06 PM, Emily Shaffer wrote: >>> >To: git@vger.kernel.org >>> >Cc: Emily Shaffer <emilyshaffer@google.com>; Ævar Arnfjörð Bjarmason >>> ><avarab@gmail.com>; Junio C Hamano <gitster@pobox.com>; Jeff >>> >Hostetler <git@jeffhostetler.com>; Bagas Sanjaya >>> ><bagasdotme@gmail.com> >>> >Subject: [PATCH v2] tr2: log parent process name >>> > >>> >It can be useful to tell who invoked Git - was it invoked manually >>> >by a user via CLI or script? By an IDE? In some cases - like 'repo' >>> >tool - we can influence the source code and set the >>> >GIT_TRACE2_PARENT_SID environment variable from the caller process. >>> >In >'repo''s >>case, that parent SID is manipulated to include the string "repo", >>which means we can positively identify when Git was invoked by >'repo' >>tool. >>> >However, identifying parents that way requires both that we know >>> >which tools invoke Git and that we have the ability to modify the >>> >source code of those tools. It cannot scale to keep up with >the various >>IDEs and wrappers which use Git, most of which we don't know about. >>> >Learning which tools and wrappers invoke Git, and how, would give us >>> >insight to decide where to improve Git's usability and >>performance. >>> > >>> >Unfortunately, there's no cross-platform reliable way to gather the >>> >name of the parent process. If procfs is present, we can use that; otherwise we will need to discover the name another way. >However, >>the process ID should be sufficient regardless of platform. >>> >>> I like this idea, but there are some platforms where this is unlikely >>> to work. NonStop, in particular, can initiate git - and I >frequently do - >>from a non-POSIX environment where process name is entirely different. >>In fact, it is something like $ABC (always beginning with a >$, >>which makes life very difficult for shell scripts and screws up >>GIT_SSH_COMMAND, but I digress). I'm going to need to plug in >something >>very platform-specific to make this work. getppid() always returns 1 in >>this situation, which is extraordinarily meaningless on the >platform >>and does not represent the actual parent. >> >>Ok. It sounds like you're saying I should be more conservative in the >>commit message as well as in the #ifdef scope? Do you think >this >>needs a reroll to made the #ifdef more aggressive, or would you rather get to it when you get to it? > >I'll get to it pretty quickly once it's rolled in. > >>It looks like the change in config.mak.uname won't affect NonStop; I >>think also the compat/procinfo.c is probably indicative enough >of "this >>stuff is for procfs" that it won't look like it *should* work for >>NonStop, which means that you should still get the stub for 'trace2_collect_process_info()'. But if you think the guards aren't readable >enough I can try to move them around a little more. > >Guards are fine. There's just a lot more work to do for me. We need to make sure that the rendering of ancestor processes are generic >enough not to be just pid_t through any interfaces where this is queried. In NonStop's case, char[25], should be sufficient for the short >term, but I would prefer something longer, say char[128] to be safe for the future in which to stick the ancestor. To be completely unique, >the ancestor is going to look like \node.$proc:sequence (where node is a 7 character name, proc is a 5 character name, and sequence is >currently a long. Just so we know what's coming, the code snippet to get a parent on NonStop is as follows (roughly): pid_t ossParent = getppid(); if (ossParent == 1) { short pHandle[10]; short pAncestor[10]; short ancestorLength; short error; short attributes[] = { 40 }; /* MOM Process */ short processNameLength; char processName[64]; PROCESSHANDLE_NULLIT_(pHandle); PROCESS_GETINFO_(pHandle); error = PROCESS_GETINFOLIST_(,,,, pHandle, attributes, (short) sizeof(attributes)/sizeof(attributes[0]), pAncestor, (short) sizeof(pAncestor), &ancestorLength); if (error) { printf("Cannot process parent. Error %d\n", error); return; } PROCESSHANDLE_TO_FILENAME_(pAncestor, processName, (short) sizeof(processName), &processNameLength); processName[processNameLength] = '\0'; printf("GUARDIAN Parent %s\n", processName); } else { printf("OSS Parent %d\n", ossParent); } Which in the test program generates GUARDIAN Parent \HPITUG.$:3:1100:583555076 Regards, Randall
On Fri, May 21, 2021 at 11:09:49AM +0900, Junio C Hamano wrote: > > Emily Shaffer <emilyshaffer@google.com> writes: > > > Unfortunately, there's no cross-platform reliable way to gather the name > > of the parent process. If procfs is present, we can use that; otherwise > > we will need to discover the name another way. However, the process ID > > should be sufficient regardless of platform. > > Not a strong objection, but I wonder if seeing random integer(s) is > better than not having cmd_ancestry info at all. The latter better > signals that the platform does not yet have the "parent process > name" feature, I would think. Hm, we could; I guess then analyzers could still correlate "all these things were called by some kind of wrapper, even if we don't know if that was an IDE or a script or just the user or what, we can guess based on other heuristics". Ok. > > > Git for Windows also gathers information about more than one parent. In > > Linux further ancestry info can be gathered with procfs, but it's > > unwieldy to do so. In the interest of later moving Git for Windows > > ancestry logging to the 'cmd_ancestry' event, and in the interest of > > later adding more ancestry to the Linux implementation - or of adding > > this functionality to other platforms which have an easier time walking > > the process tree - let's make 'cmd_ancestry' accept an array of > > parentage. > > Could we rephrase "more than one parent" at the beginning to > clarify? I initially had to wonder what "an array of parentage" > contains (father and mother, or a sole parent and its sole parent, > which is a sole grandparent). Since there is no "multiple processes > meet and spawn a single process", I take it is the latter. Perhaps > "more than one generation of" or something? Good point, will refer to "more than one generation". Thanks. > > + if (!strbuf_read_file(&out, procfs_path.buf, 0)) > > + { > > Place this opening brace at the end of the previous line. Will polish up this and others for v3, hopefully today. > > + if (!names[0]) > > + return; > > OK, so if there is no name given, we do not show pid as a > placeholder. Based on your suggestion above I think it will make sense to show pid as placeholder after all, though. So I will change that for v3. > > PROCFS_EXECUTABLE_PATH = /proc/self/exe > > + HAVE_PROCFS_LINUX = YesPlease > > Have all Linux instances procfs enabled and mounted? It might be > that we need to detect this at runtime anyway? > > ... goes and thinks ... > > Ah, OK, that "try reading from proc/%d/comm" is the runtime > detection, so it is only this Makefile variable is slightly > misnamed (it is not "HAVE" but "is worth checking for it"). I wonder what is better. "MAYBE_PROCFS_LINUX"? I don't see any other vars in config.mak.uname that indicate "pretty sure but not totally sure" in a quick scan. However, right above this line we seem to feel certain in our guess about "PROCFS_EXECUTABLE_PATH"... ...but when it is used in exec-cmd.c:git_get_exec_path_procfs(), invoked by exec-cmd.c:git_get_exec_path(), we're very tolerant to faults if it's not there: static int git_get_exec_path(struct strbuf *buf, const char *argv0) { /* * [snip] * Each of these functions returns 0 on success, so evaluation will stop * after the first successful method. */ if ( #ifdef HAVE_BSD_KERN_PROC_SYSCTL git_get_exec_path_bsd_sysctl(buf) && #endif /* HAVE_BSD_KERN_PROC_SYSCTL */ #ifdef HAVE_NS_GET_EXECUTABLE_PATH git_get_exec_path_darwin(buf) && #endif /* HAVE_NS_GET_EXECUTABLE_PATH */ #ifdef PROCFS_EXECUTABLE_PATH git_get_exec_path_procfs(buf) && /*** <- OK if fails ***/ #endif /* PROCFS_EXECUTABLE_PATH */ #ifdef HAVE_WPGMPTR git_get_exec_path_wpgmptr(buf) && #endif /* HAVE_WPGMPTR */ git_get_exec_path_from_argv0(buf, argv0)) { return -1; } [snip] #ifdef PROCFS_EXECUTABLE_PATH /* * Resolves the executable path by examining a procfs symlink. * * Returns 0 on success, -1 on failure. */ static int git_get_exec_path_procfs(struct strbuf *buf) { if (strbuf_realpath(buf, PROCFS_EXECUTABLE_PATH, 0)) { trace_printf( "trace: resolved executable path from procfs: %s\n", buf->buf); return 0; } return -1; } #endif /* PROCFS_EXECUTABLE_PATH */ So it seems this other procfs bit takes the "probably but not definitely" step and is tolerant at runtime as well. Which doesn't help me much to decide how to rename HAVE_PROCFS_LINUX. I'll switch it to MAYBE_PROCFS_LINUX for v3 unless someone yells, I guess. - Emily
On 5/20/21 5:05 PM, Emily Shaffer wrote: > It can be useful to tell who invoked Git - was it invoked manually by a > user via CLI or script? By an IDE? In some cases - like 'repo' tool - > we can influence the source code and set the GIT_TRACE2_PARENT_SID > environment variable from the caller process. In 'repo''s case, that > parent SID is manipulated to include the string "repo", which means we > can positively identify when Git was invoked by 'repo' tool. However, > identifying parents that way requires both that we know which tools > invoke Git and that we have the ability to modify the source code of > those tools. It cannot scale to keep up with the various IDEs and > wrappers which use Git, most of which we don't know about. Learning > which tools and wrappers invoke Git, and how, would give us insight to > decide where to improve Git's usability and performance. > > Unfortunately, there's no cross-platform reliable way to gather the name > of the parent process. If procfs is present, we can use that; otherwise > we will need to discover the name another way. However, the process ID > should be sufficient regardless of platform. > > Git for Windows gathers similar information and logs it as a "data_json" > event. However, since "data_json" has a variable format, it is difficult > to parse effectively in some languages; instead, let's pursue a > dedicated "cmd_ancestry" event to record information about the ancestry > of the current process and a consistent, parseable way. > > Git for Windows also gathers information about more than one parent. In > Linux further ancestry info can be gathered with procfs, but it's > unwieldy to do so. In the interest of later moving Git for Windows > ancestry logging to the 'cmd_ancestry' event, and in the interest of > later adding more ancestry to the Linux implementation - or of adding > this functionality to other platforms which have an easier time walking > the process tree - let's make 'cmd_ancestry' accept an array of > parentage. > > Signed-off-by: Emily Shaffer <emilyshaffer@google.com> > --- > > Hi folks, the comments I received in v1 were of two varieties: > 1) "There are better ways to make this platform-safe", and > 2) "Your commit message doesn't convince me". > Since I sent v1, though, I also learned a little more about procfs, and > about the trace2 structure overall, so there are some pretty significant > differences from v1: > > - I took a look at Jeff H's advice on using a "data_json" event to log > this and decided it would be a little more flexible to add a new event > instead. If we want, it'd be feasible to then shoehorn the GfW parent > tree stuff into this new event too. Doing it this way is definitely > easier to parse for Google's trace analysis system (which for now > completely skips "data_json" as it's polymorphic), and also - I think > - means that we can add more fields later on if we need to (thread > info, different fields than just /proc/n/comm like exec path, argv, > whatever). I could argue both sides of this, so I guess it is fine either way. In GFW I log a array of argv[0] strings in a generic "data_json" event. I could also log additional "data_json" events with more structured data if needed. On the other hand, you're proposing a "cmd_ancestry" event with a single array of strings. You would have to expand the call signature of the trace2_cmd_ancestry() API to add additional data and inside tr2_tgt_event.c add additional fields to the JSON being composed. So both are about equal. (I'll avoid the temptation to make a snarky comment about fixing your post processing. :-) :-) :-) ) It really doesn't matter one way or the other. > - Jonathan N also pointed out to me that /proc/n/comm exists, and logs > the "command name" - excluding argv, excluding path, etc. It seems So you're trying to log argv[0] of the process and not the full command line. That's what I'm doing. > like this is a little more safe about excluding personal information > from the traces which take the form of "myscript.sh > --password=hunter2", but would still be worrisome for something like > "mysupersecretproject.sh". I'm not sure whether that means we still > want to guard it with a config flag, though. You might check whether you get the name of the script or just get a lot of entries with just "/usr/bin/bash". There's lots of PII in the data stream to worry about. The name of the command is just one aspect, but I digress. > - I also added a lot to the commit message; hopefully it's not too > rambly, but I hoped to explain why just setting GIT_TRACE2_PARENT_SID > wasn't going to cut it. > - As for testing, I followed the lead of GfW's parentage info - "this > isn't portable so writing tests for it will suck, just scrub it from > the tests". Maybe it makes sense to do some more > platform-specific-ness in the test suite instead? I wasn't sure. yeah, that's probably best. Unless you can tokenize it properly so that you can predict the results in a HEREDOC in the test source. For example, you might try to test tracing a command (where a top-level "git foo" (SPACE form) spawns a "git-foo" (DASHED form) and check the output for the child. > > Thanks, all. > - Emily > > Makefile | 5 ++++ > compat/procinfo.c | 53 +++++++++++++++++++++++++++++++++++++++ > config.mak.uname | 1 + > git-compat-util.h | 6 +++++ > t/t0210/scrub_normal.perl | 6 +++++ > t/t0211/scrub_perf.perl | 5 ++++ > t/t0212/parse_events.perl | 5 +++- > trace2.c | 13 ++++++++++ > trace2.h | 12 ++++++++- > trace2/tr2_tgt.h | 3 +++ > trace2/tr2_tgt_event.c | 21 ++++++++++++++++ > trace2/tr2_tgt_normal.c | 19 ++++++++++++++ > trace2/tr2_tgt_perf.c | 16 ++++++++++++ > 13 files changed, 163 insertions(+), 2 deletions(-) > create mode 100644 compat/procinfo.c > > diff --git a/Makefile b/Makefile > index 93664d6714..330e4fa011 100644 > --- a/Makefile > +++ b/Makefile > @@ -1889,6 +1889,11 @@ ifneq ($(PROCFS_EXECUTABLE_PATH),) > BASIC_CFLAGS += '-DPROCFS_EXECUTABLE_PATH="$(procfs_executable_path_SQ)"' > endif > > +ifdef HAVE_PROCFS_LINUX > + BASIC_CFLAGS += -DHAVE_PROCFS_LINUX > + COMPAT_OBJS += compat/procinfo.o > +endif > + > ifdef HAVE_NS_GET_EXECUTABLE_PATH > BASIC_CFLAGS += -DHAVE_NS_GET_EXECUTABLE_PATH > endif > diff --git a/compat/procinfo.c b/compat/procinfo.c > new file mode 100644 > index 0000000000..523600673f > --- /dev/null > +++ b/compat/procinfo.c > @@ -0,0 +1,53 @@ > +#include "cache.h" > + > +#include "strbuf.h" > +#include "trace2.h" > + > +char *get_process_name(int pid) > +{ > +#ifdef HAVE_PROCFS_LINUX > + struct strbuf procfs_path = STRBUF_INIT; > + struct strbuf out = STRBUF_INIT; > + /* try to use procfs if it's present. */ > + strbuf_addf(&procfs_path, "/proc/%d/comm", pid); > + if (!strbuf_read_file(&out, procfs_path.buf, 0)) > + { > + /* All done with file reads, clean up early */ > + strbuf_release(&procfs_path); > + return strbuf_detach(&out, NULL); > + } > +#endif > + > + /* NEEDSWORK: add non-procfs implementations here. */ > + return NULL; > +} > + > +void trace2_collect_process_info(enum trace2_process_info_reason reason) > +{ > + if (!trace2_is_enabled()) > + return; > + > + /* someday we may want to write something extra here, but not today */ > + if (reason == TRACE2_PROCESS_INFO_EXIT) > + return; > + > + if (reason == TRACE2_PROCESS_INFO_STARTUP) > + { > + /* > + * NEEDSWORK: we could do the entire ptree in an array instead, > + * see compat/win32/trace2_win32_process_info.c. > + */ > + char *names[2]; > + names[0] = get_process_name(getppid()); > + names[1] = NULL; You're only logging 1 parent. That's fine to get started. I'm logging IIRC 10 parents on GFW. That might seem overkill, but there are lots of intermediate parents that hide what is happening. For example, a "git push" might spawn "git remote-https" which spawns "git-remote-https" which spawn "git send-pack" which spawns "git pack-objects". And that doesn't include who called push. And it's not uncommon to see 2 or 3 "bash" entries in the array because of the bash scripts being run. > + > + if (!names[0]) > + return; > + > + trace2_cmd_ancestry((const char**)names); > + > + free(names[0]); > + } > + > + return; > +} > diff --git a/config.mak.uname b/config.mak.uname > index cb443b4e02..7ad110a1d2 100644 > --- a/config.mak.uname > +++ b/config.mak.uname > @@ -58,6 +58,7 @@ ifeq ($(uname_S),Linux) > FREAD_READS_DIRECTORIES = UnfortunatelyYes > BASIC_CFLAGS += -DHAVE_SYSINFO > PROCFS_EXECUTABLE_PATH = /proc/self/exe > + HAVE_PROCFS_LINUX = YesPlease > endif > ifeq ($(uname_S),GNU/kFreeBSD) > HAVE_ALLOCA_H = YesPlease > diff --git a/git-compat-util.h b/git-compat-util.h > index a508dbe5a3..cc7d5d8a2a 100644 > --- a/git-compat-util.h > +++ b/git-compat-util.h > @@ -1382,4 +1382,10 @@ static inline void *container_of_or_null_offset(void *ptr, size_t offset) > > void sleep_millisec(int millisec); > > +/* > + * Convert PID to process name (as would show in top/task manager). Returns > + * NULL if unimplemented - be sure to check for NULL at callsite. > + */ > +char *get_process_name(int pid); > + > #endif > diff --git a/t/t0210/scrub_normal.perl b/t/t0210/scrub_normal.perl > index c65d1a815e..7cc4de392a 100644 > --- a/t/t0210/scrub_normal.perl > +++ b/t/t0210/scrub_normal.perl > @@ -42,6 +42,12 @@ > # so just omit it for testing purposes. > # print "cmd_path _EXE_\n"; > } > + elsif ($line =~ m/^cmd_ancestry/) { > + # 'cmd_ancestry' is not implemented everywhere, so for portability's > + # sake, skip it when parsing normal. > + # > + # print "$line"; > + } > else { > print "$line"; > } > diff --git a/t/t0211/scrub_perf.perl b/t/t0211/scrub_perf.perl > index 351af7844e..d164b750ff 100644 > --- a/t/t0211/scrub_perf.perl > +++ b/t/t0211/scrub_perf.perl > @@ -44,6 +44,11 @@ > # $tokens[$col_rest] = "_EXE_"; > goto SKIP_LINE; > } > + elsif ($tokens[$col_event] =~ m/cmd_ancestry/) { > + # 'cmd_ancestry' is platform-specific and not implemented everywhere, > + # so skip it. > + goto SKIP_LINE; > + } > elsif ($tokens[$col_event] =~ m/child_exit/) { > $tokens[$col_rest] =~ s/ pid:\d* / pid:_PID_ /; > } > diff --git a/t/t0212/parse_events.perl b/t/t0212/parse_events.perl > index 6584bb5634..b6408560c0 100644 > --- a/t/t0212/parse_events.perl > +++ b/t/t0212/parse_events.perl > @@ -132,7 +132,10 @@ > # just omit it for testing purposes. > # $processes->{$sid}->{'path'} = "_EXE_"; > } > - > + elsif ($event eq 'cmd_ancestry') { > + # 'cmd_ancestry' is platform-specific and not implemented everywhere, so > + # just skip it for testing purposes. > + } > elsif ($event eq 'cmd_name') { > $processes->{$sid}->{'name'} = $line->{'name'}; > $processes->{$sid}->{'hierarchy'} = $line->{'hierarchy'}; > diff --git a/trace2.c b/trace2.c > index 256120c7fd..b9b154ac44 100644 > --- a/trace2.c > +++ b/trace2.c > @@ -260,6 +260,19 @@ void trace2_cmd_path_fl(const char *file, int line, const char *pathname) > tgt_j->pfn_command_path_fl(file, line, pathname); > } > > +void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names) > +{ > + struct tr2_tgt *tgt_j; > + int j; > + > + if (!trace2_enabled) > + return; > + > + for_each_wanted_builtin (j, tgt_j) > + if (tgt_j->pfn_command_ancestry_fl) > + tgt_j->pfn_command_ancestry_fl(file, line, parent_names); > +} > + > void trace2_cmd_name_fl(const char *file, int line, const char *name) > { > struct tr2_tgt *tgt_j; > diff --git a/trace2.h b/trace2.h > index ede18c2e06..23743ac62b 100644 > --- a/trace2.h > +++ b/trace2.h > @@ -133,6 +133,16 @@ void trace2_cmd_path_fl(const char *file, int line, const char *pathname); > > #define trace2_cmd_path(p) trace2_cmd_path_fl(__FILE__, __LINE__, (p)) > > +/* > + * Emit an 'ancestry' event with the process name of the current process's > + * parent process. > + * This gives post-processors a way to determine what invoked the command and > + * learn more about usage patterns. > + */ > +void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names); > + > +#define trace2_cmd_ancestry(v) trace2_cmd_ancestry_fl(__FILE__, __LINE__, (v)) > + > /* > * Emit a 'cmd_name' event with the canonical name of the command. > * This gives post-processors a simple field to identify the command > @@ -492,7 +502,7 @@ enum trace2_process_info_reason { > TRACE2_PROCESS_INFO_EXIT, > }; > > -#if defined(GIT_WINDOWS_NATIVE) > +#if ( defined(GIT_WINDOWS_NATIVE) || defined(HAVE_PROCFS_LINUX) ) > void trace2_collect_process_info(enum trace2_process_info_reason reason); > #else > #define trace2_collect_process_info(reason) \ > diff --git a/trace2/tr2_tgt.h b/trace2/tr2_tgt.h > index 7b90469212..1f66fd6573 100644 > --- a/trace2/tr2_tgt.h > +++ b/trace2/tr2_tgt.h > @@ -27,6 +27,8 @@ typedef void(tr2_tgt_evt_error_va_fl_t)(const char *file, int line, > > typedef void(tr2_tgt_evt_command_path_fl_t)(const char *file, int line, > const char *command_path); > +typedef void(tr2_tgt_evt_command_ancestry_fl_t)(const char *file, int line, > + const char **parent_names); > typedef void(tr2_tgt_evt_command_name_fl_t)(const char *file, int line, > const char *name, > const char *hierarchy); > @@ -108,6 +110,7 @@ struct tr2_tgt { > tr2_tgt_evt_atexit_t *pfn_atexit; > tr2_tgt_evt_error_va_fl_t *pfn_error_va_fl; > tr2_tgt_evt_command_path_fl_t *pfn_command_path_fl; > + tr2_tgt_evt_command_ancestry_fl_t *pfn_command_ancestry_fl; > tr2_tgt_evt_command_name_fl_t *pfn_command_name_fl; > tr2_tgt_evt_command_mode_fl_t *pfn_command_mode_fl; > tr2_tgt_evt_alias_fl_t *pfn_alias_fl; > diff --git a/trace2/tr2_tgt_event.c b/trace2/tr2_tgt_event.c > index 6353e8ad91..578a9a5287 100644 > --- a/trace2/tr2_tgt_event.c > +++ b/trace2/tr2_tgt_event.c > @@ -261,6 +261,26 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) > jw_release(&jw); > } > > +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) > +{ > + const char *event_name = "cmd_ancestry"; > + const char *parent_name = NULL; > + struct json_writer jw = JSON_WRITER_INIT; > + > + jw_object_begin(&jw, 0); > + event_fmt_prepare(event_name, file, line, NULL, &jw); > + jw_object_inline_begin_array(&jw, "ancestry"); > + > + while ((parent_name = *parent_names++)) > + jw_array_string(&jw, parent_name); You're building the array with the immediate parent in a[0] and the grandparent in a[1], and etc. This is the same as I did in GFW. Perhaps state this in the docs somewhere. > + > + jw_end(&jw); /* 'ancestry' array */ > + jw_end(&jw); /* event object */ > + > + tr2_dst_write_line(&tr2dst_event, &jw.json); > + jw_release(&jw); > +} > + > static void fn_command_name_fl(const char *file, int line, const char *name, > const char *hierarchy) > { > @@ -584,6 +604,7 @@ struct tr2_tgt tr2_tgt_event = { > fn_atexit, > fn_error_va_fl, > fn_command_path_fl, > + fn_command_ancestry_fl, > fn_command_name_fl, > fn_command_mode_fl, > fn_alias_fl, > diff --git a/trace2/tr2_tgt_normal.c b/trace2/tr2_tgt_normal.c > index 31b602c171..a5751c8864 100644 > --- a/trace2/tr2_tgt_normal.c > +++ b/trace2/tr2_tgt_normal.c > @@ -160,6 +160,24 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) > strbuf_release(&buf_payload); > } > > +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) > +{ > + const char *parent_name = NULL; > + struct strbuf buf_payload = STRBUF_INIT; > + > + /* cmd_ancestry parent <- grandparent <- great-grandparent */ > + strbuf_addstr(&buf_payload, "cmd_ancestry "); > + while ((parent_name = *parent_names++)) { > + strbuf_addstr(&buf_payload, parent_name); Did you want to quote each parent's name? > + /* if we'll write another one after this, add a delimiter */ > + if (parent_names && *parent_names) > + strbuf_addstr(&buf_payload, " <- "); > + } > + > + normal_io_write_fl(file, line, &buf_payload); > + strbuf_release(&buf_payload); > +} > + > static void fn_command_name_fl(const char *file, int line, const char *name, > const char *hierarchy) > { > @@ -306,6 +324,7 @@ struct tr2_tgt tr2_tgt_normal = { > fn_atexit, > fn_error_va_fl, > fn_command_path_fl, > + fn_command_ancestry_fl, > fn_command_name_fl, > fn_command_mode_fl, > fn_alias_fl, > diff --git a/trace2/tr2_tgt_perf.c b/trace2/tr2_tgt_perf.c > index a8018f18cc..af4d65a0a5 100644 > --- a/trace2/tr2_tgt_perf.c > +++ b/trace2/tr2_tgt_perf.c > @@ -253,6 +253,21 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) > strbuf_release(&buf_payload); > } > > +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) > +{ > + const char *event_name = "cmd_ancestry"; > + struct strbuf buf_payload = STRBUF_INIT; > + > + strbuf_addstr(&buf_payload, "ancestry:["); > + /* It's not an argv but the rules are basically the same. */ > + sq_append_quote_argv_pretty(&buf_payload, parent_names); This will have whitespace delimiters between the quoted strings rather than commas. Just checking if that's what you wanted. I'm not sure it matters, since this stream is intended for human parsing. > + strbuf_addch(&buf_payload, ']'); > + > + perf_io_write_fl(file, line, event_name, NULL, NULL, NULL, NULL, > + &buf_payload); > + strbuf_release(&buf_payload); > +} > + > static void fn_command_name_fl(const char *file, int line, const char *name, > const char *hierarchy) > { > @@ -532,6 +547,7 @@ struct tr2_tgt tr2_tgt_perf = { > fn_atexit, > fn_error_va_fl, > fn_command_path_fl, > + fn_command_ancestry_fl, > fn_command_name_fl, > fn_command_mode_fl, > fn_alias_fl, > We should update Documentation/technical/api-trace2.txt too. Thanks, Jeff
On Fri, May 21, 2021 at 03:15:16PM -0400, Jeff Hostetler wrote: > On 5/20/21 5:05 PM, Emily Shaffer wrote: > > - I took a look at Jeff H's advice on using a "data_json" event to log > > this and decided it would be a little more flexible to add a new event > > instead. If we want, it'd be feasible to then shoehorn the GfW parent > > tree stuff into this new event too. Doing it this way is definitely > > easier to parse for Google's trace analysis system (which for now > > completely skips "data_json" as it's polymorphic), and also - I think > > - means that we can add more fields later on if we need to (thread > > info, different fields than just /proc/n/comm like exec path, argv, > > whatever). > > I could argue both sides of this, so I guess it is fine either way. > > In GFW I log a array of argv[0] strings in a generic "data_json" event. > I could also log additional "data_json" events with more structured > data if needed. > > On the other hand, you're proposing a "cmd_ancestry" event with a > single array of strings. You would have to expand the call signature > of the trace2_cmd_ancestry() API to add additional data and inside > tr2_tgt_event.c add additional fields to the JSON being composed. > > So both are about equal. > > (I'll avoid the temptation to make a snarky comment about fixing > your post processing. :-) :-) :-) ) ;P (I don't have much to add - this is an accurate summary of what I thought about, too. Thanks for writing it out.) > > It really doesn't matter one way or the other. > > > - Jonathan N also pointed out to me that /proc/n/comm exists, and logs > > the "command name" - excluding argv, excluding path, etc. It seems > > So you're trying to log argv[0] of the process and not the full > command line. That's what I'm doing. It's close to argv[0], yeah. POSIX docs indicate it might be truncated in a way that argv[0] hasn't been, but it also doesn't include the leading path (as far as I've seen). For example, a long-running helper script I use with mutt, right now (gaffing on line length in email to help with argv clarity, sorry): $ ps aux | grep mutt emilysh+ 4119883 0.0 0.0 6892 3600 pts/6 S+ 12:44 0:00 /bin/bash /usr/local/google/home/emilyshaffer/dotfiles/open-vim-in-new-split.sh /var/tmp/mutt-podkayne-413244-1263002-7433772284891386689 # comm is truncated to 15ch, except apparently in the cases of some # kernel worker processes I saw with much longer names? $ cat /proc/4119883/comm open-vim-in-new # exe is a link to the executable, which means bash as this is a # script $ ls -lha /proc/4119883/exe lrwxrwxrwx 1 emilyshaffer primarygroup 0 May 21 12:44 /proc/4119883/exe -> /usr/bin/bash # cmdline has the whole argv, separated on NUL so it runs together in # editor $ cat /proc/4119883/cmdline /bin/bash/usr/local/google/home/emilyshaffer/dotfiles/open-vim-in-new-split.sh/var/tmp/mutt-podkayne-413244-1263002-7433772284891386689 Jonathan N pointed out that the process name (the thing in 'comm') can also be manually manipulated by the process itself, and 'man procfs' also talks about 'PR_SET_NAME' and 'PR_GET_NAME' operations in 'prctl()', so that tracks. (It doesn't look like we can use prctl() to find out the names of processes besides the current process, though, so the procfs stuff is still needed. Dang.) > > > like this is a little more safe about excluding personal information > > from the traces which take the form of "myscript.sh > > --password=hunter2", but would still be worrisome for something like > > "mysupersecretproject.sh". I'm not sure whether that means we still > > want to guard it with a config flag, though. > > You might check whether you get the name of the script or just get > a lot of entries with just "/usr/bin/bash". See above :) > There's lots of PII in the data stream to worry about. > The name of the command is just one aspect, but I digress. Yes, that's what we've noticed too, so a process name isn't worrying us that much more. > > > - I also added a lot to the commit message; hopefully it's not too > > rambly, but I hoped to explain why just setting GIT_TRACE2_PARENT_SID > > wasn't going to cut it. > > - As for testing, I followed the lead of GfW's parentage info - "this > > isn't portable so writing tests for it will suck, just scrub it from > > the tests". Maybe it makes sense to do some more > > platform-specific-ness in the test suite instead? I wasn't sure. > > yeah, that's probably best. Unless you can tokenize it properly > so that you can predict the results in a HEREDOC in the test source. > > For example, you might try to test tracing a command (where a top-level > "git foo" (SPACE form) spawns a "git-foo" (DASHED form) and check the > output for the child. Yeah, I had trouble with even deciding when to attempt such a check or not. > > + if (reason == TRACE2_PROCESS_INFO_STARTUP) > > + { > > + /* > > + * NEEDSWORK: we could do the entire ptree in an array instead, > > + * see compat/win32/trace2_win32_process_info.c. > > + */ > > + char *names[2]; > > + names[0] = get_process_name(getppid()); > > + names[1] = NULL; > > You're only logging 1 parent. That's fine to get started. > > I'm logging IIRC 10 parents on GFW. That might seem overkill, > but there are lots of intermediate parents that hide what is > happening. For example, a "git push" might spawn "git remote-https" > which spawns "git-remote-https" which spawn "git send-pack" which > spawns "git pack-objects". > > And that doesn't include who called push. > > And it's not uncommon to see 2 or 3 "bash" entries in the array > because of the bash scripts being run. Agree. But it's expensive - I didn't find a handy library call to find "parent ID of given process ID", so I think we'd have to manipulate procfs; and so far I only see parent ID in summary infos like /proc/n/status or /proc/n/stat, which contain lots of other info too and would need parsing. We could reduce the cost a little bit by grabbing the process name from the status or stat as well, and therefore still only opening one file per process, but I'd want to check whether the formats are expected to be stable for those things. > > +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) > > +{ > > + const char *event_name = "cmd_ancestry"; > > + const char *parent_name = NULL; > > + struct json_writer jw = JSON_WRITER_INIT; > > + > > + jw_object_begin(&jw, 0); > > + event_fmt_prepare(event_name, file, line, NULL, &jw); > > + jw_object_inline_begin_array(&jw, "ancestry"); > > + > > + while ((parent_name = *parent_names++)) > > + jw_array_string(&jw, parent_name); > > You're building the array with the immediate parent in a[0] > and the grandparent in a[1], and etc. This is the same as > I did in GFW. > > Perhaps state this in the docs somewhere. Sure, makes sense. I think I neglected any doc work whatsoever in this patch anyways, whoops :) > > + /* cmd_ancestry parent <- grandparent <- great-grandparent */ > > + strbuf_addstr(&buf_payload, "cmd_ancestry "); > > + while ((parent_name = *parent_names++)) { > > + strbuf_addstr(&buf_payload, parent_name); > > Did you want to quote each parent's name? I'd rather not - since they're going into an array anyway, I'd expect the array delimiters to be enough. Am I being naive? 'normal' looks to me like it's supposed to be mostly human readable anyways, rather than parseable? > > + strbuf_addstr(&buf_payload, "ancestry:["); > > + /* It's not an argv but the rules are basically the same. */ > > + sq_append_quote_argv_pretty(&buf_payload, parent_names); > > This will have whitespace delimiters between the quoted strings > rather than commas. Just checking if that's what you wanted. > > I'm not sure it matters, since this stream is intended for human > parsing. Yeah, it seems fine to me as is. > > We should update Documentation/technical/api-trace2.txt too. Yep, thanks. I appreciate the review, Jeff. - Emily
<emilyshaffer@google.com> On May 21, 2021 4:05 PM, Emily Shaffer wrote: >On Fri, May 21, 2021 at 03:15:16PM -0400, Jeff Hostetler wrote: >> On 5/20/21 5:05 PM, Emily Shaffer wrote: >> > - I took a look at Jeff H's advice on using a "data_json" event to log >> > this and decided it would be a little more flexible to add a new event >> > instead. If we want, it'd be feasible to then shoehorn the GfW parent >> > tree stuff into this new event too. Doing it this way is definitely >> > easier to parse for Google's trace analysis system (which for now >> > completely skips "data_json" as it's polymorphic), and also - I think >> > - means that we can add more fields later on if we need to (thread >> > info, different fields than just /proc/n/comm like exec path, argv, >> > whatever). >> >> I could argue both sides of this, so I guess it is fine either way. >> >> In GFW I log a array of argv[0] strings in a generic "data_json" event. >> I could also log additional "data_json" events with more structured >> data if needed. >> >> On the other hand, you're proposing a "cmd_ancestry" event with a >> single array of strings. You would have to expand the call signature >> of the trace2_cmd_ancestry() API to add additional data and inside >> tr2_tgt_event.c add additional fields to the JSON being composed. >> >> So both are about equal. >> >> (I'll avoid the temptation to make a snarky comment about fixing your >> post processing. :-) :-) :-) ) > >;P > >(I don't have much to add - this is an accurate summary of what I thought about, too. Thanks for writing it out.) > >> >> It really doesn't matter one way or the other. >> >> > - Jonathan N also pointed out to me that /proc/n/comm exists, and logs >> > the "command name" - excluding argv, excluding path, etc. It >> > seems >> >> So you're trying to log argv[0] of the process and not the full >> command line. That's what I'm doing. > >It's close to argv[0], yeah. POSIX docs indicate it might be truncated in a way that argv[0] hasn't been, but it also doesn't include the >leading path (as far as I've seen). For example, a long-running helper script I use with mutt, right now (gaffing on line length in email to >help with argv clarity, sorry): > > $ ps aux | grep mutt > emilysh+ 4119883 0.0 0.0 6892 3600 pts/6 S+ 12:44 0:00 /bin/bash /usr/local/google/home/emilyshaffer/dotfiles/open-vim-in- >new-split.sh /var/tmp/mutt-podkayne-413244-1263002-7433772284891386689 > # comm is truncated to 15ch, except apparently in the cases of some > # kernel worker processes I saw with much longer names? > $ cat /proc/4119883/comm > open-vim-in-new > # exe is a link to the executable, which means bash as this is a > # script > $ ls -lha /proc/4119883/exe > lrwxrwxrwx 1 emilyshaffer primarygroup 0 May 21 12:44 > /proc/4119883/exe -> /usr/bin/bash > # cmdline has the whole argv, separated on NUL so it runs together in > # editor > $ cat /proc/4119883/cmdline > /bin/bash/usr/local/google/home/emilyshaffer/dotfiles/open-vim-in-new-split.sh/var/tmp/mutt-podkayne-413244-1263002- >7433772284891386689 > >Jonathan N pointed out that the process name (the thing in 'comm') can also be manually manipulated by the process itself, and 'man >procfs' >also talks about 'PR_SET_NAME' and 'PR_GET_NAME' operations in 'prctl()', so that tracks. (It doesn't look like we can use prctl() to find >out the names of processes besides the current process, though, so the procfs stuff is still needed. Dang.) > >> >> > like this is a little more safe about excluding personal information >> > from the traces which take the form of "myscript.sh >> > --password=hunter2", but would still be worrisome for something like >> > "mysupersecretproject.sh". I'm not sure whether that means we still >> > want to guard it with a config flag, though. >> >> You might check whether you get the name of the script or just get a >> lot of entries with just "/usr/bin/bash". > >See above :) > >> There's lots of PII in the data stream to worry about. >> The name of the command is just one aspect, but I digress. > >Yes, that's what we've noticed too, so a process name isn't worrying us that much more. > >> >> > - I also added a lot to the commit message; hopefully it's not too >> > rambly, but I hoped to explain why just setting GIT_TRACE2_PARENT_SID >> > wasn't going to cut it. >> > - As for testing, I followed the lead of GfW's parentage info - "this >> > isn't portable so writing tests for it will suck, just scrub it from >> > the tests". Maybe it makes sense to do some more >> > platform-specific-ness in the test suite instead? I wasn't sure. >> >> yeah, that's probably best. Unless you can tokenize it properly so >> that you can predict the results in a HEREDOC in the test source. >> >> For example, you might try to test tracing a command (where a >> top-level "git foo" (SPACE form) spawns a "git-foo" (DASHED form) and >> check the output for the child. > >Yeah, I had trouble with even deciding when to attempt such a check or not. >> > + if (reason == TRACE2_PROCESS_INFO_STARTUP) >> > + { >> > + /* >> > + * NEEDSWORK: we could do the entire ptree in an array instead, >> > + * see compat/win32/trace2_win32_process_info.c. >> > + */ >> > + char *names[2]; >> > + names[0] = get_process_name(getppid()); >> > + names[1] = NULL; >> >> You're only logging 1 parent. That's fine to get started. >> >> I'm logging IIRC 10 parents on GFW. That might seem overkill, but >> there are lots of intermediate parents that hide what is happening. >> For example, a "git push" might spawn "git remote-https" >> which spawns "git-remote-https" which spawn "git send-pack" which >> spawns "git pack-objects". >> >> And that doesn't include who called push. >> >> And it's not uncommon to see 2 or 3 "bash" entries in the array >> because of the bash scripts being run. > >Agree. But it's expensive - I didn't find a handy library call to find "parent ID of given process ID", so I think we'd have to manipulate >procfs; and so far I only see parent ID in summary infos like /proc/n/status or /proc/n/stat, which contain lots of other info too and would >need parsing. > >We could reduce the cost a little bit by grabbing the process name from the status or stat as well, and therefore still only opening one file >per process, but I'd want to check whether the formats are expected to be stable for those things. > >> > +static void fn_command_ancestry_fl(const char *file, int line, >> > +const char **parent_names) { >> > + const char *event_name = "cmd_ancestry"; >> > + const char *parent_name = NULL; >> > + struct json_writer jw = JSON_WRITER_INIT; >> > + >> > + jw_object_begin(&jw, 0); >> > + event_fmt_prepare(event_name, file, line, NULL, &jw); >> > + jw_object_inline_begin_array(&jw, "ancestry"); >> > + >> > + while ((parent_name = *parent_names++)) >> > + jw_array_string(&jw, parent_name); >> >> You're building the array with the immediate parent in a[0] and the >> grandparent in a[1], and etc. This is the same as I did in GFW. >> >> Perhaps state this in the docs somewhere. > >Sure, makes sense. I think I neglected any doc work whatsoever in this patch anyways, whoops :) > >> > + /* cmd_ancestry parent <- grandparent <- great-grandparent */ >> > + strbuf_addstr(&buf_payload, "cmd_ancestry "); >> > + while ((parent_name = *parent_names++)) { >> > + strbuf_addstr(&buf_payload, parent_name); >> >> Did you want to quote each parent's name? > >I'd rather not - since they're going into an array anyway, I'd expect the array delimiters to be enough. Am I being naive? 'normal' looks to >me like it's supposed to be mostly human readable anyways, rather than parseable? > >> > + strbuf_addstr(&buf_payload, "ancestry:["); >> > + /* It's not an argv but the rules are basically the same. */ >> > + sq_append_quote_argv_pretty(&buf_payload, parent_names); >> >> This will have whitespace delimiters between the quoted strings rather >> than commas. Just checking if that's what you wanted. >> >> I'm not sure it matters, since this stream is intended for human >> parsing. > >Yeah, it seems fine to me as is. > >> >> We should update Documentation/technical/api-trace2.txt too. > >Yep, thanks. > >I appreciate the review, Jeff. I checked the performance of my NonStop parent lookup implementation. It's fast enough that no one would notice (8 microseconds on the oldest slowest machine I could find). Just an FYI. -Randall
Emily Shaffer <emilyshaffer@google.com> writes:
>> > we will need to discover the name another way. However, the process ID
>> > should be sufficient regardless of platform.
>>
>> Not a strong objection, but I wonder if seeing random integer(s) is
>> better than not having cmd_ancestry info at all. The latter better
>> signals that the platform does not yet have the "parent process
>> name" feature, I would think.
>
> Hm, we could...
Please don't. There is a misreading here.
You mentioned "However, the process ID should be sufficient" and I
read it as "In the worst case we can emit the process ID if we do
not know how to turn it into name", and to that I said "showing
process IDs is not all that useful as they are random integers
without extra info on processes that were running back when the log
entry was taken". Similarly, my later "OK, we do not show pid as a
placeholder." is "Contrary to what I thought you said earlier, you
do not give raw process IDs and instead honestly say we do not have
that information by omitting the record. I am happy to see what the
actual patch does".
Thanks.
On 5/21/21 4:05 PM, Emily Shaffer wrote: > On Fri, May 21, 2021 at 03:15:16PM -0400, Jeff Hostetler wrote: >> On 5/20/21 5:05 PM, Emily Shaffer wrote: >>> - I took a look at Jeff H's advice on using a "data_json" event to log >>> this and decided it would be a little more flexible to add a new event >>> instead. If we want, it'd be feasible to then shoehorn the GfW parent >>> tree stuff into this new event too. Doing it this way is definitely >>> easier to parse for Google's trace analysis system (which for now >>> completely skips "data_json" as it's polymorphic), and also - I think >>> - means that we can add more fields later on if we need to (thread >>> info, different fields than just /proc/n/comm like exec path, argv, >>> whatever). >> >> I could argue both sides of this, so I guess it is fine either way. >> >> In GFW I log a array of argv[0] strings in a generic "data_json" event. >> I could also log additional "data_json" events with more structured >> data if needed. >> >> On the other hand, you're proposing a "cmd_ancestry" event with a >> single array of strings. You would have to expand the call signature >> of the trace2_cmd_ancestry() API to add additional data and inside >> tr2_tgt_event.c add additional fields to the JSON being composed. >> >> So both are about equal. >> >> (I'll avoid the temptation to make a snarky comment about fixing >> your post processing. :-) :-) :-) ) > > ;P > > (I don't have much to add - this is an accurate summary of what I > thought about, too. Thanks for writing it out.) > >> >> It really doesn't matter one way or the other. >> >>> - Jonathan N also pointed out to me that /proc/n/comm exists, and logs >>> the "command name" - excluding argv, excluding path, etc. It seems >> >> So you're trying to log argv[0] of the process and not the full >> command line. That's what I'm doing. > > It's close to argv[0], yeah. POSIX docs indicate it might be truncated > in a way that argv[0] hasn't been, but it also doesn't include the > leading path (as far as I've seen). For example, a long-running helper > script I use with mutt, right now (gaffing on line length in email to > help with argv clarity, sorry): > > $ ps aux | grep mutt > emilysh+ 4119883 0.0 0.0 6892 3600 pts/6 S+ 12:44 0:00 /bin/bash /usr/local/google/home/emilyshaffer/dotfiles/open-vim-in-new-split.sh /var/tmp/mutt-podkayne-413244-1263002-7433772284891386689 > # comm is truncated to 15ch, except apparently in the cases of some > # kernel worker processes I saw with much longer names? > $ cat /proc/4119883/comm > open-vim-in-new > # exe is a link to the executable, which means bash as this is a > # script > $ ls -lha /proc/4119883/exe > lrwxrwxrwx 1 emilyshaffer primarygroup 0 May 21 12:44 > /proc/4119883/exe -> /usr/bin/bash > # cmdline has the whole argv, separated on NUL so it runs together in > # editor > $ cat /proc/4119883/cmdline > /bin/bash/usr/local/google/home/emilyshaffer/dotfiles/open-vim-in-new-split.sh/var/tmp/mutt-podkayne-413244-1263002-7433772284891386689 > > Jonathan N pointed out that the process name (the thing in 'comm') can > also be manually manipulated by the process itself, and 'man procfs' > also talks about 'PR_SET_NAME' and 'PR_GET_NAME' operations in > 'prctl()', so that tracks. (It doesn't look like we can use prctl() to > find out the names of processes besides the current process, though, so > the procfs stuff is still needed. Dang.) > >> >>> like this is a little more safe about excluding personal information >>> from the traces which take the form of "myscript.sh >>> --password=hunter2", but would still be worrisome for something like >>> "mysupersecretproject.sh". I'm not sure whether that means we still >>> want to guard it with a config flag, though. >> >> You might check whether you get the name of the script or just get >> a lot of entries with just "/usr/bin/bash". > > See above :) > >> There's lots of PII in the data stream to worry about. >> The name of the command is just one aspect, but I digress. > > Yes, that's what we've noticed too, so a process name isn't worrying us > that much more. > >> >>> - I also added a lot to the commit message; hopefully it's not too >>> rambly, but I hoped to explain why just setting GIT_TRACE2_PARENT_SID >>> wasn't going to cut it. >>> - As for testing, I followed the lead of GfW's parentage info - "this >>> isn't portable so writing tests for it will suck, just scrub it from >>> the tests". Maybe it makes sense to do some more >>> platform-specific-ness in the test suite instead? I wasn't sure. >> >> yeah, that's probably best. Unless you can tokenize it properly >> so that you can predict the results in a HEREDOC in the test source. >> >> For example, you might try to test tracing a command (where a top-level >> "git foo" (SPACE form) spawns a "git-foo" (DASHED form) and check the >> output for the child. > > Yeah, I had trouble with even deciding when to attempt such a check or > not. >>> + if (reason == TRACE2_PROCESS_INFO_STARTUP) >>> + { >>> + /* >>> + * NEEDSWORK: we could do the entire ptree in an array instead, >>> + * see compat/win32/trace2_win32_process_info.c. >>> + */ >>> + char *names[2]; >>> + names[0] = get_process_name(getppid()); >>> + names[1] = NULL; >> >> You're only logging 1 parent. That's fine to get started. >> >> I'm logging IIRC 10 parents on GFW. That might seem overkill, >> but there are lots of intermediate parents that hide what is >> happening. For example, a "git push" might spawn "git remote-https" >> which spawns "git-remote-https" which spawn "git send-pack" which >> spawns "git pack-objects". >> >> And that doesn't include who called push. >> >> And it's not uncommon to see 2 or 3 "bash" entries in the array >> because of the bash scripts being run. > > Agree. But it's expensive - I didn't find a handy library call to find > "parent ID of given process ID", so I think we'd have to manipulate > procfs; and so far I only see parent ID in summary infos like > /proc/n/status or /proc/n/stat, which contain lots of other info too and > would need parsing. > > We could reduce the cost a little bit by grabbing the process name from > the status or stat as well, and therefore still only opening one file per > process, but I'd want to check whether the formats are expected to be > stable for those things. Yeah, don't force it collect a bunch of data for n parents that you don't need. I guess my point was that there were lots of time when there were too many "unhelpful" parents in the ancestry chain. So maybe confirm that you can get what you need from just 1 parent. I needed more than 1 to tell that a particular command was interactive vs launched by VS or VSCode or etc. > >>> +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) >>> +{ >>> + const char *event_name = "cmd_ancestry"; >>> + const char *parent_name = NULL; >>> + struct json_writer jw = JSON_WRITER_INIT; >>> + >>> + jw_object_begin(&jw, 0); >>> + event_fmt_prepare(event_name, file, line, NULL, &jw); >>> + jw_object_inline_begin_array(&jw, "ancestry"); >>> + >>> + while ((parent_name = *parent_names++)) >>> + jw_array_string(&jw, parent_name); >> >> You're building the array with the immediate parent in a[0] >> and the grandparent in a[1], and etc. This is the same as >> I did in GFW. >> >> Perhaps state this in the docs somewhere. > > Sure, makes sense. I think I neglected any doc work whatsoever in this > patch anyways, whoops :) > >>> + /* cmd_ancestry parent <- grandparent <- great-grandparent */ >>> + strbuf_addstr(&buf_payload, "cmd_ancestry "); >>> + while ((parent_name = *parent_names++)) { >>> + strbuf_addstr(&buf_payload, parent_name); >> >> Did you want to quote each parent's name? > > I'd rather not - since they're going into an array anyway, I'd expect > the array delimiters to be enough. Am I being naive? 'normal' looks to > me like it's supposed to be mostly human readable anyways, rather than > parseable? > >>> + strbuf_addstr(&buf_payload, "ancestry:["); >>> + /* It's not an argv but the rules are basically the same. */ >>> + sq_append_quote_argv_pretty(&buf_payload, parent_names); >> >> This will have whitespace delimiters between the quoted strings >> rather than commas. Just checking if that's what you wanted. >> >> I'm not sure it matters, since this stream is intended for human >> parsing. > > Yeah, it seems fine to me as is. > >> >> We should update Documentation/technical/api-trace2.txt too. > > Yep, thanks. > > I appreciate the review, Jeff. > > - Emily >
On Sat, May 22, 2021 at 08:22:46AM +0900, Junio C Hamano wrote:
>
> Emily Shaffer <emilyshaffer@google.com> writes:
>
> >> > we will need to discover the name another way. However, the process ID
> >> > should be sufficient regardless of platform.
> >>
> >> Not a strong objection, but I wonder if seeing random integer(s) is
> >> better than not having cmd_ancestry info at all. The latter better
> >> signals that the platform does not yet have the "parent process
> >> name" feature, I would think.
> >
> > Hm, we could...
>
> Please don't. There is a misreading here.
>
> You mentioned "However, the process ID should be sufficient" and I
> read it as "In the worst case we can emit the process ID if we do
> not know how to turn it into name", and to that I said "showing
> process IDs is not all that useful as they are random integers
> without extra info on processes that were running back when the log
> entry was taken". Similarly, my later "OK, we do not show pid as a
> placeholder." is "Contrary to what I thought you said earlier, you
> do not give raw process IDs and instead honestly say we do not have
> that information by omitting the record. I am happy to see what the
> actual patch does".
Ah, thanks for clarifying. I'll see if I can make the "PID should be
enough" statement less confusing, instead - what I meant was "on all
systems, the result of getppid() should be sufficient to look up the
process name, so this code is probably shareable", and Randall has
pointed out elsewhere to me that that's false.
- Emily
It can be useful to tell who invoked Git - was it invoked manually by a user via CLI or script? By an IDE? In some cases - like 'repo' tool - we can influence the source code and set the GIT_TRACE2_PARENT_SID environment variable from the caller process. In 'repo''s case, that parent SID is manipulated to include the string "repo", which means we can positively identify when Git was invoked by 'repo' tool. However, identifying parents that way requires both that we know which tools invoke Git and that we have the ability to modify the source code of those tools. It cannot scale to keep up with the various IDEs and wrappers which use Git, most of which we don't know about. Learning which tools and wrappers invoke Git, and how, would give us insight to decide where to improve Git's usability and performance. Unfortunately, there's no cross-platform reliable way to gather the name of the parent process. If procfs is present, we can use that; otherwise we will need to discover the name another way. However, the process ID should be sufficient to look up the process name on most platforms, so that code may be shareable. Git for Windows gathers similar information and logs it as a "data_json" event. However, since "data_json" has a variable format, it is difficult to parse effectively in some languages; instead, let's pursue a dedicated "cmd_ancestry" event to record information about the ancestry of the current process and a consistent, parseable way. Git for Windows also gathers information about more than one parent. In Linux further ancestry info can be gathered with procfs, but it's unwieldy to do so. In the interest of later moving Git for Windows ancestry logging to the 'cmd_ancestry' event, and in the interest of later adding more ancestry to the Linux implementation - or of adding this functionality to other platforms which have an easier time walking the process tree - let's make 'cmd_ancestry' accept an array of parentage. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> --- Since v2, only some nits fixed - a few brace placements and some update to the commit message language. Thanks Jeff, Junio, and Randall for looking through the patch more thoroughly. This iteration also has passing CI: https://github.com/nasamuffin/git/actions/runs/872341549 I personally think it's ready to go; Jeff H mentioned adding more parents but later said there's value in confirming we can get the info we want from just one parent, first, which matches my intent too. I see room for additional work, but it can happen later - adding more parents to the Linux impl if we want, and logging the Windows impl to the same cmd_ancestry event if we want. - Emily Makefile | 5 ++++ compat/procinfo.c | 51 +++++++++++++++++++++++++++++++++++++++ config.mak.uname | 1 + git-compat-util.h | 6 +++++ t/t0210/scrub_normal.perl | 6 +++++ t/t0211/scrub_perf.perl | 5 ++++ t/t0212/parse_events.perl | 5 +++- trace2.c | 13 ++++++++++ trace2.h | 12 ++++++++- trace2/tr2_tgt.h | 3 +++ trace2/tr2_tgt_event.c | 21 ++++++++++++++++ trace2/tr2_tgt_normal.c | 19 +++++++++++++++ trace2/tr2_tgt_perf.c | 16 ++++++++++++ 13 files changed, 161 insertions(+), 2 deletions(-) create mode 100644 compat/procinfo.c diff --git a/Makefile b/Makefile index 93664d6714..330e4fa011 100644 --- a/Makefile +++ b/Makefile @@ -1889,6 +1889,11 @@ ifneq ($(PROCFS_EXECUTABLE_PATH),) BASIC_CFLAGS += '-DPROCFS_EXECUTABLE_PATH="$(procfs_executable_path_SQ)"' endif +ifdef HAVE_PROCFS_LINUX + BASIC_CFLAGS += -DHAVE_PROCFS_LINUX + COMPAT_OBJS += compat/procinfo.o +endif + ifdef HAVE_NS_GET_EXECUTABLE_PATH BASIC_CFLAGS += -DHAVE_NS_GET_EXECUTABLE_PATH endif diff --git a/compat/procinfo.c b/compat/procinfo.c new file mode 100644 index 0000000000..0e92fb8b7c --- /dev/null +++ b/compat/procinfo.c @@ -0,0 +1,51 @@ +#include "cache.h" + +#include "strbuf.h" +#include "trace2.h" + +char *get_process_name(int pid) +{ +#ifdef HAVE_PROCFS_LINUX + struct strbuf procfs_path = STRBUF_INIT; + struct strbuf out = STRBUF_INIT; + /* try to use procfs if it's present. */ + strbuf_addf(&procfs_path, "/proc/%d/comm", pid); + if (!strbuf_read_file(&out, procfs_path.buf, 0)) { + /* All done with file reads, clean up early */ + strbuf_release(&procfs_path); + return strbuf_detach(&out, NULL); + } +#endif + + /* NEEDSWORK: add non-procfs implementations here. */ + return NULL; +} + +void trace2_collect_process_info(enum trace2_process_info_reason reason) +{ + if (!trace2_is_enabled()) + return; + + /* someday we may want to write something extra here, but not today */ + if (reason == TRACE2_PROCESS_INFO_EXIT) + return; + + if (reason == TRACE2_PROCESS_INFO_STARTUP) { + /* + * NEEDSWORK: we could do the entire ptree in an array instead, + * see compat/win32/trace2_win32_process_info.c. + */ + char *names[2]; + names[0] = get_process_name(getppid()); + names[1] = NULL; + + if (!names[0]) + return; + + trace2_cmd_ancestry((const char**)names); + + free(names[0]); + } + + return; +} diff --git a/config.mak.uname b/config.mak.uname index cb443b4e02..7ad110a1d2 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -58,6 +58,7 @@ ifeq ($(uname_S),Linux) FREAD_READS_DIRECTORIES = UnfortunatelyYes BASIC_CFLAGS += -DHAVE_SYSINFO PROCFS_EXECUTABLE_PATH = /proc/self/exe + HAVE_PROCFS_LINUX = YesPlease endif ifeq ($(uname_S),GNU/kFreeBSD) HAVE_ALLOCA_H = YesPlease diff --git a/git-compat-util.h b/git-compat-util.h index a508dbe5a3..cc7d5d8a2a 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1382,4 +1382,10 @@ static inline void *container_of_or_null_offset(void *ptr, size_t offset) void sleep_millisec(int millisec); +/* + * Convert PID to process name (as would show in top/task manager). Returns + * NULL if unimplemented - be sure to check for NULL at callsite. + */ +char *get_process_name(int pid); + #endif diff --git a/t/t0210/scrub_normal.perl b/t/t0210/scrub_normal.perl index c65d1a815e..7cc4de392a 100644 --- a/t/t0210/scrub_normal.perl +++ b/t/t0210/scrub_normal.perl @@ -42,6 +42,12 @@ # so just omit it for testing purposes. # print "cmd_path _EXE_\n"; } + elsif ($line =~ m/^cmd_ancestry/) { + # 'cmd_ancestry' is not implemented everywhere, so for portability's + # sake, skip it when parsing normal. + # + # print "$line"; + } else { print "$line"; } diff --git a/t/t0211/scrub_perf.perl b/t/t0211/scrub_perf.perl index 351af7844e..d164b750ff 100644 --- a/t/t0211/scrub_perf.perl +++ b/t/t0211/scrub_perf.perl @@ -44,6 +44,11 @@ # $tokens[$col_rest] = "_EXE_"; goto SKIP_LINE; } + elsif ($tokens[$col_event] =~ m/cmd_ancestry/) { + # 'cmd_ancestry' is platform-specific and not implemented everywhere, + # so skip it. + goto SKIP_LINE; + } elsif ($tokens[$col_event] =~ m/child_exit/) { $tokens[$col_rest] =~ s/ pid:\d* / pid:_PID_ /; } diff --git a/t/t0212/parse_events.perl b/t/t0212/parse_events.perl index 6584bb5634..b6408560c0 100644 --- a/t/t0212/parse_events.perl +++ b/t/t0212/parse_events.perl @@ -132,7 +132,10 @@ # just omit it for testing purposes. # $processes->{$sid}->{'path'} = "_EXE_"; } - + elsif ($event eq 'cmd_ancestry') { + # 'cmd_ancestry' is platform-specific and not implemented everywhere, so + # just skip it for testing purposes. + } elsif ($event eq 'cmd_name') { $processes->{$sid}->{'name'} = $line->{'name'}; $processes->{$sid}->{'hierarchy'} = $line->{'hierarchy'}; diff --git a/trace2.c b/trace2.c index 256120c7fd..b9b154ac44 100644 --- a/trace2.c +++ b/trace2.c @@ -260,6 +260,19 @@ void trace2_cmd_path_fl(const char *file, int line, const char *pathname) tgt_j->pfn_command_path_fl(file, line, pathname); } +void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names) +{ + struct tr2_tgt *tgt_j; + int j; + + if (!trace2_enabled) + return; + + for_each_wanted_builtin (j, tgt_j) + if (tgt_j->pfn_command_ancestry_fl) + tgt_j->pfn_command_ancestry_fl(file, line, parent_names); +} + void trace2_cmd_name_fl(const char *file, int line, const char *name) { struct tr2_tgt *tgt_j; diff --git a/trace2.h b/trace2.h index ede18c2e06..23743ac62b 100644 --- a/trace2.h +++ b/trace2.h @@ -133,6 +133,16 @@ void trace2_cmd_path_fl(const char *file, int line, const char *pathname); #define trace2_cmd_path(p) trace2_cmd_path_fl(__FILE__, __LINE__, (p)) +/* + * Emit an 'ancestry' event with the process name of the current process's + * parent process. + * This gives post-processors a way to determine what invoked the command and + * learn more about usage patterns. + */ +void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names); + +#define trace2_cmd_ancestry(v) trace2_cmd_ancestry_fl(__FILE__, __LINE__, (v)) + /* * Emit a 'cmd_name' event with the canonical name of the command. * This gives post-processors a simple field to identify the command @@ -492,7 +502,7 @@ enum trace2_process_info_reason { TRACE2_PROCESS_INFO_EXIT, }; -#if defined(GIT_WINDOWS_NATIVE) +#if ( defined(GIT_WINDOWS_NATIVE) || defined(HAVE_PROCFS_LINUX) ) void trace2_collect_process_info(enum trace2_process_info_reason reason); #else #define trace2_collect_process_info(reason) \ diff --git a/trace2/tr2_tgt.h b/trace2/tr2_tgt.h index 7b90469212..1f66fd6573 100644 --- a/trace2/tr2_tgt.h +++ b/trace2/tr2_tgt.h @@ -27,6 +27,8 @@ typedef void(tr2_tgt_evt_error_va_fl_t)(const char *file, int line, typedef void(tr2_tgt_evt_command_path_fl_t)(const char *file, int line, const char *command_path); +typedef void(tr2_tgt_evt_command_ancestry_fl_t)(const char *file, int line, + const char **parent_names); typedef void(tr2_tgt_evt_command_name_fl_t)(const char *file, int line, const char *name, const char *hierarchy); @@ -108,6 +110,7 @@ struct tr2_tgt { tr2_tgt_evt_atexit_t *pfn_atexit; tr2_tgt_evt_error_va_fl_t *pfn_error_va_fl; tr2_tgt_evt_command_path_fl_t *pfn_command_path_fl; + tr2_tgt_evt_command_ancestry_fl_t *pfn_command_ancestry_fl; tr2_tgt_evt_command_name_fl_t *pfn_command_name_fl; tr2_tgt_evt_command_mode_fl_t *pfn_command_mode_fl; tr2_tgt_evt_alias_fl_t *pfn_alias_fl; diff --git a/trace2/tr2_tgt_event.c b/trace2/tr2_tgt_event.c index 6353e8ad91..578a9a5287 100644 --- a/trace2/tr2_tgt_event.c +++ b/trace2/tr2_tgt_event.c @@ -261,6 +261,26 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) jw_release(&jw); } +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) +{ + const char *event_name = "cmd_ancestry"; + const char *parent_name = NULL; + struct json_writer jw = JSON_WRITER_INIT; + + jw_object_begin(&jw, 0); + event_fmt_prepare(event_name, file, line, NULL, &jw); + jw_object_inline_begin_array(&jw, "ancestry"); + + while ((parent_name = *parent_names++)) + jw_array_string(&jw, parent_name); + + jw_end(&jw); /* 'ancestry' array */ + jw_end(&jw); /* event object */ + + tr2_dst_write_line(&tr2dst_event, &jw.json); + jw_release(&jw); +} + static void fn_command_name_fl(const char *file, int line, const char *name, const char *hierarchy) { @@ -584,6 +604,7 @@ struct tr2_tgt tr2_tgt_event = { fn_atexit, fn_error_va_fl, fn_command_path_fl, + fn_command_ancestry_fl, fn_command_name_fl, fn_command_mode_fl, fn_alias_fl, diff --git a/trace2/tr2_tgt_normal.c b/trace2/tr2_tgt_normal.c index 31b602c171..a5751c8864 100644 --- a/trace2/tr2_tgt_normal.c +++ b/trace2/tr2_tgt_normal.c @@ -160,6 +160,24 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) strbuf_release(&buf_payload); } +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) +{ + const char *parent_name = NULL; + struct strbuf buf_payload = STRBUF_INIT; + + /* cmd_ancestry parent <- grandparent <- great-grandparent */ + strbuf_addstr(&buf_payload, "cmd_ancestry "); + while ((parent_name = *parent_names++)) { + strbuf_addstr(&buf_payload, parent_name); + /* if we'll write another one after this, add a delimiter */ + if (parent_names && *parent_names) + strbuf_addstr(&buf_payload, " <- "); + } + + normal_io_write_fl(file, line, &buf_payload); + strbuf_release(&buf_payload); +} + static void fn_command_name_fl(const char *file, int line, const char *name, const char *hierarchy) { @@ -306,6 +324,7 @@ struct tr2_tgt tr2_tgt_normal = { fn_atexit, fn_error_va_fl, fn_command_path_fl, + fn_command_ancestry_fl, fn_command_name_fl, fn_command_mode_fl, fn_alias_fl, diff --git a/trace2/tr2_tgt_perf.c b/trace2/tr2_tgt_perf.c index a8018f18cc..af4d65a0a5 100644 --- a/trace2/tr2_tgt_perf.c +++ b/trace2/tr2_tgt_perf.c @@ -253,6 +253,21 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) strbuf_release(&buf_payload); } +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) +{ + const char *event_name = "cmd_ancestry"; + struct strbuf buf_payload = STRBUF_INIT; + + strbuf_addstr(&buf_payload, "ancestry:["); + /* It's not an argv but the rules are basically the same. */ + sq_append_quote_argv_pretty(&buf_payload, parent_names); + strbuf_addch(&buf_payload, ']'); + + perf_io_write_fl(file, line, event_name, NULL, NULL, NULL, NULL, + &buf_payload); + strbuf_release(&buf_payload); +} + static void fn_command_name_fl(const char *file, int line, const char *name, const char *hierarchy) { @@ -532,6 +547,7 @@ struct tr2_tgt tr2_tgt_perf = { fn_atexit, fn_error_va_fl, fn_command_path_fl, + fn_command_ancestry_fl, fn_command_name_fl, fn_command_mode_fl, fn_alias_fl, -- 2.31.1.818.g46aad6cb9e-goog
On Mon, May 24, 2021 at 01:10:07PM -0700, Emily Shaffer wrote:
>
> Git for Windows also gathers information about more than one parent. In
Oh, bah, I guess I missed changing this to "more than one generation of
parent". It doesn't seem worth a reroll, but I'll make the change
locally now so I don't miss it with any v3->v4 nits that may come in.
- Emily
On Fri, May 21 2021, Emily Shaffer wrote:
> On Fri, May 21, 2021 at 03:15:16PM -0400, Jeff Hostetler wrote:
>> On 5/20/21 5:05 PM, Emily Shaffer wrote:
>> > - I took a look at Jeff H's advice on using a "data_json" event to log
>> > this and decided it would be a little more flexible to add a new event
>> > instead. If we want, it'd be feasible to then shoehorn the GfW parent
>> > tree stuff into this new event too. Doing it this way is definitely
>> > easier to parse for Google's trace analysis system (which for now
>> > completely skips "data_json" as it's polymorphic), and also - I think
>> > - means that we can add more fields later on if we need to (thread
>> > info, different fields than just /proc/n/comm like exec path, argv,
>> > whatever).
>>
>> I could argue both sides of this, so I guess it is fine either way.
>>
>> In GFW I log a array of argv[0] strings in a generic "data_json" event.
>> I could also log additional "data_json" events with more structured
>> data if needed.
>>
>> On the other hand, you're proposing a "cmd_ancestry" event with a
>> single array of strings. You would have to expand the call signature
>> of the trace2_cmd_ancestry() API to add additional data and inside
>> tr2_tgt_event.c add additional fields to the JSON being composed.
>>
>> So both are about equal.
>>
>> (I'll avoid the temptation to make a snarky comment about fixing
>> your post processing. :-) :-) :-) )
>
> ;P
>
> (I don't have much to add - this is an accurate summary of what I
> thought about, too. Thanks for writing it out.)
>
>>
>> It really doesn't matter one way or the other.
>>
>> > - Jonathan N also pointed out to me that /proc/n/comm exists, and logs
>> > the "command name" - excluding argv, excluding path, etc. It seems
>>
>> So you're trying to log argv[0] of the process and not the full
>> command line. That's what I'm doing.
>
> It's close to argv[0], yeah. POSIX docs indicate it might be truncated
> in a way that argv[0] hasn't been, but it also doesn't include the
> leading path (as far as I've seen). For example, a long-running helper
> script I use with mutt, right now (gaffing on line length in email to
> help with argv clarity, sorry):
>
> $ ps aux | grep mutt
> emilysh+ 4119883 0.0 0.0 6892 3600 pts/6 S+ 12:44 0:00 /bin/bash
> /usr/local/google/home/emilyshaffer/dotfiles/open-vim-in-new-split.sh
> /var/tmp/mutt-podkayne-413244-1263002-7433772284891386689
> # comm is truncated to 15ch, except apparently in the cases of some
> # kernel worker processes I saw with much longer names?
> $ cat /proc/4119883/comm
> open-vim-in-new
> # exe is a link to the executable, which means bash as this is a
> # script
> $ ls -lha /proc/4119883/exe
> lrwxrwxrwx 1 emilyshaffer primarygroup 0 May 21 12:44
> /proc/4119883/exe -> /usr/bin/bash
> # cmdline has the whole argv, separated on NUL so it runs together in
> # editor
> $ cat /proc/4119883/cmdline
> /bin/bash/usr/local/google/home/emilyshaffer/dotfiles/open-vim-in-new-split.sh/var/tmp/mutt-podkayne-413244-1263002-7433772284891386689
>
> Jonathan N pointed out that the process name (the thing in 'comm') can
> also be manually manipulated by the process itself, and 'man procfs'
> also talks about 'PR_SET_NAME' and 'PR_GET_NAME' operations in
> 'prctl()', so that tracks. (It doesn't look like we can use prctl() to
> find out the names of processes besides the current process, though, so
> the procfs stuff is still needed. Dang.)
>
>>
>> > like this is a little more safe about excluding personal information
>> > from the traces which take the form of "myscript.sh
>> > --password=hunter2", but would still be worrisome for something like
>> > "mysupersecretproject.sh". I'm not sure whether that means we still
>> > want to guard it with a config flag, though.
>>
>> You might check whether you get the name of the script or just get
>> a lot of entries with just "/usr/bin/bash".
>
> See above :)
>
>> There's lots of PII in the data stream to worry about.
>> The name of the command is just one aspect, but I digress.
>
> Yes, that's what we've noticed too, so a process name isn't worrying us
> that much more.
>
>>
>> > - I also added a lot to the commit message; hopefully it's not too
>> > rambly, but I hoped to explain why just setting GIT_TRACE2_PARENT_SID
>> > wasn't going to cut it.
>> > - As for testing, I followed the lead of GfW's parentage info - "this
>> > isn't portable so writing tests for it will suck, just scrub it from
>> > the tests". Maybe it makes sense to do some more
>> > platform-specific-ness in the test suite instead? I wasn't sure.
>>
>> yeah, that's probably best. Unless you can tokenize it properly
>> so that you can predict the results in a HEREDOC in the test source.
>>
>> For example, you might try to test tracing a command (where a top-level
>> "git foo" (SPACE form) spawns a "git-foo" (DASHED form) and check the
>> output for the child.
>
> Yeah, I had trouble with even deciding when to attempt such a check or
> not.
>> > + if (reason == TRACE2_PROCESS_INFO_STARTUP)
>> > + {
>> > + /*
>> > + * NEEDSWORK: we could do the entire ptree in an array instead,
>> > + * see compat/win32/trace2_win32_process_info.c.
>> > + */
>> > + char *names[2];
>> > + names[0] = get_process_name(getppid());
>> > + names[1] = NULL;
>>
>> You're only logging 1 parent. That's fine to get started.
>>
>> I'm logging IIRC 10 parents on GFW. That might seem overkill,
>> but there are lots of intermediate parents that hide what is
>> happening. For example, a "git push" might spawn "git remote-https"
>> which spawns "git-remote-https" which spawn "git send-pack" which
>> spawns "git pack-objects".
>>
>> And that doesn't include who called push.
>>
>> And it's not uncommon to see 2 or 3 "bash" entries in the array
>> because of the bash scripts being run.
>
> Agree. But it's expensive - I didn't find a handy library call to find
> "parent ID of given process ID", so I think we'd have to manipulate
> procfs; and so far I only see parent ID in summary infos like
> /proc/n/status or /proc/n/stat, which contain lots of other info too and
> would need parsing.
It sounds a bit like you're fumbling your way towards (re?)discovering:
pstree -s <pid>
You can look at its implementation (or strace it) to see what it does,
and yes, on Linux there's no handy C library for this, iterating over
procfs is the library.
Aside from the privacy, PII, usefulness of this data etc. discussions in
this & related threads I don't think that per-se should be an issue on a
modern Linux system. After all we'd just need to do it once on
startup. For any sub-process we spawn we'd carry it forward after the
initial /usr/bin/git invocation.
Emily Shaffer <emilyshaffer@google.com> writes: > diff --git a/compat/procinfo.c b/compat/procinfo.c > new file mode 100644 > index 0000000000..0e92fb8b7c > --- /dev/null > +++ b/compat/procinfo.c > @@ -0,0 +1,51 @@ > +#include "cache.h" > + > +#include "strbuf.h" > +#include "trace2.h" > + > +char *get_process_name(int pid) > +{ > +#ifdef HAVE_PROCFS_LINUX > + struct strbuf procfs_path = STRBUF_INIT; > + struct strbuf out = STRBUF_INIT; > + /* try to use procfs if it's present. */ > + strbuf_addf(&procfs_path, "/proc/%d/comm", pid); > + if (!strbuf_read_file(&out, procfs_path.buf, 0)) { > + /* All done with file reads, clean up early */ > + strbuf_release(&procfs_path); > + return strbuf_detach(&out, NULL); > + } > +#endif > + > + /* NEEDSWORK: add non-procfs implementations here. */ > + return NULL; > +} Is the reason why this takes "int" and not "pid_t" because we may port to non-POSIX platforms that do not have pid_t defined? ... goes and greps ... Nah, we use pid_t everywhere (including compat/mingw.c); unless there is a reason not to, let's use that type. > +void trace2_collect_process_info(enum trace2_process_info_reason reason) > +{ > + if (!trace2_is_enabled()) > + return; > + > + /* someday we may want to write something extra here, but not today */ > + if (reason == TRACE2_PROCESS_INFO_EXIT) > + return; > + > + if (reason == TRACE2_PROCESS_INFO_STARTUP) { > + /* > + * NEEDSWORK: we could do the entire ptree in an array instead, > + * see compat/win32/trace2_win32_process_info.c. > + */ > + char *names[2]; > + names[0] = get_process_name(getppid()); > + names[1] = NULL; Makes me wonder if get_process_name() is an appropriate abstraction; specifically, something like const char **names = get_ancestry_names(); int cnt; if (names) trace2_cmd_ancestry(names); for (cnt = 0; names[cnt]; cnt++) free((char *)names[cnt]); free(names); would allow platforms to decide how many levels is easy for them to grab for reporting, for example (and they do not even have to have to assume that getting process IDs to feed get_process_name() one by one is the easiest way to show ancestry).
On May 24, 2021 11:54 PM, Junio C Hamano wrote:
>Emily Shaffer <emilyshaffer@google.com> writes:
>
>> diff --git a/compat/procinfo.c b/compat/procinfo.c new file mode
>> 100644 index 0000000000..0e92fb8b7c
>> --- /dev/null
>> +++ b/compat/procinfo.c
>> @@ -0,0 +1,51 @@
>> +#include "cache.h"
>> +
>> +#include "strbuf.h"
>> +#include "trace2.h"
>> +
>> +char *get_process_name(int pid)
>> +{
>> +#ifdef HAVE_PROCFS_LINUX
>> + struct strbuf procfs_path = STRBUF_INIT;
>> + struct strbuf out = STRBUF_INIT;
>> + /* try to use procfs if it's present. */
>> + strbuf_addf(&procfs_path, "/proc/%d/comm", pid);
>> + if (!strbuf_read_file(&out, procfs_path.buf, 0)) {
>> + /* All done with file reads, clean up early */
>> + strbuf_release(&procfs_path);
>> + return strbuf_detach(&out, NULL);
>> + }
>> +#endif
>> +
>> + /* NEEDSWORK: add non-procfs implementations here. */
>> + return NULL;
>> +}
>
>Is the reason why this takes "int" and not "pid_t" because we may port to non-POSIX platforms that do not have pid_t defined?
>
> ... goes and greps ...
>
>Nah, we use pid_t everywhere (including compat/mingw.c); unless there is a reason not to, let's use that type.
>
>> +void trace2_collect_process_info(enum trace2_process_info_reason
>> +reason) {
>> + if (!trace2_is_enabled())
>> + return;
>> +
>> + /* someday we may want to write something extra here, but not today */
>> + if (reason == TRACE2_PROCESS_INFO_EXIT)
>> + return;
>> +
>> + if (reason == TRACE2_PROCESS_INFO_STARTUP) {
>> + /*
>> + * NEEDSWORK: we could do the entire ptree in an array instead,
>> + * see compat/win32/trace2_win32_process_info.c.
>> + */
>> + char *names[2];
>> + names[0] = get_process_name(getppid());
>> + names[1] = NULL;
>
>Makes me wonder if get_process_name() is an appropriate abstraction; specifically, something like
>
> const char **names = get_ancestry_names();
> int cnt;
> if (names)
> trace2_cmd_ancestry(names);
> for (cnt = 0; names[cnt]; cnt++)
> free((char *)names[cnt]);
> free(names);
>
>would allow platforms to decide how many levels is easy for them to grab for reporting, for example (and they do not even have to have
>to assume that getting process IDs to feed get_process_name() one by one is the easiest way to show ancestry).
Passing pid_t == 1 to get_process_names is non-informative. We can't tell what is really intended inside get_process_names(). Knowing that we are getting the ancestor, however, gives us an the important glue that the parent is wanted so we can interpret 1 as a non-ancestor. NonStop has pid_t defined regardless of whether it is used or relevant, so a higher-level of abstraction makes sense. The git process always has a valid pid_t, but the ancestor may not, but we do not know this at compile time. The platform code previously shared appears to be the correct technique for us. Going with get_ancestry_names() is interesting, but I think a count (how many ancestors do you want) is important (1 - just immediate, 2 - parent, grandparent, -1 - everyone). The ancestry tree can be very large (although each lookup is only about 8us). During a trace, I really would not want to care about more than 2 ancestors, typically, not the likely 8 or 10 that I can see happening in my situation (too much
noise). The number of ancestors to trace probably needs to be in .git.config.
It can be useful to tell who invoked Git - was it invoked manually by a user via CLI or script? By an IDE? In some cases - like 'repo' tool - we can influence the source code and set the GIT_TRACE2_PARENT_SID environment variable from the caller process. In 'repo''s case, that parent SID is manipulated to include the string "repo", which means we can positively identify when Git was invoked by 'repo' tool. However, identifying parents that way requires both that we know which tools invoke Git and that we have the ability to modify the source code of those tools. It cannot scale to keep up with the various IDEs and wrappers which use Git, most of which we don't know about. Learning which tools and wrappers invoke Git, and how, would give us insight to decide where to improve Git's usability and performance. Unfortunately, there's no cross-platform reliable way to gather the name of the parent process. If procfs is present, we can use that; otherwise we will need to discover the name another way. However, the process ID should be sufficient to look up the process name on most platforms, so that code may be shareable. Git for Windows gathers similar information and logs it as a "data_json" event. However, since "data_json" has a variable format, it is difficult to parse effectively in some languages; instead, let's pursue a dedicated "cmd_ancestry" event to record information about the ancestry of the current process and a consistent, parseable way. Git for Windows also gathers information about more than one generation of parent. In Linux further ancestry info can be gathered with procfs, but it's unwieldy to do so. In the interest of later moving Git for Windows ancestry logging to the 'cmd_ancestry' event, and in the interest of later adding more ancestry to the Linux implementation - or of adding this functionality to other platforms which have an easier time walking the process tree - let's make 'cmd_ancestry' accept an array of parentage. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> --- Notes: Since v3: Junio and Randall suggested ditching "get_process_name(ppid)" as the API to implement on various platforms, and I liked the suggestion a lot. So instead, I added "get_ancestry_names(strvec *names)". - Using a strvec instead of a char** makes cleanup easier, since strvec takes care of counting the number of strings in the array for us. Otherwise we'd need to include size as a return or out-param in get_ancestry_names(), and that's what the util class is for, right? :) - I made get_ancestry_names() static instead of putting it into git-compat-util.h. I think I had put get_process_name() into that header to facilitate non-procfs implementations, but compat/procinfo.c doesn't seem to me to indicate "procfs only", and if we do need to implement get_process_name() somewhere else, it'll be pretty easy to move it. - I added a description of "cmd_ancestry" to Documentation/technical/api-trace2.txt. I didn't see any user-facing docs to update (for example, "git grep cmd_path" produces only that one doc file). Thanks, all. - Emily Documentation/technical/api-trace2.txt | 14 ++++++ Makefile | 5 +++ compat/procinfo.c | 61 ++++++++++++++++++++++++++ config.mak.uname | 1 + t/t0210/scrub_normal.perl | 6 +++ t/t0211/scrub_perf.perl | 5 +++ t/t0212/parse_events.perl | 5 ++- trace2.c | 13 ++++++ trace2.h | 12 ++++- trace2/tr2_tgt.h | 3 ++ trace2/tr2_tgt_event.c | 21 +++++++++ trace2/tr2_tgt_normal.c | 19 ++++++++ trace2/tr2_tgt_perf.c | 16 +++++++ 13 files changed, 179 insertions(+), 2 deletions(-) create mode 100644 compat/procinfo.c diff --git a/Documentation/technical/api-trace2.txt b/Documentation/technical/api-trace2.txt index 3f52f981a2..8a0b360a0e 100644 --- a/Documentation/technical/api-trace2.txt +++ b/Documentation/technical/api-trace2.txt @@ -493,6 +493,20 @@ about specific error arguments. } ------------ +`"cmd_ancestry"`:: + This event contains the text command name for the parent (and earlier + generations of parents) of the current process, in an array ordered from + nearest parent to furthest great-grandparent. It may not be implemented + on all platforms. ++ +------------ +{ + "event":"cmd_ancestry", + ... + "ancestry":["bash","tmux: server","systemd"] +} +------------ + `"cmd_name"`:: This event contains the command name for this git process and the hierarchy of commands from parent git processes. diff --git a/Makefile b/Makefile index 93664d6714..330e4fa011 100644 --- a/Makefile +++ b/Makefile @@ -1889,6 +1889,11 @@ ifneq ($(PROCFS_EXECUTABLE_PATH),) BASIC_CFLAGS += '-DPROCFS_EXECUTABLE_PATH="$(procfs_executable_path_SQ)"' endif +ifdef HAVE_PROCFS_LINUX + BASIC_CFLAGS += -DHAVE_PROCFS_LINUX + COMPAT_OBJS += compat/procinfo.o +endif + ifdef HAVE_NS_GET_EXECUTABLE_PATH BASIC_CFLAGS += -DHAVE_NS_GET_EXECUTABLE_PATH endif diff --git a/compat/procinfo.c b/compat/procinfo.c new file mode 100644 index 0000000000..9417c3aa54 --- /dev/null +++ b/compat/procinfo.c @@ -0,0 +1,61 @@ +#include "cache.h" + +#include "strbuf.h" +#include "strvec.h" +#include "trace2.h" + +static void get_ancestry_names(struct strvec *names) +{ +#ifdef HAVE_PROCFS_LINUX + /* + * NEEDSWORK: We could gather the entire pstree into an array to match + * functionality with compat/win32/trace2_win32_process_info.c. + * To do so, we may want to examine /proc/<pid>/stat. For now, just + * gather the immediate parent name which is readily accessible from + * /proc/$(getppid())/comm. + */ + struct strbuf procfs_path = STRBUF_INIT; + struct strbuf name = STRBUF_INIT; + + /* try to use procfs if it's present. */ + strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); + if (strbuf_read_file(&name, procfs_path.buf, 0)) { + strbuf_release(&procfs_path); + strvec_push(names, strbuf_detach(&name, NULL)); + } + + return; +#endif + /* NEEDSWORK: add non-procfs-linux implementations here */ +} + +void trace2_collect_process_info(enum trace2_process_info_reason reason) +{ + if (!trace2_is_enabled()) + return; + + /* someday we may want to write something extra here, but not today */ + if (reason == TRACE2_PROCESS_INFO_EXIT) + return; + + if (reason == TRACE2_PROCESS_INFO_STARTUP) { + /* + * NEEDSWORK: we could do the entire ptree in an array instead, + * see compat/win32/trace2_win32_process_info.c. + */ + struct strvec names = STRVEC_INIT; + + get_ancestry_names(&names); + + if (names.nr == 0) { + strvec_clear(&names); + return; + } + + trace2_cmd_ancestry(names.v); + + strvec_clear(&names); + } + + return; +} diff --git a/config.mak.uname b/config.mak.uname index cb443b4e02..7ad110a1d2 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -58,6 +58,7 @@ ifeq ($(uname_S),Linux) FREAD_READS_DIRECTORIES = UnfortunatelyYes BASIC_CFLAGS += -DHAVE_SYSINFO PROCFS_EXECUTABLE_PATH = /proc/self/exe + HAVE_PROCFS_LINUX = YesPlease endif ifeq ($(uname_S),GNU/kFreeBSD) HAVE_ALLOCA_H = YesPlease diff --git a/t/t0210/scrub_normal.perl b/t/t0210/scrub_normal.perl index c65d1a815e..7cc4de392a 100644 --- a/t/t0210/scrub_normal.perl +++ b/t/t0210/scrub_normal.perl @@ -42,6 +42,12 @@ # so just omit it for testing purposes. # print "cmd_path _EXE_\n"; } + elsif ($line =~ m/^cmd_ancestry/) { + # 'cmd_ancestry' is not implemented everywhere, so for portability's + # sake, skip it when parsing normal. + # + # print "$line"; + } else { print "$line"; } diff --git a/t/t0211/scrub_perf.perl b/t/t0211/scrub_perf.perl index 351af7844e..d164b750ff 100644 --- a/t/t0211/scrub_perf.perl +++ b/t/t0211/scrub_perf.perl @@ -44,6 +44,11 @@ # $tokens[$col_rest] = "_EXE_"; goto SKIP_LINE; } + elsif ($tokens[$col_event] =~ m/cmd_ancestry/) { + # 'cmd_ancestry' is platform-specific and not implemented everywhere, + # so skip it. + goto SKIP_LINE; + } elsif ($tokens[$col_event] =~ m/child_exit/) { $tokens[$col_rest] =~ s/ pid:\d* / pid:_PID_ /; } diff --git a/t/t0212/parse_events.perl b/t/t0212/parse_events.perl index 6584bb5634..b6408560c0 100644 --- a/t/t0212/parse_events.perl +++ b/t/t0212/parse_events.perl @@ -132,7 +132,10 @@ # just omit it for testing purposes. # $processes->{$sid}->{'path'} = "_EXE_"; } - + elsif ($event eq 'cmd_ancestry') { + # 'cmd_ancestry' is platform-specific and not implemented everywhere, so + # just skip it for testing purposes. + } elsif ($event eq 'cmd_name') { $processes->{$sid}->{'name'} = $line->{'name'}; $processes->{$sid}->{'hierarchy'} = $line->{'hierarchy'}; diff --git a/trace2.c b/trace2.c index 256120c7fd..b9b154ac44 100644 --- a/trace2.c +++ b/trace2.c @@ -260,6 +260,19 @@ void trace2_cmd_path_fl(const char *file, int line, const char *pathname) tgt_j->pfn_command_path_fl(file, line, pathname); } +void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names) +{ + struct tr2_tgt *tgt_j; + int j; + + if (!trace2_enabled) + return; + + for_each_wanted_builtin (j, tgt_j) + if (tgt_j->pfn_command_ancestry_fl) + tgt_j->pfn_command_ancestry_fl(file, line, parent_names); +} + void trace2_cmd_name_fl(const char *file, int line, const char *name) { struct tr2_tgt *tgt_j; diff --git a/trace2.h b/trace2.h index ede18c2e06..23743ac62b 100644 --- a/trace2.h +++ b/trace2.h @@ -133,6 +133,16 @@ void trace2_cmd_path_fl(const char *file, int line, const char *pathname); #define trace2_cmd_path(p) trace2_cmd_path_fl(__FILE__, __LINE__, (p)) +/* + * Emit an 'ancestry' event with the process name of the current process's + * parent process. + * This gives post-processors a way to determine what invoked the command and + * learn more about usage patterns. + */ +void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names); + +#define trace2_cmd_ancestry(v) trace2_cmd_ancestry_fl(__FILE__, __LINE__, (v)) + /* * Emit a 'cmd_name' event with the canonical name of the command. * This gives post-processors a simple field to identify the command @@ -492,7 +502,7 @@ enum trace2_process_info_reason { TRACE2_PROCESS_INFO_EXIT, }; -#if defined(GIT_WINDOWS_NATIVE) +#if ( defined(GIT_WINDOWS_NATIVE) || defined(HAVE_PROCFS_LINUX) ) void trace2_collect_process_info(enum trace2_process_info_reason reason); #else #define trace2_collect_process_info(reason) \ diff --git a/trace2/tr2_tgt.h b/trace2/tr2_tgt.h index 7b90469212..1f66fd6573 100644 --- a/trace2/tr2_tgt.h +++ b/trace2/tr2_tgt.h @@ -27,6 +27,8 @@ typedef void(tr2_tgt_evt_error_va_fl_t)(const char *file, int line, typedef void(tr2_tgt_evt_command_path_fl_t)(const char *file, int line, const char *command_path); +typedef void(tr2_tgt_evt_command_ancestry_fl_t)(const char *file, int line, + const char **parent_names); typedef void(tr2_tgt_evt_command_name_fl_t)(const char *file, int line, const char *name, const char *hierarchy); @@ -108,6 +110,7 @@ struct tr2_tgt { tr2_tgt_evt_atexit_t *pfn_atexit; tr2_tgt_evt_error_va_fl_t *pfn_error_va_fl; tr2_tgt_evt_command_path_fl_t *pfn_command_path_fl; + tr2_tgt_evt_command_ancestry_fl_t *pfn_command_ancestry_fl; tr2_tgt_evt_command_name_fl_t *pfn_command_name_fl; tr2_tgt_evt_command_mode_fl_t *pfn_command_mode_fl; tr2_tgt_evt_alias_fl_t *pfn_alias_fl; diff --git a/trace2/tr2_tgt_event.c b/trace2/tr2_tgt_event.c index 6353e8ad91..578a9a5287 100644 --- a/trace2/tr2_tgt_event.c +++ b/trace2/tr2_tgt_event.c @@ -261,6 +261,26 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) jw_release(&jw); } +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) +{ + const char *event_name = "cmd_ancestry"; + const char *parent_name = NULL; + struct json_writer jw = JSON_WRITER_INIT; + + jw_object_begin(&jw, 0); + event_fmt_prepare(event_name, file, line, NULL, &jw); + jw_object_inline_begin_array(&jw, "ancestry"); + + while ((parent_name = *parent_names++)) + jw_array_string(&jw, parent_name); + + jw_end(&jw); /* 'ancestry' array */ + jw_end(&jw); /* event object */ + + tr2_dst_write_line(&tr2dst_event, &jw.json); + jw_release(&jw); +} + static void fn_command_name_fl(const char *file, int line, const char *name, const char *hierarchy) { @@ -584,6 +604,7 @@ struct tr2_tgt tr2_tgt_event = { fn_atexit, fn_error_va_fl, fn_command_path_fl, + fn_command_ancestry_fl, fn_command_name_fl, fn_command_mode_fl, fn_alias_fl, diff --git a/trace2/tr2_tgt_normal.c b/trace2/tr2_tgt_normal.c index 31b602c171..a5751c8864 100644 --- a/trace2/tr2_tgt_normal.c +++ b/trace2/tr2_tgt_normal.c @@ -160,6 +160,24 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) strbuf_release(&buf_payload); } +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) +{ + const char *parent_name = NULL; + struct strbuf buf_payload = STRBUF_INIT; + + /* cmd_ancestry parent <- grandparent <- great-grandparent */ + strbuf_addstr(&buf_payload, "cmd_ancestry "); + while ((parent_name = *parent_names++)) { + strbuf_addstr(&buf_payload, parent_name); + /* if we'll write another one after this, add a delimiter */ + if (parent_names && *parent_names) + strbuf_addstr(&buf_payload, " <- "); + } + + normal_io_write_fl(file, line, &buf_payload); + strbuf_release(&buf_payload); +} + static void fn_command_name_fl(const char *file, int line, const char *name, const char *hierarchy) { @@ -306,6 +324,7 @@ struct tr2_tgt tr2_tgt_normal = { fn_atexit, fn_error_va_fl, fn_command_path_fl, + fn_command_ancestry_fl, fn_command_name_fl, fn_command_mode_fl, fn_alias_fl, diff --git a/trace2/tr2_tgt_perf.c b/trace2/tr2_tgt_perf.c index a8018f18cc..af4d65a0a5 100644 --- a/trace2/tr2_tgt_perf.c +++ b/trace2/tr2_tgt_perf.c @@ -253,6 +253,21 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) strbuf_release(&buf_payload); } +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) +{ + const char *event_name = "cmd_ancestry"; + struct strbuf buf_payload = STRBUF_INIT; + + strbuf_addstr(&buf_payload, "ancestry:["); + /* It's not an argv but the rules are basically the same. */ + sq_append_quote_argv_pretty(&buf_payload, parent_names); + strbuf_addch(&buf_payload, ']'); + + perf_io_write_fl(file, line, event_name, NULL, NULL, NULL, NULL, + &buf_payload); + strbuf_release(&buf_payload); +} + static void fn_command_name_fl(const char *file, int line, const char *name, const char *hierarchy) { @@ -532,6 +547,7 @@ struct tr2_tgt tr2_tgt_perf = { fn_atexit, fn_error_va_fl, fn_command_path_fl, + fn_command_ancestry_fl, fn_command_name_fl, fn_command_mode_fl, fn_alias_fl, -- 2.32.0.rc1.229.g3e70b5a671-goog
On Tue, Jun 08, 2021 at 11:58:55AM -0700, Emily Shaffer wrote: Hmph, this is failing CI. I'll investigate it today and send a v5.
It can be useful to tell who invoked Git - was it invoked manually by a user via CLI or script? By an IDE? In some cases - like 'repo' tool - we can influence the source code and set the GIT_TRACE2_PARENT_SID environment variable from the caller process. In 'repo''s case, that parent SID is manipulated to include the string "repo", which means we can positively identify when Git was invoked by 'repo' tool. However, identifying parents that way requires both that we know which tools invoke Git and that we have the ability to modify the source code of those tools. It cannot scale to keep up with the various IDEs and wrappers which use Git, most of which we don't know about. Learning which tools and wrappers invoke Git, and how, would give us insight to decide where to improve Git's usability and performance. Unfortunately, there's no cross-platform reliable way to gather the name of the parent process. If procfs is present, we can use that; otherwise we will need to discover the name another way. However, the process ID should be sufficient to look up the process name on most platforms, so that code may be shareable. Git for Windows gathers similar information and logs it as a "data_json" event. However, since "data_json" has a variable format, it is difficult to parse effectively in some languages; instead, let's pursue a dedicated "cmd_ancestry" event to record information about the ancestry of the current process and a consistent, parseable way. Git for Windows also gathers information about more than one generation of parent. In Linux further ancestry info can be gathered with procfs, but it's unwieldy to do so. In the interest of later moving Git for Windows ancestry logging to the 'cmd_ancestry' event, and in the interest of later adding more ancestry to the Linux implementation - or of adding this functionality to other platforms which have an easier time walking the process tree - let's make 'cmd_ancestry' accept an array of parentage. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> --- Since v4: Since I'm reading from a file to discover the name of the process, the file is newline-terminated. This newline caused some havoc in tests, so it's better to strip it if it's there. It's not part of the process name itself. Passing tests at https://github.com/nasamuffin/git/actions/runs/919722509 Range-diff against v4: 1: efb0a3ccb4 ! 1: 7a7e1ebbfa tr2: log parent process name @@ compat/procinfo.c (new) + strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); + if (strbuf_read_file(&name, procfs_path.buf, 0)) { + strbuf_release(&procfs_path); ++ strbuf_trim_trailing_newline(&name); + strvec_push(names, strbuf_detach(&name, NULL)); + } + Documentation/technical/api-trace2.txt | 14 ++++++ Makefile | 5 +++ compat/procinfo.c | 62 ++++++++++++++++++++++++++ config.mak.uname | 1 + t/t0210/scrub_normal.perl | 6 +++ t/t0211/scrub_perf.perl | 5 +++ t/t0212/parse_events.perl | 5 ++- trace2.c | 13 ++++++ trace2.h | 12 ++++- trace2/tr2_tgt.h | 3 ++ trace2/tr2_tgt_event.c | 21 +++++++++ trace2/tr2_tgt_normal.c | 19 ++++++++ trace2/tr2_tgt_perf.c | 16 +++++++ 13 files changed, 180 insertions(+), 2 deletions(-) create mode 100644 compat/procinfo.c diff --git a/Documentation/technical/api-trace2.txt b/Documentation/technical/api-trace2.txt index 3f52f981a2..8a0b360a0e 100644 --- a/Documentation/technical/api-trace2.txt +++ b/Documentation/technical/api-trace2.txt @@ -493,6 +493,20 @@ about specific error arguments. } ------------ +`"cmd_ancestry"`:: + This event contains the text command name for the parent (and earlier + generations of parents) of the current process, in an array ordered from + nearest parent to furthest great-grandparent. It may not be implemented + on all platforms. ++ +------------ +{ + "event":"cmd_ancestry", + ... + "ancestry":["bash","tmux: server","systemd"] +} +------------ + `"cmd_name"`:: This event contains the command name for this git process and the hierarchy of commands from parent git processes. diff --git a/Makefile b/Makefile index 93664d6714..330e4fa011 100644 --- a/Makefile +++ b/Makefile @@ -1889,6 +1889,11 @@ ifneq ($(PROCFS_EXECUTABLE_PATH),) BASIC_CFLAGS += '-DPROCFS_EXECUTABLE_PATH="$(procfs_executable_path_SQ)"' endif +ifdef HAVE_PROCFS_LINUX + BASIC_CFLAGS += -DHAVE_PROCFS_LINUX + COMPAT_OBJS += compat/procinfo.o +endif + ifdef HAVE_NS_GET_EXECUTABLE_PATH BASIC_CFLAGS += -DHAVE_NS_GET_EXECUTABLE_PATH endif diff --git a/compat/procinfo.c b/compat/procinfo.c new file mode 100644 index 0000000000..f8763cacf8 --- /dev/null +++ b/compat/procinfo.c @@ -0,0 +1,62 @@ +#include "cache.h" + +#include "strbuf.h" +#include "strvec.h" +#include "trace2.h" + +static void get_ancestry_names(struct strvec *names) +{ +#ifdef HAVE_PROCFS_LINUX + /* + * NEEDSWORK: We could gather the entire pstree into an array to match + * functionality with compat/win32/trace2_win32_process_info.c. + * To do so, we may want to examine /proc/<pid>/stat. For now, just + * gather the immediate parent name which is readily accessible from + * /proc/$(getppid())/comm. + */ + struct strbuf procfs_path = STRBUF_INIT; + struct strbuf name = STRBUF_INIT; + + /* try to use procfs if it's present. */ + strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); + if (strbuf_read_file(&name, procfs_path.buf, 0)) { + strbuf_release(&procfs_path); + strbuf_trim_trailing_newline(&name); + strvec_push(names, strbuf_detach(&name, NULL)); + } + + return; +#endif + /* NEEDSWORK: add non-procfs-linux implementations here */ +} + +void trace2_collect_process_info(enum trace2_process_info_reason reason) +{ + if (!trace2_is_enabled()) + return; + + /* someday we may want to write something extra here, but not today */ + if (reason == TRACE2_PROCESS_INFO_EXIT) + return; + + if (reason == TRACE2_PROCESS_INFO_STARTUP) { + /* + * NEEDSWORK: we could do the entire ptree in an array instead, + * see compat/win32/trace2_win32_process_info.c. + */ + struct strvec names = STRVEC_INIT; + + get_ancestry_names(&names); + + if (names.nr == 0) { + strvec_clear(&names); + return; + } + + trace2_cmd_ancestry(names.v); + + strvec_clear(&names); + } + + return; +} diff --git a/config.mak.uname b/config.mak.uname index cb443b4e02..7ad110a1d2 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -58,6 +58,7 @@ ifeq ($(uname_S),Linux) FREAD_READS_DIRECTORIES = UnfortunatelyYes BASIC_CFLAGS += -DHAVE_SYSINFO PROCFS_EXECUTABLE_PATH = /proc/self/exe + HAVE_PROCFS_LINUX = YesPlease endif ifeq ($(uname_S),GNU/kFreeBSD) HAVE_ALLOCA_H = YesPlease diff --git a/t/t0210/scrub_normal.perl b/t/t0210/scrub_normal.perl index c65d1a815e..7cc4de392a 100644 --- a/t/t0210/scrub_normal.perl +++ b/t/t0210/scrub_normal.perl @@ -42,6 +42,12 @@ # so just omit it for testing purposes. # print "cmd_path _EXE_\n"; } + elsif ($line =~ m/^cmd_ancestry/) { + # 'cmd_ancestry' is not implemented everywhere, so for portability's + # sake, skip it when parsing normal. + # + # print "$line"; + } else { print "$line"; } diff --git a/t/t0211/scrub_perf.perl b/t/t0211/scrub_perf.perl index 351af7844e..d164b750ff 100644 --- a/t/t0211/scrub_perf.perl +++ b/t/t0211/scrub_perf.perl @@ -44,6 +44,11 @@ # $tokens[$col_rest] = "_EXE_"; goto SKIP_LINE; } + elsif ($tokens[$col_event] =~ m/cmd_ancestry/) { + # 'cmd_ancestry' is platform-specific and not implemented everywhere, + # so skip it. + goto SKIP_LINE; + } elsif ($tokens[$col_event] =~ m/child_exit/) { $tokens[$col_rest] =~ s/ pid:\d* / pid:_PID_ /; } diff --git a/t/t0212/parse_events.perl b/t/t0212/parse_events.perl index 6584bb5634..b6408560c0 100644 --- a/t/t0212/parse_events.perl +++ b/t/t0212/parse_events.perl @@ -132,7 +132,10 @@ # just omit it for testing purposes. # $processes->{$sid}->{'path'} = "_EXE_"; } - + elsif ($event eq 'cmd_ancestry') { + # 'cmd_ancestry' is platform-specific and not implemented everywhere, so + # just skip it for testing purposes. + } elsif ($event eq 'cmd_name') { $processes->{$sid}->{'name'} = $line->{'name'}; $processes->{$sid}->{'hierarchy'} = $line->{'hierarchy'}; diff --git a/trace2.c b/trace2.c index 256120c7fd..b9b154ac44 100644 --- a/trace2.c +++ b/trace2.c @@ -260,6 +260,19 @@ void trace2_cmd_path_fl(const char *file, int line, const char *pathname) tgt_j->pfn_command_path_fl(file, line, pathname); } +void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names) +{ + struct tr2_tgt *tgt_j; + int j; + + if (!trace2_enabled) + return; + + for_each_wanted_builtin (j, tgt_j) + if (tgt_j->pfn_command_ancestry_fl) + tgt_j->pfn_command_ancestry_fl(file, line, parent_names); +} + void trace2_cmd_name_fl(const char *file, int line, const char *name) { struct tr2_tgt *tgt_j; diff --git a/trace2.h b/trace2.h index ede18c2e06..23743ac62b 100644 --- a/trace2.h +++ b/trace2.h @@ -133,6 +133,16 @@ void trace2_cmd_path_fl(const char *file, int line, const char *pathname); #define trace2_cmd_path(p) trace2_cmd_path_fl(__FILE__, __LINE__, (p)) +/* + * Emit an 'ancestry' event with the process name of the current process's + * parent process. + * This gives post-processors a way to determine what invoked the command and + * learn more about usage patterns. + */ +void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names); + +#define trace2_cmd_ancestry(v) trace2_cmd_ancestry_fl(__FILE__, __LINE__, (v)) + /* * Emit a 'cmd_name' event with the canonical name of the command. * This gives post-processors a simple field to identify the command @@ -492,7 +502,7 @@ enum trace2_process_info_reason { TRACE2_PROCESS_INFO_EXIT, }; -#if defined(GIT_WINDOWS_NATIVE) +#if ( defined(GIT_WINDOWS_NATIVE) || defined(HAVE_PROCFS_LINUX) ) void trace2_collect_process_info(enum trace2_process_info_reason reason); #else #define trace2_collect_process_info(reason) \ diff --git a/trace2/tr2_tgt.h b/trace2/tr2_tgt.h index 7b90469212..1f66fd6573 100644 --- a/trace2/tr2_tgt.h +++ b/trace2/tr2_tgt.h @@ -27,6 +27,8 @@ typedef void(tr2_tgt_evt_error_va_fl_t)(const char *file, int line, typedef void(tr2_tgt_evt_command_path_fl_t)(const char *file, int line, const char *command_path); +typedef void(tr2_tgt_evt_command_ancestry_fl_t)(const char *file, int line, + const char **parent_names); typedef void(tr2_tgt_evt_command_name_fl_t)(const char *file, int line, const char *name, const char *hierarchy); @@ -108,6 +110,7 @@ struct tr2_tgt { tr2_tgt_evt_atexit_t *pfn_atexit; tr2_tgt_evt_error_va_fl_t *pfn_error_va_fl; tr2_tgt_evt_command_path_fl_t *pfn_command_path_fl; + tr2_tgt_evt_command_ancestry_fl_t *pfn_command_ancestry_fl; tr2_tgt_evt_command_name_fl_t *pfn_command_name_fl; tr2_tgt_evt_command_mode_fl_t *pfn_command_mode_fl; tr2_tgt_evt_alias_fl_t *pfn_alias_fl; diff --git a/trace2/tr2_tgt_event.c b/trace2/tr2_tgt_event.c index 6353e8ad91..578a9a5287 100644 --- a/trace2/tr2_tgt_event.c +++ b/trace2/tr2_tgt_event.c @@ -261,6 +261,26 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) jw_release(&jw); } +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) +{ + const char *event_name = "cmd_ancestry"; + const char *parent_name = NULL; + struct json_writer jw = JSON_WRITER_INIT; + + jw_object_begin(&jw, 0); + event_fmt_prepare(event_name, file, line, NULL, &jw); + jw_object_inline_begin_array(&jw, "ancestry"); + + while ((parent_name = *parent_names++)) + jw_array_string(&jw, parent_name); + + jw_end(&jw); /* 'ancestry' array */ + jw_end(&jw); /* event object */ + + tr2_dst_write_line(&tr2dst_event, &jw.json); + jw_release(&jw); +} + static void fn_command_name_fl(const char *file, int line, const char *name, const char *hierarchy) { @@ -584,6 +604,7 @@ struct tr2_tgt tr2_tgt_event = { fn_atexit, fn_error_va_fl, fn_command_path_fl, + fn_command_ancestry_fl, fn_command_name_fl, fn_command_mode_fl, fn_alias_fl, diff --git a/trace2/tr2_tgt_normal.c b/trace2/tr2_tgt_normal.c index 31b602c171..a5751c8864 100644 --- a/trace2/tr2_tgt_normal.c +++ b/trace2/tr2_tgt_normal.c @@ -160,6 +160,24 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) strbuf_release(&buf_payload); } +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) +{ + const char *parent_name = NULL; + struct strbuf buf_payload = STRBUF_INIT; + + /* cmd_ancestry parent <- grandparent <- great-grandparent */ + strbuf_addstr(&buf_payload, "cmd_ancestry "); + while ((parent_name = *parent_names++)) { + strbuf_addstr(&buf_payload, parent_name); + /* if we'll write another one after this, add a delimiter */ + if (parent_names && *parent_names) + strbuf_addstr(&buf_payload, " <- "); + } + + normal_io_write_fl(file, line, &buf_payload); + strbuf_release(&buf_payload); +} + static void fn_command_name_fl(const char *file, int line, const char *name, const char *hierarchy) { @@ -306,6 +324,7 @@ struct tr2_tgt tr2_tgt_normal = { fn_atexit, fn_error_va_fl, fn_command_path_fl, + fn_command_ancestry_fl, fn_command_name_fl, fn_command_mode_fl, fn_alias_fl, diff --git a/trace2/tr2_tgt_perf.c b/trace2/tr2_tgt_perf.c index a8018f18cc..af4d65a0a5 100644 --- a/trace2/tr2_tgt_perf.c +++ b/trace2/tr2_tgt_perf.c @@ -253,6 +253,21 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) strbuf_release(&buf_payload); } +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) +{ + const char *event_name = "cmd_ancestry"; + struct strbuf buf_payload = STRBUF_INIT; + + strbuf_addstr(&buf_payload, "ancestry:["); + /* It's not an argv but the rules are basically the same. */ + sq_append_quote_argv_pretty(&buf_payload, parent_names); + strbuf_addch(&buf_payload, ']'); + + perf_io_write_fl(file, line, event_name, NULL, NULL, NULL, NULL, + &buf_payload); + strbuf_release(&buf_payload); +} + static void fn_command_name_fl(const char *file, int line, const char *name, const char *hierarchy) { @@ -532,6 +547,7 @@ struct tr2_tgt tr2_tgt_perf = { fn_atexit, fn_error_va_fl, fn_command_path_fl, + fn_command_ancestry_fl, fn_command_name_fl, fn_command_mode_fl, fn_alias_fl, -- 2.32.0.rc1.229.g3e70b5a671-goog
On June 8, 2021 6:11 PM, Emily Shaffer wrote:
>It can be useful to tell who invoked Git - was it invoked manually by a user via CLI or script? By an IDE? In some cases - like 'repo' tool -
>we can influence the source code and set the GIT_TRACE2_PARENT_SID environment variable from the caller process. In 'repo''s case,
>that parent SID is manipulated to include the string "repo", which means we can positively identify when Git was invoked by 'repo' tool.
>However, identifying parents that way requires both that we know which tools invoke Git and that we have the ability to modify the source
>code of those tools. It cannot scale to keep up with the various IDEs and wrappers which use Git, most of which we don't know about.
>Learning which tools and wrappers invoke Git, and how, would give us insight to decide where to improve Git's usability and performance.
>
>Unfortunately, there's no cross-platform reliable way to gather the name of the parent process. If procfs is present, we can use that;
>otherwise we will need to discover the name another way. However, the process ID should be sufficient to look up the process name on
>most platforms, so that code may be shareable.
>
>Git for Windows gathers similar information and logs it as a "data_json"
>event. However, since "data_json" has a variable format, it is difficult to parse effectively in some languages; instead, let's pursue a
>dedicated "cmd_ancestry" event to record information about the ancestry of the current process and a consistent, parseable way.
>
>Git for Windows also gathers information about more than one generation of parent. In Linux further ancestry info can be gathered with
>procfs, but it's unwieldy to do so. In the interest of later moving Git for Windows ancestry logging to the 'cmd_ancestry' event, and in the
>interest of later adding more ancestry to the Linux implementation - or of adding this functionality to other platforms which have an
>easier time walking the process tree - let's make 'cmd_ancestry' accept an array of parentage.
We are probably going to have to discuss this one at more length. On NonStop, in some cases, I have access to the program arguments of the parent (rather like ps -ef) in POSIX-land, but not from the other personality. I do have access to the program object name, in both sides, although if someone replaces the object - which is not actually possible for a running program, but a rename is - the object may end up being somewhat meaningless or mangled. My suspicion is that I'm going to have to supply different things for the two personalities, but I'm not sure what as of yet.
Regards,
Randall
On Tue, Jun 08, 2021 at 06:16:57PM -0400, Randall S. Becker wrote:
>
> On June 8, 2021 6:11 PM, Emily Shaffer wrote:
> >It can be useful to tell who invoked Git - was it invoked manually by a user via CLI or script? By an IDE? In some cases - like 'repo' tool -
> >we can influence the source code and set the GIT_TRACE2_PARENT_SID environment variable from the caller process. In 'repo''s case,
> >that parent SID is manipulated to include the string "repo", which means we can positively identify when Git was invoked by 'repo' tool.
> >However, identifying parents that way requires both that we know which tools invoke Git and that we have the ability to modify the source
> >code of those tools. It cannot scale to keep up with the various IDEs and wrappers which use Git, most of which we don't know about.
> >Learning which tools and wrappers invoke Git, and how, would give us insight to decide where to improve Git's usability and performance.
> >
> >Unfortunately, there's no cross-platform reliable way to gather the name of the parent process. If procfs is present, we can use that;
> >otherwise we will need to discover the name another way. However, the process ID should be sufficient to look up the process name on
> >most platforms, so that code may be shareable.
> >
> >Git for Windows gathers similar information and logs it as a "data_json"
> >event. However, since "data_json" has a variable format, it is difficult to parse effectively in some languages; instead, let's pursue a
> >dedicated "cmd_ancestry" event to record information about the ancestry of the current process and a consistent, parseable way.
> >
> >Git for Windows also gathers information about more than one generation of parent. In Linux further ancestry info can be gathered with
> >procfs, but it's unwieldy to do so. In the interest of later moving Git for Windows ancestry logging to the 'cmd_ancestry' event, and in the
> >interest of later adding more ancestry to the Linux implementation - or of adding this functionality to other platforms which have an
> >easier time walking the process tree - let's make 'cmd_ancestry' accept an array of parentage.
>
> We are probably going to have to discuss this one at more length. On
> NonStop, in some cases, I have access to the program arguments of the
> parent (rather like ps -ef) in POSIX-land, but not from the other
> personality. I do have access to the program object name, in both
> sides, although if someone replaces the object - which is not actually
> possible for a running program, but a rename is - the object may end
> up being somewhat meaningless or mangled. My suspicion is that I'm
> going to have to supply different things for the two personalities,
> but I'm not sure what as of yet.
I guess I'm having trouble understanding - it sounds like you're
describing one process with a graph of ancestry instead of a line. Does
that mean we shouldn't be tracing the ancestry the way we are?
- Emily
On June 8, 2021 6:25 PM, Emily Shaffer wrote" >To: Randall S. Becker <rsbecker@nexbridge.com> >Cc: git@vger.kernel.org; 'Ævar Arnfjörð Bjarmason' <avarab@gmail.com>; 'Junio C Hamano' <gitster@pobox.com>; 'Jeff Hostetler' ><git@jeffhostetler.com>; 'Bagas Sanjaya' <bagasdotme@gmail.com> >Subject: Re: [PATCH v5] tr2: log parent process name > >On Tue, Jun 08, 2021 at 06:16:57PM -0400, Randall S. Becker wrote: >> >> On June 8, 2021 6:11 PM, Emily Shaffer wrote: >> >It can be useful to tell who invoked Git - was it invoked manually by >> >a user via CLI or script? By an IDE? In some cases - like 'repo' >> >tool - we can influence the source code and set the GIT_TRACE2_PARENT_SID environment variable from the caller process. In 'repo''s >case, that parent SID is manipulated to include the string "repo", which means we can positively identify when Git was invoked by 'repo' >tool. >> >However, identifying parents that way requires both that we know >> >which tools invoke Git and that we have the ability to modify the source code of those tools. It cannot scale to keep up with the various >IDEs and wrappers which use Git, most of which we don't know about. >> >Learning which tools and wrappers invoke Git, and how, would give us insight to decide where to improve Git's usability and >performance. >> > >> >Unfortunately, there's no cross-platform reliable way to gather the >> >name of the parent process. If procfs is present, we can use that; >> >otherwise we will need to discover the name another way. However, the process ID should be sufficient to look up the process name on >most platforms, so that code may be shareable. >> > >> >Git for Windows gathers similar information and logs it as a "data_json" >> >event. However, since "data_json" has a variable format, it is >> >difficult to parse effectively in some languages; instead, let's pursue a dedicated "cmd_ancestry" event to record information about the >ancestry of the current process and a consistent, parseable way. >> > >> >Git for Windows also gathers information about more than one >> >generation of parent. In Linux further ancestry info can be gathered >> >with procfs, but it's unwieldy to do so. In the interest of later >> >moving Git for Windows ancestry logging to the 'cmd_ancestry' event, and in the interest of later adding more ancestry to the Linux >implementation - or of adding this functionality to other platforms which have an easier time walking the process tree - let's make >'cmd_ancestry' accept an array of parentage. >> >> We are probably going to have to discuss this one at more length. On >> NonStop, in some cases, I have access to the program arguments of the >> parent (rather like ps -ef) in POSIX-land, but not from the other >> personality. I do have access to the program object name, in both >> sides, although if someone replaces the object - which is not actually >> possible for a running program, but a rename is - the object may end >> up being somewhat meaningless or mangled. My suspicion is that I'm >> going to have to supply different things for the two personalities, >> but I'm not sure what as of yet. > >I guess I'm having trouble understanding - it sounds like you're describing one process with a graph of ancestry instead of a line. Does that >mean we shouldn't be tracing the ancestry the way we are? It's more like this (g = Guardian, p=Posix, for illustration), for a typical interactive situation coming from the non-POSIX side): gMonitor -> gAncestor1 -> gAncestor2 -> pAncestor3 (/bin/sh) -> git And when started from an SSH window: gMonitor -> gAncestor1 (SSH) -> pAncestor2 (/bin/sh) -> git I can get the program object name from any of the above, and the pid from a POSIX process, or the name (or cpu and process number) of a Guardian process. In the case of POSIX, obtaining program arguments may be possible. An ancestor, as with Linux, can have multiple children in a tree but a child can only have one parent - well, technically one at a time anyway because there are some funky exceptions where a child can adopt a different parent in Guardian-land. In both cases, I can get the program object file of the process (like /usr/local/bin/git), but if someone renames git because an install happened during a long-running operation, like git gc --aggressive, the object may be named something else, like /usr/local/bin/ZZLDAG01, for argument sake). I'm not sure any of this is really relevant, but describes some of what is possible. It also might be useful to pull out the tty that the process was running on. That is easy to get if the terminal is still connected. I think that particular bit of information might be very useful, as well as user information. -Randall
On Tue, Jun 08, 2021 at 06:39:15PM -0400, Randall S. Becker wrote: > On June 8, 2021 6:25 PM, Emily Shaffer wrote" > >> We are probably going to have to discuss this one at more length. On > >> NonStop, in some cases, I have access to the program arguments of the > >> parent (rather like ps -ef) in POSIX-land, but not from the other > >> personality. I do have access to the program object name, in both > >> sides, although if someone replaces the object - which is not actually > >> possible for a running program, but a rename is - the object may end > >> up being somewhat meaningless or mangled. My suspicion is that I'm > >> going to have to supply different things for the two personalities, > >> but I'm not sure what as of yet. > > > >I guess I'm having trouble understanding - it sounds like you're > >describing one process with a graph of ancestry instead of a line. > >Does that mean we shouldn't be tracing the ancestry the way we are? > > It's more like this (g = Guardian, p=Posix, for illustration), for a > typical interactive situation coming from the non-POSIX side): > > gMonitor -> gAncestor1 -> gAncestor2 -> pAncestor3 (/bin/sh) -> git > > And when started from an SSH window: > > gMonitor -> gAncestor1 (SSH) -> pAncestor2 (/bin/sh) -> git > > I can get the program object name from any of the above, and the pid > from a POSIX process, or the name (or cpu and process number) of a > Guardian process. In the case of POSIX, obtaining program arguments > may be possible. An ancestor, as with Linux, can have multiple > children in a tree but a child can only have one parent - well, > technically one at a time anyway because there are some funky > exceptions where a child can adopt a different parent in > Guardian-land. In both cases, I can get the program object file of the > process (like /usr/local/bin/git), but if someone renames git because > an install happened during a long-running operation, like git gc > --aggressive, the object may be named something else, like > /usr/local/bin/ZZLDAG01, for argument sake). Interesting. One thing that might be helpful (if you don't care about the later name, since it looks like it might be junk) is that trace2_collect_process_info() provides some hint about when it was invoked, via the 'enum trace2_process_info_reason' arg. So if you're worried about a long-running process being renamed, the TRACE2_PROCESS_INFO_STARTUP reason happens very early in common-main.c. (And you could capture info like "the name changed later" at TRACE2_PROCESS_INFO_EXIT if you wanted.) > I'm not sure any of this is really relevant, but describes some of > what is possible. It also might be useful to pull out the tty that the > process was running on. That is easy to get if the terminal is still > connected. I think that particular bit of information might be very > useful, as well as user information. > > -Randall > (I don't have much insight into what would or wouldn't be useful to log here in your case, but I will say that it all sounds very cool and I appreciate your thorough explanation.) - Emily
Emily Shaffer <emilyshaffer@google.com> writes:
Other than your back-and-forth with Randall on NonStop specifics
that didn't result in any code change, there was no comment on this
patch. Are other people totally happy with this version, or are
they totally uninterested?
> + if (reason == TRACE2_PROCESS_INFO_STARTUP) {
> + /*
> + * NEEDSWORK: we could do the entire ptree in an array instead,
> + * see compat/win32/trace2_win32_process_info.c.
> + */
> + struct strvec names = STRVEC_INIT;
> +
> + get_ancestry_names(&names);
> +
> + if (names.nr == 0) {
>
> + strvec_clear(&names);
> + return;
> + }
>
> + trace2_cmd_ancestry(names.v);
> +
> + strvec_clear(&names);
Micronit. CodingGuidelines tells us not to explicitly compare with
constant 0, '\0', or NULL. It may be more concise and easier to
follow if written like this:
get_ancestry_names(&names);
if (names.nr)
trace2_cmd_ancestry(names.v);
strvec_clear(&names);
Thanks.
On 6/8/21 6:10 PM, Emily Shaffer wrote: > It can be useful to tell who invoked Git - was it invoked manually by a > user via CLI or script? By an IDE? In some cases - like 'repo' tool - > we can influence the source code and set the GIT_TRACE2_PARENT_SID > environment variable from the caller process. In 'repo''s case, that > parent SID is manipulated to include the string "repo", which means we > can positively identify when Git was invoked by 'repo' tool. However, > identifying parents that way requires both that we know which tools > invoke Git and that we have the ability to modify the source code of > those tools. It cannot scale to keep up with the various IDEs and > wrappers which use Git, most of which we don't know about. Learning > which tools and wrappers invoke Git, and how, would give us insight to > decide where to improve Git's usability and performance. > > Unfortunately, there's no cross-platform reliable way to gather the name > of the parent process. If procfs is present, we can use that; otherwise > we will need to discover the name another way. However, the process ID > should be sufficient to look up the process name on most platforms, so > that code may be shareable. > > Git for Windows gathers similar information and logs it as a "data_json" > event. However, since "data_json" has a variable format, it is difficult > to parse effectively in some languages; instead, let's pursue a > dedicated "cmd_ancestry" event to record information about the ancestry > of the current process and a consistent, parseable way. > > Git for Windows also gathers information about more than one generation > of parent. In Linux further ancestry info can be gathered with procfs, > but it's unwieldy to do so. In the interest of later moving Git for > Windows ancestry logging to the 'cmd_ancestry' event, and in the > interest of later adding more ancestry to the Linux implementation - or > of adding this functionality to other platforms which have an easier > time walking the process tree - let's make 'cmd_ancestry' accept an > array of parentage. > > Signed-off-by: Emily Shaffer <emilyshaffer@google.com> > --- > > Since v4: > > Since I'm reading from a file to discover the name of the process, the > file is newline-terminated. This newline caused some havoc in tests, so > it's better to strip it if it's there. It's not part of the process name > itself. > > Passing tests at > https://github.com/nasamuffin/git/actions/runs/919722509 > > Range-diff against v4: > 1: efb0a3ccb4 ! 1: 7a7e1ebbfa tr2: log parent process name > @@ compat/procinfo.c (new) > + strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); > + if (strbuf_read_file(&name, procfs_path.buf, 0)) { > + strbuf_release(&procfs_path); > ++ strbuf_trim_trailing_newline(&name); > + strvec_push(names, strbuf_detach(&name, NULL)); > + } > + You're only getting the name of the command (argv[0]) and not the full command line, right? That is a good thing. > > Documentation/technical/api-trace2.txt | 14 ++++++ > Makefile | 5 +++ > compat/procinfo.c | 62 ++++++++++++++++++++++++++ > config.mak.uname | 1 + > t/t0210/scrub_normal.perl | 6 +++ > t/t0211/scrub_perf.perl | 5 +++ > t/t0212/parse_events.perl | 5 ++- > trace2.c | 13 ++++++ > trace2.h | 12 ++++- > trace2/tr2_tgt.h | 3 ++ > trace2/tr2_tgt_event.c | 21 +++++++++ > trace2/tr2_tgt_normal.c | 19 ++++++++ > trace2/tr2_tgt_perf.c | 16 +++++++ > 13 files changed, 180 insertions(+), 2 deletions(-) > create mode 100644 compat/procinfo.c > > diff --git a/Documentation/technical/api-trace2.txt b/Documentation/technical/api-trace2.txt > index 3f52f981a2..8a0b360a0e 100644 > --- a/Documentation/technical/api-trace2.txt > +++ b/Documentation/technical/api-trace2.txt > @@ -493,6 +493,20 @@ about specific error arguments. > } > ------------ > > +`"cmd_ancestry"`:: > + This event contains the text command name for the parent (and earlier > + generations of parents) of the current process, in an array ordered from > + nearest parent to furthest great-grandparent. It may not be implemented > + on all platforms. > ++ > +------------ > +{ > + "event":"cmd_ancestry", > + ... > + "ancestry":["bash","tmux: server","systemd"] Is the second element really "tmux: server". Seems odd that that's what the command name (argv[0]) is. Perhaps I misread something?? > +} This array is bounded and that implies that you captured all of the grand parents back to "init" (or whatever it is called these days). Is there value in having a final "..." or "(truncated)" element to indicate that the list incomplete? I did the latter in the Windows version. > +------------ > + > `"cmd_name"`:: > This event contains the command name for this git process > and the hierarchy of commands from parent git processes. > diff --git a/Makefile b/Makefile > index 93664d6714..330e4fa011 100644 > --- a/Makefile > +++ b/Makefile > @@ -1889,6 +1889,11 @@ ifneq ($(PROCFS_EXECUTABLE_PATH),) > BASIC_CFLAGS += '-DPROCFS_EXECUTABLE_PATH="$(procfs_executable_path_SQ)"' > endif > > +ifdef HAVE_PROCFS_LINUX > + BASIC_CFLAGS += -DHAVE_PROCFS_LINUX > + COMPAT_OBJS += compat/procinfo.o > +endif > + > ifdef HAVE_NS_GET_EXECUTABLE_PATH > BASIC_CFLAGS += -DHAVE_NS_GET_EXECUTABLE_PATH > endif > diff --git a/compat/procinfo.c b/compat/procinfo.c > new file mode 100644 > index 0000000000..f8763cacf8 > --- /dev/null > +++ b/compat/procinfo.c > @@ -0,0 +1,62 @@ > +#include "cache.h" > + > +#include "strbuf.h" > +#include "strvec.h" > +#include "trace2.h" > + > +static void get_ancestry_names(struct strvec *names) > +{ > +#ifdef HAVE_PROCFS_LINUX > + /* > + * NEEDSWORK: We could gather the entire pstree into an array to match > + * functionality with compat/win32/trace2_win32_process_info.c. > + * To do so, we may want to examine /proc/<pid>/stat. For now, just > + * gather the immediate parent name which is readily accessible from > + * /proc/$(getppid())/comm. > + */ > + struct strbuf procfs_path = STRBUF_INIT; > + struct strbuf name = STRBUF_INIT; > + > + /* try to use procfs if it's present. */ > + strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); > + if (strbuf_read_file(&name, procfs_path.buf, 0)) { > + strbuf_release(&procfs_path); > + strbuf_trim_trailing_newline(&name); > + strvec_push(names, strbuf_detach(&name, NULL)); > + } > + > + return; > +#endif > + /* NEEDSWORK: add non-procfs-linux implementations here */ > +} Perhaps this has already been discussed, but would it be better to have a "compat/linux/trace2_linux_process_info.c" or "compat/procfs/trace2_procfs_process_info.c" source file and only compile it in Linux-compatible builds -- rather than #ifdef'ing the source. This is a highly platform-specific feature. For example, if I convert the Win32 version to use your new event, I wouldn't want to move the code. I just noticed that you have both "BASIC_CFLAGS+=" and a "COMPAT_OBSJ+=" lines. If you made this source file procfs-specific, you wouldn't need the ifdef and you could avoid the new CFLAG. > + > +void trace2_collect_process_info(enum trace2_process_info_reason reason) > +{ > + if (!trace2_is_enabled()) > + return; > + > + /* someday we may want to write something extra here, but not today */ > + if (reason == TRACE2_PROCESS_INFO_EXIT) > + return; > + > + if (reason == TRACE2_PROCESS_INFO_STARTUP) { > + /* > + * NEEDSWORK: we could do the entire ptree in an array instead, > + * see compat/win32/trace2_win32_process_info.c. > + */ > + struct strvec names = STRVEC_INIT; > + > + get_ancestry_names(&names); > + > + if (names.nr == 0) { > + strvec_clear(&names); > + return; > + } > + > + trace2_cmd_ancestry(names.v); > + > + strvec_clear(&names); I agree with Junio here, it would be simpler to say it like this: get_ancestry_names(&names); if (names.nr) trace2_cmd_ancestry(names.v); strvec_clear(&names); > + } > + > + return; > +} > diff --git a/config.mak.uname b/config.mak.uname > index cb443b4e02..7ad110a1d2 100644 > --- a/config.mak.uname > +++ b/config.mak.uname > @@ -58,6 +58,7 @@ ifeq ($(uname_S),Linux) > FREAD_READS_DIRECTORIES = UnfortunatelyYes > BASIC_CFLAGS += -DHAVE_SYSINFO > PROCFS_EXECUTABLE_PATH = /proc/self/exe > + HAVE_PROCFS_LINUX = YesPlease > endif > ifeq ($(uname_S),GNU/kFreeBSD) > HAVE_ALLOCA_H = YesPlease > diff --git a/t/t0210/scrub_normal.perl b/t/t0210/scrub_normal.perl > index c65d1a815e..7cc4de392a 100644 > --- a/t/t0210/scrub_normal.perl > +++ b/t/t0210/scrub_normal.perl > @@ -42,6 +42,12 @@ > # so just omit it for testing purposes. > # print "cmd_path _EXE_\n"; > } > + elsif ($line =~ m/^cmd_ancestry/) { > + # 'cmd_ancestry' is not implemented everywhere, so for portability's > + # sake, skip it when parsing normal. > + # > + # print "$line"; > + } > else { > print "$line"; > } > diff --git a/t/t0211/scrub_perf.perl b/t/t0211/scrub_perf.perl > index 351af7844e..d164b750ff 100644 > --- a/t/t0211/scrub_perf.perl > +++ b/t/t0211/scrub_perf.perl > @@ -44,6 +44,11 @@ > # $tokens[$col_rest] = "_EXE_"; > goto SKIP_LINE; > } > + elsif ($tokens[$col_event] =~ m/cmd_ancestry/) { > + # 'cmd_ancestry' is platform-specific and not implemented everywhere, > + # so skip it. > + goto SKIP_LINE; > + } > elsif ($tokens[$col_event] =~ m/child_exit/) { > $tokens[$col_rest] =~ s/ pid:\d* / pid:_PID_ /; > } > diff --git a/t/t0212/parse_events.perl b/t/t0212/parse_events.perl > index 6584bb5634..b6408560c0 100644 > --- a/t/t0212/parse_events.perl > +++ b/t/t0212/parse_events.perl > @@ -132,7 +132,10 @@ > # just omit it for testing purposes. > # $processes->{$sid}->{'path'} = "_EXE_"; > } > - > + elsif ($event eq 'cmd_ancestry') { > + # 'cmd_ancestry' is platform-specific and not implemented everywhere, so > + # just skip it for testing purposes. > + } > elsif ($event eq 'cmd_name') { > $processes->{$sid}->{'name'} = $line->{'name'}; > $processes->{$sid}->{'hierarchy'} = $line->{'hierarchy'}; > diff --git a/trace2.c b/trace2.c > index 256120c7fd..b9b154ac44 100644 > --- a/trace2.c > +++ b/trace2.c > @@ -260,6 +260,19 @@ void trace2_cmd_path_fl(const char *file, int line, const char *pathname) > tgt_j->pfn_command_path_fl(file, line, pathname); > } > > +void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names) > +{ > + struct tr2_tgt *tgt_j; > + int j; > + > + if (!trace2_enabled) > + return; > + > + for_each_wanted_builtin (j, tgt_j) > + if (tgt_j->pfn_command_ancestry_fl) > + tgt_j->pfn_command_ancestry_fl(file, line, parent_names); > +} > + > void trace2_cmd_name_fl(const char *file, int line, const char *name) > { > struct tr2_tgt *tgt_j; > diff --git a/trace2.h b/trace2.h > index ede18c2e06..23743ac62b 100644 > --- a/trace2.h > +++ b/trace2.h > @@ -133,6 +133,16 @@ void trace2_cmd_path_fl(const char *file, int line, const char *pathname); > > #define trace2_cmd_path(p) trace2_cmd_path_fl(__FILE__, __LINE__, (p)) > > +/* > + * Emit an 'ancestry' event with the process name of the current process's > + * parent process. > + * This gives post-processors a way to determine what invoked the command and > + * learn more about usage patterns. > + */ > +void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names); > + > +#define trace2_cmd_ancestry(v) trace2_cmd_ancestry_fl(__FILE__, __LINE__, (v)) > + > /* > * Emit a 'cmd_name' event with the canonical name of the command. > * This gives post-processors a simple field to identify the command > @@ -492,7 +502,7 @@ enum trace2_process_info_reason { > TRACE2_PROCESS_INFO_EXIT, > }; > > -#if defined(GIT_WINDOWS_NATIVE) > +#if ( defined(GIT_WINDOWS_NATIVE) || defined(HAVE_PROCFS_LINUX) ) > void trace2_collect_process_info(enum trace2_process_info_reason reason); > #else > #define trace2_collect_process_info(reason) \ > diff --git a/trace2/tr2_tgt.h b/trace2/tr2_tgt.h > index 7b90469212..1f66fd6573 100644 > --- a/trace2/tr2_tgt.h > +++ b/trace2/tr2_tgt.h > @@ -27,6 +27,8 @@ typedef void(tr2_tgt_evt_error_va_fl_t)(const char *file, int line, > > typedef void(tr2_tgt_evt_command_path_fl_t)(const char *file, int line, > const char *command_path); > +typedef void(tr2_tgt_evt_command_ancestry_fl_t)(const char *file, int line, > + const char **parent_names); > typedef void(tr2_tgt_evt_command_name_fl_t)(const char *file, int line, > const char *name, > const char *hierarchy); > @@ -108,6 +110,7 @@ struct tr2_tgt { > tr2_tgt_evt_atexit_t *pfn_atexit; > tr2_tgt_evt_error_va_fl_t *pfn_error_va_fl; > tr2_tgt_evt_command_path_fl_t *pfn_command_path_fl; > + tr2_tgt_evt_command_ancestry_fl_t *pfn_command_ancestry_fl; > tr2_tgt_evt_command_name_fl_t *pfn_command_name_fl; > tr2_tgt_evt_command_mode_fl_t *pfn_command_mode_fl; > tr2_tgt_evt_alias_fl_t *pfn_alias_fl; > diff --git a/trace2/tr2_tgt_event.c b/trace2/tr2_tgt_event.c > index 6353e8ad91..578a9a5287 100644 > --- a/trace2/tr2_tgt_event.c > +++ b/trace2/tr2_tgt_event.c > @@ -261,6 +261,26 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) > jw_release(&jw); > } > > +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) > +{ > + const char *event_name = "cmd_ancestry"; > + const char *parent_name = NULL; > + struct json_writer jw = JSON_WRITER_INIT; > + > + jw_object_begin(&jw, 0); > + event_fmt_prepare(event_name, file, line, NULL, &jw); > + jw_object_inline_begin_array(&jw, "ancestry"); > + > + while ((parent_name = *parent_names++)) > + jw_array_string(&jw, parent_name); > + > + jw_end(&jw); /* 'ancestry' array */ > + jw_end(&jw); /* event object */ > + > + tr2_dst_write_line(&tr2dst_event, &jw.json); > + jw_release(&jw); > +} > + > static void fn_command_name_fl(const char *file, int line, const char *name, > const char *hierarchy) > { > @@ -584,6 +604,7 @@ struct tr2_tgt tr2_tgt_event = { > fn_atexit, > fn_error_va_fl, > fn_command_path_fl, > + fn_command_ancestry_fl, > fn_command_name_fl, > fn_command_mode_fl, > fn_alias_fl, > diff --git a/trace2/tr2_tgt_normal.c b/trace2/tr2_tgt_normal.c > index 31b602c171..a5751c8864 100644 > --- a/trace2/tr2_tgt_normal.c > +++ b/trace2/tr2_tgt_normal.c > @@ -160,6 +160,24 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) > strbuf_release(&buf_payload); > } > > +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) > +{ > + const char *parent_name = NULL; > + struct strbuf buf_payload = STRBUF_INIT; > + > + /* cmd_ancestry parent <- grandparent <- great-grandparent */ > + strbuf_addstr(&buf_payload, "cmd_ancestry "); > + while ((parent_name = *parent_names++)) { > + strbuf_addstr(&buf_payload, parent_name); > + /* if we'll write another one after this, add a delimiter */ > + if (parent_names && *parent_names) > + strbuf_addstr(&buf_payload, " <- "); > + } > + > + normal_io_write_fl(file, line, &buf_payload); > + strbuf_release(&buf_payload); > +} > + > static void fn_command_name_fl(const char *file, int line, const char *name, > const char *hierarchy) > { > @@ -306,6 +324,7 @@ struct tr2_tgt tr2_tgt_normal = { > fn_atexit, > fn_error_va_fl, > fn_command_path_fl, > + fn_command_ancestry_fl, > fn_command_name_fl, > fn_command_mode_fl, > fn_alias_fl, > diff --git a/trace2/tr2_tgt_perf.c b/trace2/tr2_tgt_perf.c > index a8018f18cc..af4d65a0a5 100644 > --- a/trace2/tr2_tgt_perf.c > +++ b/trace2/tr2_tgt_perf.c > @@ -253,6 +253,21 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) > strbuf_release(&buf_payload); > } > > +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) > +{ > + const char *event_name = "cmd_ancestry"; > + struct strbuf buf_payload = STRBUF_INIT; > + > + strbuf_addstr(&buf_payload, "ancestry:["); > + /* It's not an argv but the rules are basically the same. */ > + sq_append_quote_argv_pretty(&buf_payload, parent_names); > + strbuf_addch(&buf_payload, ']'); > + > + perf_io_write_fl(file, line, event_name, NULL, NULL, NULL, NULL, > + &buf_payload); > + strbuf_release(&buf_payload); > +} > + > static void fn_command_name_fl(const char *file, int line, const char *name, > const char *hierarchy) > { > @@ -532,6 +547,7 @@ struct tr2_tgt tr2_tgt_perf = { > fn_atexit, > fn_error_va_fl, > fn_command_path_fl, > + fn_command_ancestry_fl, > fn_command_name_fl, > fn_command_mode_fl, > fn_alias_fl, > Otherwise, this looks good to me. Jeff
On Mon, Jun 28, 2021 at 12:45:24PM -0400, Jeff Hostetler wrote: > On 6/8/21 6:10 PM, Emily Shaffer wrote: > > Range-diff against v4: > > 1: efb0a3ccb4 ! 1: 7a7e1ebbfa tr2: log parent process name > > @@ compat/procinfo.c (new) > > + strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); > > + if (strbuf_read_file(&name, procfs_path.buf, 0)) { > > + strbuf_release(&procfs_path); > > ++ strbuf_trim_trailing_newline(&name); > > + strvec_push(names, strbuf_detach(&name, NULL)); > > + } > > + > > You're only getting the name of the command (argv[0]) and not the > full command line, right? That is a good thing. Roughly. The name can be reset by the process itself (that's what happened, I guess, in the tmux case I pasted below) but by default it's argv[0]. It's also truncated to 15ch or something. > > ------------ > > +`"cmd_ancestry"`:: > > + This event contains the text command name for the parent (and earlier > > + generations of parents) of the current process, in an array ordered from > > + nearest parent to furthest great-grandparent. It may not be implemented > > + on all platforms. > > ++ > > +------------ > > +{ > > + "event":"cmd_ancestry", > > + ... > > + "ancestry":["bash","tmux: server","systemd"] > > Is the second element really "tmux: server". Seems odd that that's what > the command name (argv[0]) is. Perhaps I misread something?? See above. This is what shows up in pstree, though, and by poking around in /proc I confirmed that this is indeed the content of /proc/<tmux-pid>/comm: ├─tmux: server─┬─bash───mutt───open-vim-in-new───vim │ ├─bash───pstree │ └─mutt This is a somewhat contrived example, though, because in Linux as of this patch, only one ancestor is gathered. So maybe I had better make the doc reflect what's actually possible. I'm planning on sending a follow-on sometime soon exposing more generations of ancestry, so I guess I could update the docs back to this state around then. > > > +} > > This array is bounded and that implies that you captured all of > the grand parents back to "init" (or whatever it is called these > days). In this case it does - pid 1 is systemd, which hasn't got a parent process. > Is there value in having a final "..." or "(truncated)" element > to indicate that the list incomplete? I did the latter in the > Windows version. Hrm. I'm not the one who wants to parse these - it's someone else who's working with our team internally - so I'll ask around and see what they think is best. > > +#ifdef HAVE_PROCFS_LINUX > > + /* > > + * NEEDSWORK: We could gather the entire pstree into an array to match > > + * functionality with compat/win32/trace2_win32_process_info.c. > > + * To do so, we may want to examine /proc/<pid>/stat. For now, just > > + * gather the immediate parent name which is readily accessible from > > + * /proc/$(getppid())/comm. > > + */ > > + struct strbuf procfs_path = STRBUF_INIT; > > + struct strbuf name = STRBUF_INIT; > > + > > + /* try to use procfs if it's present. */ > > + strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); > > + if (strbuf_read_file(&name, procfs_path.buf, 0)) { > > + strbuf_release(&procfs_path); > > + strbuf_trim_trailing_newline(&name); > > + strvec_push(names, strbuf_detach(&name, NULL)); > > + } > > + > > + return; > > +#endif > > + /* NEEDSWORK: add non-procfs-linux implementations here */ > > +} > > Perhaps this has already been discussed, but would it be better > to have a "compat/linux/trace2_linux_process_info.c" > or "compat/procfs/trace2_procfs_process_info.c" source file and > only compile it in Linux-compatible builds -- rather than #ifdef'ing > the source. This is a highly platform-specific feature. > > For example, if I convert the Win32 version to use your new event, > I wouldn't want to move the code. > > I just noticed that you have both "BASIC_CFLAGS+=" and a "COMPAT_OBSJ+=" > lines. If you made this source file procfs-specific, you wouldn't need > the ifdef and you could avoid the new CFLAG. Sure, I'll investigate it, thanks. > > + > > + if (names.nr == 0) { > > + strvec_clear(&names); > > + return; > > + } > > + > > + trace2_cmd_ancestry(names.v); > > + > > + strvec_clear(&names); > > I agree with Junio here, it would be simpler to say it like this: > > get_ancestry_names(&names); > if (names.nr) > trace2_cmd_ancestry(names.v); > strvec_clear(&names); > Thanks both, done locally. > Otherwise, this looks good to me. Thanks. Look for a v6 from me this week, hopefully with the build stuff sorted out. - Emily
On Tue, Jun 29 2021, Emily Shaffer wrote: > On Mon, Jun 28, 2021 at 12:45:24PM -0400, Jeff Hostetler wrote: >> On 6/8/21 6:10 PM, Emily Shaffer wrote: >> > Range-diff against v4: >> > 1: efb0a3ccb4 ! 1: 7a7e1ebbfa tr2: log parent process name >> > @@ compat/procinfo.c (new) >> > + strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); >> > + if (strbuf_read_file(&name, procfs_path.buf, 0)) { >> > + strbuf_release(&procfs_path); >> > ++ strbuf_trim_trailing_newline(&name); >> > + strvec_push(names, strbuf_detach(&name, NULL)); >> > + } >> > + >> >> You're only getting the name of the command (argv[0]) and not the >> full command line, right? That is a good thing. > > Roughly. The name can be reset by the process itself (that's what > happened, I guess, in the tmux case I pasted below) but by default it's > argv[0]. It's also truncated to 15ch or something. 16 including the \0. See prctl(2). Linux has two different ways to set/get the name, one is the argv method, the other is prctl(PR_SET_NAME). They don't need to match at all. The ps(1) utility and some top-like utilities allow you to switch between viewing the two versions. As noted in the linked manual pages you'll also potentially need to deal with multithreaded programs having different names for each thread. I don't think we use this now, but FWIW one thing I've wanted to do for a while was to have the progress.c code update this, so you see if git's at N% counting objects or whatever in top. >> > +#ifdef HAVE_PROCFS_LINUX >> > + /* >> > + * NEEDSWORK: We could gather the entire pstree into an array to match >> > + * functionality with compat/win32/trace2_win32_process_info.c. >> > + * To do so, we may want to examine /proc/<pid>/stat. For now, just >> > + * gather the immediate parent name which is readily accessible from >> > + * /proc/$(getppid())/comm. >> > + */ >> > + struct strbuf procfs_path = STRBUF_INIT; >> > + struct strbuf name = STRBUF_INIT; >> > + >> > + /* try to use procfs if it's present. */ >> > + strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); >> > + if (strbuf_read_file(&name, procfs_path.buf, 0)) { >> > + strbuf_release(&procfs_path); >> > + strbuf_trim_trailing_newline(&name); >> > + strvec_push(names, strbuf_detach(&name, NULL)); >> > + } >> > + >> > + return; >> > +#endif >> > + /* NEEDSWORK: add non-procfs-linux implementations here */ >> > +} >> >> Perhaps this has already been discussed, but would it be better >> to have a "compat/linux/trace2_linux_process_info.c" >> or "compat/procfs/trace2_procfs_process_info.c" source file and >> only compile it in Linux-compatible builds -- rather than #ifdef'ing >> the source. This is a highly platform-specific feature. >> >> For example, if I convert the Win32 version to use your new event, >> I wouldn't want to move the code. >> >> I just noticed that you have both "BASIC_CFLAGS+=" and a "COMPAT_OBSJ+=" >> lines. If you made this source file procfs-specific, you wouldn't need >> the ifdef and you could avoid the new CFLAG. In general we've preferred not using ifdefs at all except for the small bits that absolutely need it. So e.g. in this case the whole code should compile on non-Linux, we just need a small boolean guard somewhere to check what the platform is. It means we don't have significant pieces of code that don't compile except on platform X. It's easy to get into your code not compiling if you overuse ifdefs.
On Wed, Jun 30, 2021 at 08:10:59AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>
> On Tue, Jun 29 2021, Emily Shaffer wrote:
>
> > On Mon, Jun 28, 2021 at 12:45:24PM -0400, Jeff Hostetler wrote:
> >> On 6/8/21 6:10 PM, Emily Shaffer wrote:
> >> > Range-diff against v4:
> >> > 1: efb0a3ccb4 ! 1: 7a7e1ebbfa tr2: log parent process name
> >> > @@ compat/procinfo.c (new)
> >> > + strbuf_addf(&procfs_path, "/proc/%d/comm", getppid());
> >> > + if (strbuf_read_file(&name, procfs_path.buf, 0)) {
> >> > + strbuf_release(&procfs_path);
> >> > ++ strbuf_trim_trailing_newline(&name);
> >> > + strvec_push(names, strbuf_detach(&name, NULL));
> >> > + }
> >> > +
> >>
> >> You're only getting the name of the command (argv[0]) and not the
> >> full command line, right? That is a good thing.
> >
> > Roughly. The name can be reset by the process itself (that's what
> > happened, I guess, in the tmux case I pasted below) but by default it's
> > argv[0]. It's also truncated to 15ch or something.
>
> 16 including the \0. See prctl(2). Linux has two different ways to
> set/get the name, one is the argv method, the other is
> prctl(PR_SET_NAME). They don't need to match at all. The ps(1) utility
> and some top-like utilities allow you to switch between viewing the two
> versions.
>
> As noted in the linked manual pages you'll also potentially need to deal
> with multithreaded programs having different names for each thread.
>
> I don't think we use this now, but FWIW one thing I've wanted to do for
> a while was to have the progress.c code update this, so you see if git's
> at N% counting objects or whatever in top.
>
> >> > +#ifdef HAVE_PROCFS_LINUX
> >> > + /*
> >> > + * NEEDSWORK: We could gather the entire pstree into an array to match
> >> > + * functionality with compat/win32/trace2_win32_process_info.c.
> >> > + * To do so, we may want to examine /proc/<pid>/stat. For now, just
> >> > + * gather the immediate parent name which is readily accessible from
> >> > + * /proc/$(getppid())/comm.
> >> > + */
> >> > + struct strbuf procfs_path = STRBUF_INIT;
> >> > + struct strbuf name = STRBUF_INIT;
> >> > +
> >> > + /* try to use procfs if it's present. */
> >> > + strbuf_addf(&procfs_path, "/proc/%d/comm", getppid());
> >> > + if (strbuf_read_file(&name, procfs_path.buf, 0)) {
> >> > + strbuf_release(&procfs_path);
> >> > + strbuf_trim_trailing_newline(&name);
> >> > + strvec_push(names, strbuf_detach(&name, NULL));
> >> > + }
> >> > +
> >> > + return;
> >> > +#endif
> >> > + /* NEEDSWORK: add non-procfs-linux implementations here */
> >> > +}
> >>
> >> Perhaps this has already been discussed, but would it be better
> >> to have a "compat/linux/trace2_linux_process_info.c"
> >> or "compat/procfs/trace2_procfs_process_info.c" source file and
> >> only compile it in Linux-compatible builds -- rather than #ifdef'ing
> >> the source. This is a highly platform-specific feature.
> >>
> >> For example, if I convert the Win32 version to use your new event,
> >> I wouldn't want to move the code.
> >>
> >> I just noticed that you have both "BASIC_CFLAGS+=" and a "COMPAT_OBSJ+="
> >> lines. If you made this source file procfs-specific, you wouldn't need
> >> the ifdef and you could avoid the new CFLAG.
>
>
> In general we've preferred not using ifdefs at all except for the small
> bits that absolutely need it.
>
> So e.g. in this case the whole code should compile on non-Linux, we just
> need a small boolean guard somewhere to check what the platform is.
>
> It means we don't have significant pieces of code that don't compile
> except on platform X. It's easy to get into your code not compiling if
> you overuse ifdefs.
Hmm. I see what you mean.
However, since win32 is already using some conditionally compiling code,
that's the approach I went for. It's still running CI to make sure I
didn't break Windows in the process, but I'll post it (and the Actions
runs) later today, assuming it passed.
Yes, the implementation I wrote would compile on any platform, but as I
understand it, other platforms may need drastically different
implementations which may not compile as easily as this. So, I'm not
sure if it's really appropriate to do run-time platform checks here
(which is what I think you were describing).
Anyway, I'll look forward to seeing what you folks think of the next
iteration (again, hopefully later today).
- Emily
Since v5, I reshuffled some of the platform-specific compilation recipe around per Jeff H's comments on v5. I think these are at odds with Ævar's comments, though, so I guess let's take a look and see how strongly we feel? :) Otherwise I also made the logic inversion Junio suggested to make the logic easier to follow in procinfo.c:trace2_collect_process_info. Thanks, - Emily Emily Shaffer (2): tr2: make process info collection platform-generic tr2: log parent process name Documentation/technical/api-trace2.txt | 14 +++++++ Makefile | 4 ++ compat/linux/procinfo.c | 55 ++++++++++++++++++++++++++ compat/stub/procinfo.c | 11 ++++++ config.mak.uname | 3 ++ t/t0210/scrub_normal.perl | 6 +++ t/t0211/scrub_perf.perl | 5 +++ t/t0212/parse_events.perl | 5 ++- trace2.c | 13 ++++++ trace2.h | 16 +++++--- trace2/tr2_tgt.h | 3 ++ trace2/tr2_tgt_event.c | 21 ++++++++++ trace2/tr2_tgt_normal.c | 19 +++++++++ trace2/tr2_tgt_perf.c | 16 ++++++++ 14 files changed, 184 insertions(+), 7 deletions(-) create mode 100644 compat/linux/procinfo.c create mode 100644 compat/stub/procinfo.c Range-diff against v5: -: ---------- > 1: 80084448e4 tr2: make process info collection platform-generic 1: 7a7e1ebbfa ! 2: 485f9a24f0 tr2: log parent process name @@ Documentation/technical/api-trace2.txt: about specific error arguments. This event contains the command name for this git process and the hierarchy of commands from parent git processes. - ## Makefile ## -@@ Makefile: ifneq ($(PROCFS_EXECUTABLE_PATH),) - BASIC_CFLAGS += '-DPROCFS_EXECUTABLE_PATH="$(procfs_executable_path_SQ)"' - endif - -+ifdef HAVE_PROCFS_LINUX -+ BASIC_CFLAGS += -DHAVE_PROCFS_LINUX -+ COMPAT_OBJS += compat/procinfo.o -+endif -+ - ifdef HAVE_NS_GET_EXECUTABLE_PATH - BASIC_CFLAGS += -DHAVE_NS_GET_EXECUTABLE_PATH - endif - - ## compat/procinfo.c (new) ## + ## compat/linux/procinfo.c (new) ## @@ +#include "cache.h" + @@ compat/procinfo.c (new) + +static void get_ancestry_names(struct strvec *names) +{ -+#ifdef HAVE_PROCFS_LINUX + /* + * NEEDSWORK: We could gather the entire pstree into an array to match + * functionality with compat/win32/trace2_win32_process_info.c. @@ compat/procinfo.c (new) + } + + return; -+#endif + /* NEEDSWORK: add non-procfs-linux implementations here */ +} + @@ compat/procinfo.c (new) + + get_ancestry_names(&names); + -+ if (names.nr == 0) { -+ strvec_clear(&names); -+ return; -+ } -+ -+ trace2_cmd_ancestry(names.v); -+ ++ if (names.nr) ++ trace2_cmd_ancestry(names.v); + strvec_clear(&names); + } + @@ config.mak.uname: ifeq ($(uname_S),Linux) FREAD_READS_DIRECTORIES = UnfortunatelyYes BASIC_CFLAGS += -DHAVE_SYSINFO PROCFS_EXECUTABLE_PATH = /proc/self/exe -+ HAVE_PROCFS_LINUX = YesPlease ++ HAVE_PLATFORM_PROCINFO = YesPlease ++ COMPAT_OBJS += compat/linux/procinfo.o endif ifeq ($(uname_S),GNU/kFreeBSD) HAVE_ALLOCA_H = YesPlease @@ trace2.h: void trace2_cmd_path_fl(const char *file, int line, const char *pathna /* * Emit a 'cmd_name' event with the canonical name of the command. * This gives post-processors a simple field to identify the command -@@ trace2.h: enum trace2_process_info_reason { - TRACE2_PROCESS_INFO_EXIT, - }; - --#if defined(GIT_WINDOWS_NATIVE) -+#if ( defined(GIT_WINDOWS_NATIVE) || defined(HAVE_PROCFS_LINUX) ) - void trace2_collect_process_info(enum trace2_process_info_reason reason); - #else - #define trace2_collect_process_info(reason) \ ## trace2/tr2_tgt.h ## @@ trace2/tr2_tgt.h: typedef void(tr2_tgt_evt_error_va_fl_t)(const char *file, int line, -- 2.32.0.402.g57bb445576-goog
To pave the way for non-Windows platforms to define trace2_collect_process_info(), reorganize the stub-or-definition schema to something which doesn't directly reference Windows. Platforms which want to collect parent process information in the future should: 1. Add an implementation to compat/ (e.g. compat/somearch/procinfo.c) 2. Add that object to COMPAT_OBJS to config.mak.uname (e.g. COMPAT_OBJS += compat/somearch/procinfo.o) 3. Define HAVE_PLATFORM_PROCINFO in config.mak.uname In the Windows case, this definition lives in compat/win32/trace2_win32_process_info.c, which is already conditionally added to COMPAT_OBJS; so let's add HAVE_PLATFORM_PROCINFO to hint to the build that compat/stub/procinfo.c should not be used. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Helped-by: Jonathan Nieder <jrnieder@gmail.com> --- Makefile | 4 ++++ compat/stub/procinfo.c | 11 +++++++++++ config.mak.uname | 1 + trace2.h | 6 ------ 4 files changed, 16 insertions(+), 6 deletions(-) create mode 100644 compat/stub/procinfo.c diff --git a/Makefile b/Makefile index 93664d6714..bd76689c79 100644 --- a/Makefile +++ b/Makefile @@ -1889,6 +1889,10 @@ ifneq ($(PROCFS_EXECUTABLE_PATH),) BASIC_CFLAGS += '-DPROCFS_EXECUTABLE_PATH="$(procfs_executable_path_SQ)"' endif +ifndef HAVE_PLATFORM_PROCINFO + COMPAT_OBJS += compat/stub/procinfo.o +endif + ifdef HAVE_NS_GET_EXECUTABLE_PATH BASIC_CFLAGS += -DHAVE_NS_GET_EXECUTABLE_PATH endif diff --git a/compat/stub/procinfo.c b/compat/stub/procinfo.c new file mode 100644 index 0000000000..12c0a23c9e --- /dev/null +++ b/compat/stub/procinfo.c @@ -0,0 +1,11 @@ +#include "git-compat-util.h" + +#include "trace2.h" + +/* + * Stub. See sample implementations in compat/linux/procinfo.c and + * compat/win32/trace2_win32_process_info.c. + */ +void trace2_collect_process_info(enum trace2_process_info_reason reason) +{ +} diff --git a/config.mak.uname b/config.mak.uname index cb443b4e02..185ff79b14 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -612,6 +612,7 @@ ifneq (,$(findstring MINGW,$(uname_S))) ETAGS_TARGET = ETAGS NO_POSIX_GOODIES = UnfortunatelyYes DEFAULT_HELP_FORMAT = html + HAVE_PLATFORM_PROCINFO = YesPlease BASIC_LDFLAGS += -municode COMPAT_CFLAGS += -DNOGDI -Icompat -Icompat/win32 COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\" diff --git a/trace2.h b/trace2.h index ede18c2e06..0d990db817 100644 --- a/trace2.h +++ b/trace2.h @@ -492,13 +492,7 @@ enum trace2_process_info_reason { TRACE2_PROCESS_INFO_EXIT, }; -#if defined(GIT_WINDOWS_NATIVE) void trace2_collect_process_info(enum trace2_process_info_reason reason); -#else -#define trace2_collect_process_info(reason) \ - do { \ - } while (0) -#endif const char *trace2_session_id(void); -- 2.32.0.402.g57bb445576-goog
It can be useful to tell who invoked Git - was it invoked manually by a user via CLI or script? By an IDE? In some cases - like 'repo' tool - we can influence the source code and set the GIT_TRACE2_PARENT_SID environment variable from the caller process. In 'repo''s case, that parent SID is manipulated to include the string "repo", which means we can positively identify when Git was invoked by 'repo' tool. However, identifying parents that way requires both that we know which tools invoke Git and that we have the ability to modify the source code of those tools. It cannot scale to keep up with the various IDEs and wrappers which use Git, most of which we don't know about. Learning which tools and wrappers invoke Git, and how, would give us insight to decide where to improve Git's usability and performance. Unfortunately, there's no cross-platform reliable way to gather the name of the parent process. If procfs is present, we can use that; otherwise we will need to discover the name another way. However, the process ID should be sufficient to look up the process name on most platforms, so that code may be shareable. Git for Windows gathers similar information and logs it as a "data_json" event. However, since "data_json" has a variable format, it is difficult to parse effectively in some languages; instead, let's pursue a dedicated "cmd_ancestry" event to record information about the ancestry of the current process and a consistent, parseable way. Git for Windows also gathers information about more than one generation of parent. In Linux further ancestry info can be gathered with procfs, but it's unwieldy to do so. In the interest of later moving Git for Windows ancestry logging to the 'cmd_ancestry' event, and in the interest of later adding more ancestry to the Linux implementation - or of adding this functionality to other platforms which have an easier time walking the process tree - let's make 'cmd_ancestry' accept an array of parentage. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> --- Notes: Since v3: Junio and Randall suggested ditching "get_process_name(ppid)" as the API to implement on various platforms, and I liked the suggestion a lot. So instead, I added "get_ancestry_names(strvec *names)". - Using a strvec instead of a char** makes cleanup easier, since strvec takes care of counting the number of strings in the array for us. Otherwise we'd need to include size as a return or out-param in get_ancestry_names(), and that's what the util class is for, right? :) - I made get_ancestry_names() static instead of putting it into git-compat-util.h. I think I had put get_process_name() into that header to facilitate non-procfs implementations, but compat/procinfo.c doesn't seem to me to indicate "procfs only", and if we do need to implement get_process_name() somewhere else, it'll be pretty easy to move it. - I added a description of "cmd_ancestry" to Documentation/technical/api-trace2.txt. I didn't see any user-facing docs to update (for example, "git grep cmd_path" produces only that one doc file). Thanks, all. - Emily Documentation/technical/api-trace2.txt | 14 +++++++ compat/linux/procinfo.c | 55 ++++++++++++++++++++++++++ config.mak.uname | 2 + t/t0210/scrub_normal.perl | 6 +++ t/t0211/scrub_perf.perl | 5 +++ t/t0212/parse_events.perl | 5 ++- trace2.c | 13 ++++++ trace2.h | 10 +++++ trace2/tr2_tgt.h | 3 ++ trace2/tr2_tgt_event.c | 21 ++++++++++ trace2/tr2_tgt_normal.c | 19 +++++++++ trace2/tr2_tgt_perf.c | 16 ++++++++ 12 files changed, 168 insertions(+), 1 deletion(-) create mode 100644 compat/linux/procinfo.c diff --git a/Documentation/technical/api-trace2.txt b/Documentation/technical/api-trace2.txt index 3f52f981a2..8a0b360a0e 100644 --- a/Documentation/technical/api-trace2.txt +++ b/Documentation/technical/api-trace2.txt @@ -493,6 +493,20 @@ about specific error arguments. } ------------ +`"cmd_ancestry"`:: + This event contains the text command name for the parent (and earlier + generations of parents) of the current process, in an array ordered from + nearest parent to furthest great-grandparent. It may not be implemented + on all platforms. ++ +------------ +{ + "event":"cmd_ancestry", + ... + "ancestry":["bash","tmux: server","systemd"] +} +------------ + `"cmd_name"`:: This event contains the command name for this git process and the hierarchy of commands from parent git processes. diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c new file mode 100644 index 0000000000..578fed4cd3 --- /dev/null +++ b/compat/linux/procinfo.c @@ -0,0 +1,55 @@ +#include "cache.h" + +#include "strbuf.h" +#include "strvec.h" +#include "trace2.h" + +static void get_ancestry_names(struct strvec *names) +{ + /* + * NEEDSWORK: We could gather the entire pstree into an array to match + * functionality with compat/win32/trace2_win32_process_info.c. + * To do so, we may want to examine /proc/<pid>/stat. For now, just + * gather the immediate parent name which is readily accessible from + * /proc/$(getppid())/comm. + */ + struct strbuf procfs_path = STRBUF_INIT; + struct strbuf name = STRBUF_INIT; + + /* try to use procfs if it's present. */ + strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); + if (strbuf_read_file(&name, procfs_path.buf, 0)) { + strbuf_release(&procfs_path); + strbuf_trim_trailing_newline(&name); + strvec_push(names, strbuf_detach(&name, NULL)); + } + + return; + /* NEEDSWORK: add non-procfs-linux implementations here */ +} + +void trace2_collect_process_info(enum trace2_process_info_reason reason) +{ + if (!trace2_is_enabled()) + return; + + /* someday we may want to write something extra here, but not today */ + if (reason == TRACE2_PROCESS_INFO_EXIT) + return; + + if (reason == TRACE2_PROCESS_INFO_STARTUP) { + /* + * NEEDSWORK: we could do the entire ptree in an array instead, + * see compat/win32/trace2_win32_process_info.c. + */ + struct strvec names = STRVEC_INIT; + + get_ancestry_names(&names); + + if (names.nr) + trace2_cmd_ancestry(names.v); + strvec_clear(&names); + } + + return; +} diff --git a/config.mak.uname b/config.mak.uname index 185ff79b14..d3bd4c6843 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -58,6 +58,8 @@ ifeq ($(uname_S),Linux) FREAD_READS_DIRECTORIES = UnfortunatelyYes BASIC_CFLAGS += -DHAVE_SYSINFO PROCFS_EXECUTABLE_PATH = /proc/self/exe + HAVE_PLATFORM_PROCINFO = YesPlease + COMPAT_OBJS += compat/linux/procinfo.o endif ifeq ($(uname_S),GNU/kFreeBSD) HAVE_ALLOCA_H = YesPlease diff --git a/t/t0210/scrub_normal.perl b/t/t0210/scrub_normal.perl index c65d1a815e..7cc4de392a 100644 --- a/t/t0210/scrub_normal.perl +++ b/t/t0210/scrub_normal.perl @@ -42,6 +42,12 @@ # so just omit it for testing purposes. # print "cmd_path _EXE_\n"; } + elsif ($line =~ m/^cmd_ancestry/) { + # 'cmd_ancestry' is not implemented everywhere, so for portability's + # sake, skip it when parsing normal. + # + # print "$line"; + } else { print "$line"; } diff --git a/t/t0211/scrub_perf.perl b/t/t0211/scrub_perf.perl index 351af7844e..d164b750ff 100644 --- a/t/t0211/scrub_perf.perl +++ b/t/t0211/scrub_perf.perl @@ -44,6 +44,11 @@ # $tokens[$col_rest] = "_EXE_"; goto SKIP_LINE; } + elsif ($tokens[$col_event] =~ m/cmd_ancestry/) { + # 'cmd_ancestry' is platform-specific and not implemented everywhere, + # so skip it. + goto SKIP_LINE; + } elsif ($tokens[$col_event] =~ m/child_exit/) { $tokens[$col_rest] =~ s/ pid:\d* / pid:_PID_ /; } diff --git a/t/t0212/parse_events.perl b/t/t0212/parse_events.perl index 6584bb5634..b6408560c0 100644 --- a/t/t0212/parse_events.perl +++ b/t/t0212/parse_events.perl @@ -132,7 +132,10 @@ # just omit it for testing purposes. # $processes->{$sid}->{'path'} = "_EXE_"; } - + elsif ($event eq 'cmd_ancestry') { + # 'cmd_ancestry' is platform-specific and not implemented everywhere, so + # just skip it for testing purposes. + } elsif ($event eq 'cmd_name') { $processes->{$sid}->{'name'} = $line->{'name'}; $processes->{$sid}->{'hierarchy'} = $line->{'hierarchy'}; diff --git a/trace2.c b/trace2.c index 256120c7fd..b9b154ac44 100644 --- a/trace2.c +++ b/trace2.c @@ -260,6 +260,19 @@ void trace2_cmd_path_fl(const char *file, int line, const char *pathname) tgt_j->pfn_command_path_fl(file, line, pathname); } +void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names) +{ + struct tr2_tgt *tgt_j; + int j; + + if (!trace2_enabled) + return; + + for_each_wanted_builtin (j, tgt_j) + if (tgt_j->pfn_command_ancestry_fl) + tgt_j->pfn_command_ancestry_fl(file, line, parent_names); +} + void trace2_cmd_name_fl(const char *file, int line, const char *name) { struct tr2_tgt *tgt_j; diff --git a/trace2.h b/trace2.h index 0d990db817..9b7286c572 100644 --- a/trace2.h +++ b/trace2.h @@ -133,6 +133,16 @@ void trace2_cmd_path_fl(const char *file, int line, const char *pathname); #define trace2_cmd_path(p) trace2_cmd_path_fl(__FILE__, __LINE__, (p)) +/* + * Emit an 'ancestry' event with the process name of the current process's + * parent process. + * This gives post-processors a way to determine what invoked the command and + * learn more about usage patterns. + */ +void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names); + +#define trace2_cmd_ancestry(v) trace2_cmd_ancestry_fl(__FILE__, __LINE__, (v)) + /* * Emit a 'cmd_name' event with the canonical name of the command. * This gives post-processors a simple field to identify the command diff --git a/trace2/tr2_tgt.h b/trace2/tr2_tgt.h index 7b90469212..1f66fd6573 100644 --- a/trace2/tr2_tgt.h +++ b/trace2/tr2_tgt.h @@ -27,6 +27,8 @@ typedef void(tr2_tgt_evt_error_va_fl_t)(const char *file, int line, typedef void(tr2_tgt_evt_command_path_fl_t)(const char *file, int line, const char *command_path); +typedef void(tr2_tgt_evt_command_ancestry_fl_t)(const char *file, int line, + const char **parent_names); typedef void(tr2_tgt_evt_command_name_fl_t)(const char *file, int line, const char *name, const char *hierarchy); @@ -108,6 +110,7 @@ struct tr2_tgt { tr2_tgt_evt_atexit_t *pfn_atexit; tr2_tgt_evt_error_va_fl_t *pfn_error_va_fl; tr2_tgt_evt_command_path_fl_t *pfn_command_path_fl; + tr2_tgt_evt_command_ancestry_fl_t *pfn_command_ancestry_fl; tr2_tgt_evt_command_name_fl_t *pfn_command_name_fl; tr2_tgt_evt_command_mode_fl_t *pfn_command_mode_fl; tr2_tgt_evt_alias_fl_t *pfn_alias_fl; diff --git a/trace2/tr2_tgt_event.c b/trace2/tr2_tgt_event.c index 6353e8ad91..578a9a5287 100644 --- a/trace2/tr2_tgt_event.c +++ b/trace2/tr2_tgt_event.c @@ -261,6 +261,26 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) jw_release(&jw); } +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) +{ + const char *event_name = "cmd_ancestry"; + const char *parent_name = NULL; + struct json_writer jw = JSON_WRITER_INIT; + + jw_object_begin(&jw, 0); + event_fmt_prepare(event_name, file, line, NULL, &jw); + jw_object_inline_begin_array(&jw, "ancestry"); + + while ((parent_name = *parent_names++)) + jw_array_string(&jw, parent_name); + + jw_end(&jw); /* 'ancestry' array */ + jw_end(&jw); /* event object */ + + tr2_dst_write_line(&tr2dst_event, &jw.json); + jw_release(&jw); +} + static void fn_command_name_fl(const char *file, int line, const char *name, const char *hierarchy) { @@ -584,6 +604,7 @@ struct tr2_tgt tr2_tgt_event = { fn_atexit, fn_error_va_fl, fn_command_path_fl, + fn_command_ancestry_fl, fn_command_name_fl, fn_command_mode_fl, fn_alias_fl, diff --git a/trace2/tr2_tgt_normal.c b/trace2/tr2_tgt_normal.c index 31b602c171..a5751c8864 100644 --- a/trace2/tr2_tgt_normal.c +++ b/trace2/tr2_tgt_normal.c @@ -160,6 +160,24 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) strbuf_release(&buf_payload); } +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) +{ + const char *parent_name = NULL; + struct strbuf buf_payload = STRBUF_INIT; + + /* cmd_ancestry parent <- grandparent <- great-grandparent */ + strbuf_addstr(&buf_payload, "cmd_ancestry "); + while ((parent_name = *parent_names++)) { + strbuf_addstr(&buf_payload, parent_name); + /* if we'll write another one after this, add a delimiter */ + if (parent_names && *parent_names) + strbuf_addstr(&buf_payload, " <- "); + } + + normal_io_write_fl(file, line, &buf_payload); + strbuf_release(&buf_payload); +} + static void fn_command_name_fl(const char *file, int line, const char *name, const char *hierarchy) { @@ -306,6 +324,7 @@ struct tr2_tgt tr2_tgt_normal = { fn_atexit, fn_error_va_fl, fn_command_path_fl, + fn_command_ancestry_fl, fn_command_name_fl, fn_command_mode_fl, fn_alias_fl, diff --git a/trace2/tr2_tgt_perf.c b/trace2/tr2_tgt_perf.c index a8018f18cc..af4d65a0a5 100644 --- a/trace2/tr2_tgt_perf.c +++ b/trace2/tr2_tgt_perf.c @@ -253,6 +253,21 @@ static void fn_command_path_fl(const char *file, int line, const char *pathname) strbuf_release(&buf_payload); } +static void fn_command_ancestry_fl(const char *file, int line, const char **parent_names) +{ + const char *event_name = "cmd_ancestry"; + struct strbuf buf_payload = STRBUF_INIT; + + strbuf_addstr(&buf_payload, "ancestry:["); + /* It's not an argv but the rules are basically the same. */ + sq_append_quote_argv_pretty(&buf_payload, parent_names); + strbuf_addch(&buf_payload, ']'); + + perf_io_write_fl(file, line, event_name, NULL, NULL, NULL, NULL, + &buf_payload); + strbuf_release(&buf_payload); +} + static void fn_command_name_fl(const char *file, int line, const char *name, const char *hierarchy) { @@ -532,6 +547,7 @@ struct tr2_tgt tr2_tgt_perf = { fn_atexit, fn_error_va_fl, fn_command_path_fl, + fn_command_ancestry_fl, fn_command_name_fl, fn_command_mode_fl, fn_alias_fl, -- 2.32.0.402.g57bb445576-goog
On 7/21/21 9:27 PM, Emily Shaffer wrote:
> Since v5, I reshuffled some of the platform-specific compilation recipe
> around per Jeff H's comments on v5. I think these are at odds with
> Ævar's comments, though, so I guess let's take a look and see how
> strongly we feel? :)
>
> Otherwise I also made the logic inversion Junio suggested to make the
> logic easier to follow in procinfo.c:trace2_collect_process_info.
>
> Thanks,
> - Emily
...
LGTM
Thanks,
Jeff
Emily Shaffer <emilyshaffer@google.com> writes: > Unfortunately, there's no cross-platform reliable way to gather the name > of the parent process. If procfs is present, we can use that; otherwise > we will need to discover the name another way. However, the process ID > should be sufficient to look up the process name on most platforms, so > that code may be shareable. Is the sentence that begin with "However" still relevant? The latest get_ancestry_names() does not add anything when the result read from the procfs is unusable, but the sentence gives a false impression that it may somehow fall back to show the PID. > Git for Windows also gathers information about more than one generation > of parent. In Linux further ancestry info can be gathered with procfs, > but it's unwieldy to do so. In the interest of later moving Git for > Windows ancestry logging to the 'cmd_ancestry' event, and in the > interest of later adding more ancestry to the Linux implementation - or > of adding this functionality to other platforms which have an easier > time walking the process tree - let's make 'cmd_ancestry' accept an > array of parentage. Clearly written. Nice. > +`"cmd_ancestry"`:: > + This event contains the text command name for the parent (and earlier > + generations of parents) of the current process, in an array ordered from > + nearest parent to furthest great-grandparent. It may not be implemented > + on all platforms. > ++ > +------------ > +{ > + "event":"cmd_ancestry", > + ... > + "ancestry":["bash","tmux: server","systemd"] > +} > +------------ OK. > diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c > new file mode 100644 > index 0000000000..578fed4cd3 > --- /dev/null > +++ b/compat/linux/procinfo.c > @@ -0,0 +1,55 @@ > +#include "cache.h" > + > +#include "strbuf.h" > +#include "strvec.h" > +#include "trace2.h" > + > +static void get_ancestry_names(struct strvec *names) > +{ > + /* > + * NEEDSWORK: We could gather the entire pstree into an array to match > + * functionality with compat/win32/trace2_win32_process_info.c. > + * To do so, we may want to examine /proc/<pid>/stat. For now, just > + * gather the immediate parent name which is readily accessible from > + * /proc/$(getppid())/comm. > + */ > + struct strbuf procfs_path = STRBUF_INIT; > + struct strbuf name = STRBUF_INIT; > + > + /* try to use procfs if it's present. */ > + strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); > + if (strbuf_read_file(&name, procfs_path.buf, 0)) { > + strbuf_release(&procfs_path); > + strbuf_trim_trailing_newline(&name); > + strvec_push(names, strbuf_detach(&name, NULL)); At this point, we successfully read from /proc/$(ppid)/comm and pushed the result into names strvec. I think you would want to have an explicit "return;" here. Alternatively, you could put the rest of the function in the corresponding "else" clause, but I think an early return after each method successfully collects the result would make a lot more sense. > + } > + > + return; > + /* NEEDSWORK: add non-procfs-linux implementations here */ This looks the other way around. A future non-procfs-linux implementation written here will be unreachable code ;-) Just lose "return;" from here, have one in the if(){} body, and keep the "here is where you add more/other implementations" comment and we'd be OK, I think. > +void trace2_collect_process_info(enum trace2_process_info_reason reason) > +{ > + if (!trace2_is_enabled()) > + return; > + > + /* someday we may want to write something extra here, but not today */ > + if (reason == TRACE2_PROCESS_INFO_EXIT) > + return; > + > + if (reason == TRACE2_PROCESS_INFO_STARTUP) { > + /* > + * NEEDSWORK: we could do the entire ptree in an array instead, > + * see compat/win32/trace2_win32_process_info.c. > + */ > + struct strvec names = STRVEC_INIT; > + > + get_ancestry_names(&names); > + > + if (names.nr) > + trace2_cmd_ancestry(names.v); > + strvec_clear(&names); > + } > + > + return; > +} Good. Thanks.
On Wed, Jul 21 2021, Emily Shaffer wrote:
> +/*
> + * Stub. See sample implementations in compat/linux/procinfo.c and
> + * compat/win32/trace2_win32_process_info.c.
> + */
Both of those don't exist until 2/2, so that comment should be squashed
into that change shouldn't it?
On Wed, Jul 21 2021, Emily Shaffer wrote: > +`"cmd_ancestry"`:: > + This event contains the text command name for the parent (and earlier > + generations of parents) of the current process, in an array ordered from > + nearest parent to furthest great-grandparent. It may not be implemented > + on all platforms. > ++ > +------------ > +{ > + "event":"cmd_ancestry", > + ... > + "ancestry":["bash","tmux: server","systemd"] > +} > +------------ > + Okey, so because of later NEEDSWORK no system that runs systemd will currrently have output like this, just Windows. > + /* > + * NEEDSWORK: We could gather the entire pstree into an array to match > + * functionality with compat/win32/trace2_win32_process_info.c. > + * To do so, we may want to examine /proc/<pid>/stat. For now, just > + * gather the immediate parent name which is readily accessible from > + * /proc/$(getppid())/comm. > + */ This comment: > + struct strbuf procfs_path = STRBUF_INIT; > + struct strbuf name = STRBUF_INIT; > + > + /* try to use procfs if it's present. */ > + strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); > + if (strbuf_read_file(&name, procfs_path.buf, 0)) { > + strbuf_release(&procfs_path); > + strbuf_trim_trailing_newline(&name); > + strvec_push(names, strbuf_detach(&name, NULL)); > + } > + > + return; > + /* NEEDSWORK: add non-procfs-linux implementations here */ > +} > + > +void trace2_collect_process_info(enum trace2_process_info_reason reason) > +{ > + if (!trace2_is_enabled()) > + return; > + > + /* someday we may want to write something extra here, but not today */ > + if (reason == TRACE2_PROCESS_INFO_EXIT) > + return; > + > + if (reason == TRACE2_PROCESS_INFO_STARTUP) { This should be a switch/case, so we get the compiler asserting/warning if we don't check enum arms, in this case there's just these two, let's have that be clear for readability. > + /* > + * NEEDSWORK: we could do the entire ptree in an array instead, > + * see compat/win32/trace2_win32_process_info.c. > + */ Seems to remove the need for this comment, i.e. we comment on this limitation of linux-specific get_ancestry_names twice, both in the function and its caller. > - > + elsif ($event eq 'cmd_ancestry') { > + # 'cmd_ancestry' is platform-specific and not implemented everywhere, so > + # just skip it for testing purposes. > + } The rest of this code uses two "\n" between elsif arms, let's be consistent here.
On Wed, Jul 21 2021, Emily Shaffer wrote: > Git for Windows also gathers information about more than one generation > of parent. In Linux further ancestry info can be gathered with procfs, > but it's unwieldy to do so. Having read the win32 get_processes() implementation and read proc(5) I don't get how it's unweildy to do so on Linux? Perhaps I'm missing some special-case but this rather simple patch-on-top seems to do the job for me. This includes the unrelated enum/switch/case change I suggested. I can submit it as a patch-on-top with SOB etc, but maybe there's some subtle reason it won't work properly. It works for me, I get e.g.: { "event": "cmd_ancestry", "sid": "20210802T102731.879424Z-Hc2f5b994-P00001acc", "thread": "main", "time": "2021-08-02T10:27:31.879618Z", "file": "compat/linux/procinfo.c", "line": 66, "ancestry": [ "bash", "screen", "systemd" ] } If anything the existing Windows version seems a bit unweildy, having to apparently deal with cycles etc., I don't think that happens under procfs, but maybe I'm missing some special-case. diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c index 578fed4cd31..671fe2395fd 100644 --- a/compat/linux/procinfo.c +++ b/compat/linux/procinfo.c @@ -4,51 +4,65 @@ #include "strvec.h" #include "trace2.h" -static void get_ancestry_names(struct strvec *names) +static int stat_parent_pid(FILE *fp, char *statcomm, int *statppid) +{ + char statstate; + int statpid; + + int ret = fscanf(fp, "%d %s %c %d", &statpid, statcomm, &statstate, + statppid); + if (ret != 4) + return -1; + return 0; +} + +static void push_ancestry_name(struct strvec *names, pid_t pid) { - /* - * NEEDSWORK: We could gather the entire pstree into an array to match - * functionality with compat/win32/trace2_win32_process_info.c. - * To do so, we may want to examine /proc/<pid>/stat. For now, just - * gather the immediate parent name which is readily accessible from - * /proc/$(getppid())/comm. - */ struct strbuf procfs_path = STRBUF_INIT; - struct strbuf name = STRBUF_INIT; + char statcomm[PATH_MAX]; + FILE *fp; + int ppid; /* try to use procfs if it's present. */ - strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); - if (strbuf_read_file(&name, procfs_path.buf, 0)) { - strbuf_release(&procfs_path); - strbuf_trim_trailing_newline(&name); - strvec_push(names, strbuf_detach(&name, NULL)); - } + strbuf_addf(&procfs_path, "/proc/%d/stat", pid); + fp = fopen(procfs_path.buf, "r"); + if (!fp) + return; - return; - /* NEEDSWORK: add non-procfs-linux implementations here */ + if (stat_parent_pid(fp, statcomm, &ppid) < 0) + return; + + /* + * The comm field is in parenthesis, use printf + offset as a + * poor man's trimming of both ends. + */ + strvec_pushf(names, "%.*s", (int)strlen(statcomm) - 2, statcomm + 1); + + /* + * Both errors and reaching the end of the process chain are + * reported as fields of 0 by proc(5) + */ + if (ppid != 0) + push_ancestry_name(names, ppid); } void trace2_collect_process_info(enum trace2_process_info_reason reason) { - if (!trace2_is_enabled()) - return; + struct strvec names = STRVEC_INIT; - /* someday we may want to write something extra here, but not today */ - if (reason == TRACE2_PROCESS_INFO_EXIT) + if (!trace2_is_enabled()) return; - if (reason == TRACE2_PROCESS_INFO_STARTUP) { - /* - * NEEDSWORK: we could do the entire ptree in an array instead, - * see compat/win32/trace2_win32_process_info.c. - */ - struct strvec names = STRVEC_INIT; - - get_ancestry_names(&names); + switch (reason) { + case TRACE2_PROCESS_INFO_EXIT: + break; + case TRACE2_PROCESS_INFO_STARTUP: + push_ancestry_name(&names, getppid()); if (names.nr) trace2_cmd_ancestry(names.v); strvec_clear(&names); + break; } return;
On Wed, Jul 21 2021, Emily Shaffer wrote:
> compat/linux/procinfo.c | 55 ++++++++++++++++++++++++++
> [...]
> + /* NEEDSWORK: add non-procfs-linux implementations here */
We'd want non-Windows and non-Linux implementations of this, but that
NEEDSWORK comment should not be in compat/linux/procinfo.c, where we're
never going to add non-Linux code.
On Mon, Aug 02 2021, Ævar Arnfjörð Bjarmason wrote: > On Wed, Jul 21 2021, Emily Shaffer wrote: > >> +`"cmd_ancestry"`:: >> + This event contains the text command name for the parent (and earlier >> + generations of parents) of the current process, in an array ordered from >> + nearest parent to furthest great-grandparent. It may not be implemented >> + on all platforms. >> ++ >> +------------ >> +{ >> + "event":"cmd_ancestry", >> + ... >> + "ancestry":["bash","tmux: server","systemd"] >> +} >> +------------ >> + > > Okey, so because of later NEEDSWORK no system that runs systemd will > currrently have output like this, just Windows. > >> + /* >> + * NEEDSWORK: We could gather the entire pstree into an array to match >> + * functionality with compat/win32/trace2_win32_process_info.c. >> + * To do so, we may want to examine /proc/<pid>/stat. For now, just >> + * gather the immediate parent name which is readily accessible from >> + * /proc/$(getppid())/comm. >> + */ > > This comment: To clarify, this was ment to be continued with "seems to remove the need..." below. But I later inserted the commentery about switch/case in-between, which made that unclear. > [...] >> [...] >> + /* >> + * NEEDSWORK: we could do the entire ptree in an array instead, >> + * see compat/win32/trace2_win32_process_info.c. >> + */ > > Seems to remove the need for this comment, i.e. we comment on this > limitation of linux-specific get_ancestry_names twice, both in the > function and its caller.
On Mon, Aug 02 2021, Ævar Arnfjörð Bjarmason wrote: > On Wed, Jul 21 2021, Emily Shaffer wrote: > >> Git for Windows also gathers information about more than one generation >> of parent. In Linux further ancestry info can be gathered with procfs, >> but it's unwieldy to do so. > > Having read the win32 get_processes() implementation and read proc(5) I > don't get how it's unweildy to do so on Linux? Perhaps I'm missing some > special-case but this rather simple patch-on-top seems to do the job for > me. This includes the unrelated enum/switch/case change I suggested. > > I can submit it as a patch-on-top with SOB etc, but maybe there's some > subtle reason it won't work properly. It works for me, I get e.g.: Just a note[...]: > + strbuf_addf(&procfs_path, "/proc/%d/stat", pid); > + fp = fopen(procfs_path.buf, "r"); > + if (!fp) > + return; I think this is mostly ready, but is still clearly in POC state needing polishing, e.g. I didn't bother to close "fp" here, but a final patch to implement this approach should of course do that.
On 8/2/21 6:22 AM, Ævar Arnfjörð Bjarmason wrote:
...
>
> If anything the existing Windows version seems a bit unweildy, having to
> apparently deal with cycles etc., I don't think that happens under
> procfs, but maybe I'm missing some special-case.
>
It's been a while since I did the parent lookup on Windows, but...
Yes, on Windows we observed cases where a PID would be recycled
and a child process could have the same PID as an ancient ancestor
that had long since exited.
I don't think the cycle problem exists on Unix because IIRC a
child process' parent-PID is set to Init when the parent exits.
Windows does not update the child's parent-PID when the parent
exits, so if a recently-used PID is given to one of our child
processes, we can get a loop.
Jeff
On August 2, 2021 6:22 AM: Ævar Arnfjörð Bjarmason wroteL
>On Wed, Jul 21 2021, Emily Shaffer wrote:
>
>> Git for Windows also gathers information about more than one
>> generation of parent. In Linux further ancestry info can be gathered
>> with procfs, but it's unwieldy to do so.
>
>Having read the win32 get_processes() implementation and read proc(5) I don't get how it's unweildy to do so on Linux? Perhaps I'm
>missing some special-case but this rather simple patch-on-top seems to do the job for me. This includes the unrelated enum/switch/case
>change I suggested.
>
>I can submit it as a patch-on-top with SOB etc, but maybe there's some subtle reason it won't work properly. It works for me, I get e.g.:
>
> {
> "event": "cmd_ancestry",
> "sid": "20210802T102731.879424Z-Hc2f5b994-P00001acc",
> "thread": "main",
> "time": "2021-08-02T10:27:31.879618Z",
> "file": "compat/linux/procinfo.c",
> "line": 66,
> "ancestry": [
> "bash",
> "screen",
> "systemd"
> ]
> }
Should not the subfields of "ancestry" also have field names? I get that they are a list, but it seems a bit restrictive.
My preference here would be:
"ancestry": [
"ancestor": [
"program": "bash",
"pid" : 1234 ],
"ancestor": [
"program": "screen"],
"ancestor": [
"program" : "systemd"],
]
With more richness available in the ancestor.
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> On Wed, Jul 21 2021, Emily Shaffer wrote:
>
>> compat/linux/procinfo.c | 55 ++++++++++++++++++++++++++
>> [...]
>> + /* NEEDSWORK: add non-procfs-linux implementations here */
>
> We'd want non-Windows and non-Linux implementations of this, but that
> NEEDSWORK comment should not be in compat/linux/procinfo.c, where we're
> never going to add non-Linux code.
I am puzzled. This is talking about additional implementation for
Linux that does not use procfs, no (i.e. what to do with builds of
Linux that compile out the procfs support or installations that do
not mount the /proc hierarchy)?
The comment seems to be at the right place to remind us of them,
even though I do not know how important such an environment is.
On Mon, Aug 02 2021, Randall S. Becker wrote:
> On August 2, 2021 6:22 AM: Ævar Arnfjörð Bjarmason wroteL
>>On Wed, Jul 21 2021, Emily Shaffer wrote:
>>
>>> Git for Windows also gathers information about more than one
>>> generation of parent. In Linux further ancestry info can be gathered
>>> with procfs, but it's unwieldy to do so.
>>
>>Having read the win32 get_processes() implementation and read proc(5) I don't get how it's unweildy to do so on Linux? Perhaps I'm
>>missing some special-case but this rather simple patch-on-top seems to do the job for me. This includes the unrelated enum/switch/case
>>change I suggested.
>>
>>I can submit it as a patch-on-top with SOB etc, but maybe there's some subtle reason it won't work properly. It works for me, I get e.g.:
>>
>> {
>> "event": "cmd_ancestry",
>> "sid": "20210802T102731.879424Z-Hc2f5b994-P00001acc",
>> "thread": "main",
>> "time": "2021-08-02T10:27:31.879618Z",
>> "file": "compat/linux/procinfo.c",
>> "line": 66,
>> "ancestry": [
>> "bash",
>> "screen",
>> "systemd"
>> ]
>> }
>
> Should not the subfields of "ancestry" also have field names? I get that they are a list, but it seems a bit restrictive.
>
> My preference here would be:
>
> "ancestry": [
> "ancestor": [
> "program": "bash",
> "pid" : 1234 ],
> "ancestor": [
> "program": "screen"],
> "ancestor": [
> "program" : "systemd"],
> ]
>
> With more richness available in the ancestor.
That sounds sensible, but to be clear that's a relevant comment on
Emily's original patch, my "let's implement the same for Linux" is just
faithfully reproducing what we're already doing in the Windows
implementation.
But yes, I'd think that including the PID would be a sensible thing to
do...
On Mon, Aug 02 2021, Junio C Hamano wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Wed, Jul 21 2021, Emily Shaffer wrote:
>>
>>> compat/linux/procinfo.c | 55 ++++++++++++++++++++++++++
>>> [...]
>>> + /* NEEDSWORK: add non-procfs-linux implementations here */
>>
>> We'd want non-Windows and non-Linux implementations of this, but that
>> NEEDSWORK comment should not be in compat/linux/procinfo.c, where we're
>> never going to add non-Linux code.
>
> I am puzzled. This is talking about additional implementation for
> Linux that does not use procfs, no (i.e. what to do with builds of
> Linux that compile out the procfs support or installations that do
> not mount the /proc hierarchy)?
>
> The comment seems to be at the right place to remind us of them,
> even though I do not know how important such an environment is.
Yes, I see I misread that, nevermind the narrow suggestion then.
Is there a way to do this sort of thing on Linux without procfs? Other
than things that use procfs themselves, e.g. parsing "ps" output.
The just-landed 6f64eeab605 (Merge branch 'es/trace2-log-parent-process-name', 2021-08-24) added parent process name logging, but under Linux we'd only log the immediate parent, and the full process chain on Windows. This brings the Linux implementation in parity with the Windows implementation. As it turns out /proc/<PID>/stat is a bit of a pain to parse. This is preceded by some minor memory leak fixes to es/trace2-log-parent-process-name, and the fixing of a bug where we'd log the empty string as a parent if we didn't have procfs. Ævar Arnfjörð Bjarmason (6): tr2: remove NEEDSWORK comment for "non-procfs" implementations tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux tr2: stop leaking "thread_name" memory tr2: fix memory leak & logic error in 2f732bf15e6 tr2: do compiler enum check in trace2_collect_process_info() tr2: log N parent process names on Linux compat/linux/procinfo.c | 151 +++++++++++++++++++++++++++++++++------- trace2/tr2_tls.c | 1 + 2 files changed, 128 insertions(+), 24 deletions(-) -- 2.33.0.733.ga72a4f1c2e1
I'm fairly sure that there is no way on Linux to inspect the process tree without using procfs, any tool such as ps(1), top(1) etc. that shows this sort of information ultimately looks the information up in procfs. So let's remove this comment added in 2f732bf15e6 (tr2: log parent process name, 2021-07-21), it's setting us up for an impossible task. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- compat/linux/procinfo.c | 1 - 1 file changed, 1 deletion(-) diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c index 578fed4cd31..15a89676c7a 100644 --- a/compat/linux/procinfo.c +++ b/compat/linux/procinfo.c @@ -25,7 +25,6 @@ static void get_ancestry_names(struct strvec *names) } return; - /* NEEDSWORK: add non-procfs-linux implementations here */ } void trace2_collect_process_info(enum trace2_process_info_reason reason) -- 2.33.0.733.ga72a4f1c2e1
Rewrite a comment added in 2f732bf15e6 (tr2: log parent process name, 2021-07-21) to describe what we might do under TRACE2_PROCESS_INFO_EXIT in the future, instead of vaguely referring to "something extra". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- compat/linux/procinfo.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c index 15a89676c7a..62f8aaed4cc 100644 --- a/compat/linux/procinfo.c +++ b/compat/linux/procinfo.c @@ -32,8 +32,12 @@ void trace2_collect_process_info(enum trace2_process_info_reason reason) if (!trace2_is_enabled()) return; - /* someday we may want to write something extra here, but not today */ if (reason == TRACE2_PROCESS_INFO_EXIT) + /* + * The Windows version of this calls its + * get_peak_memory_info() here. We may want to insert + * similar process-end statistics here in the future. + */ return; if (reason == TRACE2_PROCESS_INFO_STARTUP) { -- 2.33.0.733.ga72a4f1c2e1
Fix a memory leak introduced in ee4512ed481 (trace2: create new combined trace facility, 2019-02-22), we were doing a free() of other memory allocated in tr2tls_create_self(), but not the "thread_name" "strbuf strbuf". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- trace2/tr2_tls.c | 1 + 1 file changed, 1 insertion(+) diff --git a/trace2/tr2_tls.c b/trace2/tr2_tls.c index 067c23755fb..7da94aba522 100644 --- a/trace2/tr2_tls.c +++ b/trace2/tr2_tls.c @@ -95,6 +95,7 @@ void tr2tls_unset_self(void) pthread_setspecific(tr2tls_key, NULL); + strbuf_release(&ctx->thread_name); free(ctx->array_us_start); free(ctx); } -- 2.33.0.733.ga72a4f1c2e1
In a subsequent commit I'll be replacing most of this code to log N parents, but let's first fix bugs introduced in the recent 2f732bf15e6 (tr2: log parent process name, 2021-07-21). It was using the strbuf_read_file() in the wrong way, its return value is either a length or a negative value on error. If we didn't have a procfs, or otherwise couldn't access it we'd end up pushing an empty string to the trace2 ancestry array. It was also using the strvec_push() API the wrong way. That API always does an xstrdup(), so by detaching the strbuf here we'd leak memory. Let's instead pass in our pointer for strvec_push() to xstrdup(), and then free our own strbuf. Furthermore we need to free that "procfs_path" strbuf whether or not strbuf_read_file() succeeds, which was another source of memory leaks in 2f732bf15e6, i.e. we'd leak that memory as well if we weren't on a system where we could read the file from procfs. In combination with the preceding commit this makes all of t[0-9]*trace2*.sh pass under SANITIZE=leak on Linux. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- compat/linux/procinfo.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c index 62f8aaed4cc..c86daff2b26 100644 --- a/compat/linux/procinfo.c +++ b/compat/linux/procinfo.c @@ -18,12 +18,13 @@ static void get_ancestry_names(struct strvec *names) /* try to use procfs if it's present. */ strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); - if (strbuf_read_file(&name, procfs_path.buf, 0)) { - strbuf_release(&procfs_path); + if (strbuf_read_file(&name, procfs_path.buf, 0) > 0) { strbuf_trim_trailing_newline(&name); - strvec_push(names, strbuf_detach(&name, NULL)); + strvec_push(names, name.buf); + strbuf_release(&name); } + strbuf_release(&procfs_path); return; } -- 2.33.0.733.ga72a4f1c2e1
Change code added in 2f732bf15e6 (tr2: log parent process name, 2021-07-21) to use a switch statement without a "default" branch to have the compiler error if this code ever drifts out of sync with the members of the "enum trace2_process_info_reason". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- compat/linux/procinfo.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c index c86daff2b26..46a751c9a1d 100644 --- a/compat/linux/procinfo.c +++ b/compat/linux/procinfo.c @@ -30,29 +30,30 @@ static void get_ancestry_names(struct strvec *names) void trace2_collect_process_info(enum trace2_process_info_reason reason) { + struct strvec names = STRVEC_INIT; + if (!trace2_is_enabled()) return; - if (reason == TRACE2_PROCESS_INFO_EXIT) + switch (reason) { + case TRACE2_PROCESS_INFO_EXIT: /* * The Windows version of this calls its * get_peak_memory_info() here. We may want to insert * similar process-end statistics here in the future. */ - return; - - if (reason == TRACE2_PROCESS_INFO_STARTUP) { + break; + case TRACE2_PROCESS_INFO_STARTUP: /* * NEEDSWORK: we could do the entire ptree in an array instead, * see compat/win32/trace2_win32_process_info.c. */ - struct strvec names = STRVEC_INIT; - get_ancestry_names(&names); if (names.nr) trace2_cmd_ancestry(names.v); strvec_clear(&names); + break; } return; -- 2.33.0.733.ga72a4f1c2e1
In 2f732bf15e6 (tr2: log parent process name, 2021-07-21) we started logging parent process names, but only logged all parents on Windows. on Linux only the name of the immediate parent process was logged. Extend the functionality added there to also log full parent chain on Linux. In 2f732bf15e6 it was claimed that "further ancestry info can be gathered with procfs, but it's unwieldy to do so.". I don't know what the author meant by that, but I think it probably referred to needing to slurp this up from the FS, as opposed to having an API. The underlying semantics on Linux are easier to deal with than on Windows though, at least as far as finding the parent PIDs goes. See the get_processes() function used on Windows. As shown in 353d3d77f4f (trace2: collect Windows-specific process information, 2019-02-22) it needs to deal with cycles. What is more complex on Linux is getting at the process name, a simpler approach is to use fscanf(), see [1] for an implementation of that, but as noted in the comment being added here it would fail in the face of some weird process names, so we need our own parse_proc_stat() to parse it out. With this patch the "ancestry" chain for a trace2 event might look like this: $ GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version | grep ancestry | jq -r .ancestry [ "bash", "screen", "systemd" ] And in the case of naughty process names. This uses perl's ability to use prctl(PR_SET_NAME, ...). See Perl/perl5@7636ea95c5 (Set the legacy process name with prctl() on assignment to $0 on Linux, 2010-04-15)[2]: $ perl -e '$0 = "(naughty\nname)"; system "GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version"' | grep ancestry | jq -r .ancestry [ "sh", "(naughty\nname)", "bash", "screen", "systemd" ] 1. https://lore.kernel.org/git/87o8agp29o.fsf@evledraar.gmail.com/ 2. https://github.com/Perl/perl5/commit/7636ea95c57762930accf4358f7c0c2dec086b5e Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- compat/linux/procinfo.c | 134 ++++++++++++++++++++++++++++++++++------ 1 file changed, 116 insertions(+), 18 deletions(-) diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c index 46a751c9a1d..937084126a6 100644 --- a/compat/linux/procinfo.c +++ b/compat/linux/procinfo.c @@ -4,27 +4,129 @@ #include "strvec.h" #include "trace2.h" -static void get_ancestry_names(struct strvec *names) +/* + * We need more complex parsing instat_parent_pid() and + * parse_proc_stat() below than a dumb fscanf(). That's because while + * the statcomm field is surrounded by parentheses, the process itself + * is free to insert any arbitrary byte sequence its its name. That + * can include newlines, spaces, closing parentheses etc. See + * do_task_stat() in fs/proc/array.c in linux.git, this is in contrast + * with the escaped version of the name found in /proc/%d/status. + * + * So instead of using fscanf() we'll read N bytes from it, look for + * the first "(", and then the last ")", anything in-between is our + * process name. + * + * How much N do we need? On Linux /proc/sys/kernel/pid_max is 2^15 by + * default, but it can be raised set to values of up to 2^22. So + * that's 7 digits for a PID. We have 2 PIDs in the first four fields + * we're interested in, so 2 * 7 = 14. + * + * We then have 4 spaces between those four values, which brings us up + * to 18. Add the two parentheses and it's 20. The "state" is then one + * character (now at 21). + * + * Finally the maximum length of the "comm" name itself is 15 + * characters, e.g. a setting of "123456789abcdefg" will be truncated + * to "123456789abcdef". See PR_SET_NAME in prctl(2). So all in all + * we'd need to read 21 + 15 = 36 bytes. + * + * Let's just read 2^6 (64) instead for good measure. If PID_MAX ever + * grows past 2^22 we'll be future-proof. We'll then anchor at the + * last ")" we find to locate the parent PID. + */ +#define STAT_PARENT_PID_READ_N 64 + +static int parse_proc_stat(struct strbuf *sb, struct strbuf *name, + int *statppid) { + const char *lhs = strchr(sb->buf, '('); + const char *rhs = strrchr(sb->buf, ')'); + const char *ppid_lhs, *ppid_rhs; + char *p; + pid_t ppid; + + if (!lhs || !rhs) + goto bad_kernel; + /* - * NEEDSWORK: We could gather the entire pstree into an array to match - * functionality with compat/win32/trace2_win32_process_info.c. - * To do so, we may want to examine /proc/<pid>/stat. For now, just - * gather the immediate parent name which is readily accessible from - * /proc/$(getppid())/comm. + * We're at the ")", that's followed by " X ", where X is a + * single "state" character. So advance by 4 bytes. */ + ppid_lhs = rhs + 4; + + ppid_rhs = strchr(ppid_lhs, ' '); + if (!ppid_rhs) + goto bad_kernel; + + ppid = strtol(ppid_lhs, &p, 10); + if (ppid_rhs == p) { + const char *comm = lhs + 1; + int commlen = rhs - lhs - 1; + + strbuf_addf(name, "%.*s", commlen, comm); + *statppid = ppid; + + return 0; + } + +bad_kernel: + /* + * We were able to read our STAT_PARENT_PID_READ_N bytes from + * /proc/%d/stat, but the content is bad. Broken kernel? + * Should not happen, but handle it gracefully. + */ + return -1; +} + +static int stat_parent_pid(pid_t pid, struct strbuf *name, int *statppid) +{ struct strbuf procfs_path = STRBUF_INIT; - struct strbuf name = STRBUF_INIT; + struct strbuf sb = STRBUF_INIT; + size_t n; + FILE *fp = NULL; + int ret = -1; /* try to use procfs if it's present. */ - strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); - if (strbuf_read_file(&name, procfs_path.buf, 0) > 0) { - strbuf_trim_trailing_newline(&name); - strvec_push(names, name.buf); - strbuf_release(&name); - } + strbuf_addf(&procfs_path, "/proc/%d/stat", pid); + fp = fopen(procfs_path.buf, "r"); + if (!fp) + goto cleanup; + + n = strbuf_fread(&sb, STAT_PARENT_PID_READ_N, fp); + if (n != STAT_PARENT_PID_READ_N) + goto cleanup; + if (parse_proc_stat(&sb, name, statppid) < 0) + goto cleanup; + ret = 0; +cleanup: + if (fp) + fclose(fp); strbuf_release(&procfs_path); + strbuf_release(&sb); + + return ret; +} + +static void push_ancestry_name(struct strvec *names, pid_t pid) +{ + struct strbuf name = STRBUF_INIT; + int ppid; + + if (stat_parent_pid(pid, &name, &ppid) < 0) + goto cleanup; + + strvec_push(names, name.buf); + + /* + * Both errors and reaching the end of the process chain are + * reported as fields of 0 by proc(5) + */ + if (ppid) + push_ancestry_name(names, ppid); +cleanup: + strbuf_release(&name); return; } @@ -44,11 +146,7 @@ void trace2_collect_process_info(enum trace2_process_info_reason reason) */ break; case TRACE2_PROCESS_INFO_STARTUP: - /* - * NEEDSWORK: we could do the entire ptree in an array instead, - * see compat/win32/trace2_win32_process_info.c. - */ - get_ancestry_names(&names); + push_ancestry_name(&names, getppid()); if (names.nr) trace2_cmd_ancestry(names.v); -- 2.33.0.733.ga72a4f1c2e1
On Wed, Aug 25, 2021 at 7:20 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > [...] > Extend the functionality added there to also log full parent chain on > Linux. In 2f732bf15e6 it was claimed that "further ancestry info can > be gathered with procfs, but it's unwieldy to do so.". > [...] > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c > @@ -4,27 +4,129 @@ > +/* > + * We need more complex parsing instat_parent_pid() and s/instat_parent_pid/in stat_parent_pid/ > + * parse_proc_stat() below than a dumb fscanf(). That's because while > + * the statcomm field is surrounded by parentheses, the process itself > + * is free to insert any arbitrary byte sequence its its name. That > + * can include newlines, spaces, closing parentheses etc. See > + * do_task_stat() in fs/proc/array.c in linux.git, this is in contrast > + * with the escaped version of the name found in /proc/%d/status. > + * > + * So instead of using fscanf() we'll read N bytes from it, look for > + * the first "(", and then the last ")", anything in-between is our > + * process name. > + * > + * How much N do we need? On Linux /proc/sys/kernel/pid_max is 2^15 by > + * default, but it can be raised set to values of up to 2^22. So > + * that's 7 digits for a PID. We have 2 PIDs in the first four fields > + * we're interested in, so 2 * 7 = 14. > + * > + * We then have 4 spaces between those four values, which brings us up > + * to 18. Add the two parentheses and it's 20. The "state" is then one > + * character (now at 21). > + * > + * Finally the maximum length of the "comm" name itself is 15 > + * characters, e.g. a setting of "123456789abcdefg" will be truncated > + * to "123456789abcdef". See PR_SET_NAME in prctl(2). So all in all > + * we'd need to read 21 + 15 = 36 bytes. > + * > + * Let's just read 2^6 (64) instead for good measure. If PID_MAX ever > + * grows past 2^22 we'll be future-proof. We'll then anchor at the > + * last ")" we find to locate the parent PID. > + */
On Thu, Aug 26, 2021 at 01:19:21AM +0200, Ævar Arnfjörð Bjarmason wrote:
> Fix a memory leak introduced in ee4512ed481 (trace2: create new
> combined trace facility, 2019-02-22), we were doing a free() of other
> memory allocated in tr2tls_create_self(), but not the "thread_name"
> "strbuf strbuf".
Your patch is obviously correct, but is "strbuf strbuf" a typo for
"struct strbuf"? It doesn't really matter, and it certainly is not worth
re-submitting 6 patches for a single typo, but just curious.
Thanks,
Taylor
On Thu, Aug 26, 2021 at 01:19:22AM +0200, Ævar Arnfjörð Bjarmason wrote: > In a subsequent commit I'll be replacing most of this code to log N > parents, but let's first fix bugs introduced in the recent > 2f732bf15e6 (tr2: log parent process name, 2021-07-21). > > It was using the strbuf_read_file() in the wrong way, its return value > is either a length or a negative value on error. If we didn't have a > procfs, or otherwise couldn't access it we'd end up pushing an empty > string to the trace2 ancestry array. Sure, and I could believe that we didn't ever test the "couldn't open /proc/pid/comm" case, which explains why this error didn't fail any tests. I wondered whether it was worth writing this instead as: if (strbuf_read_file(...) < 0) warning(...); before trimming and pushing the result onto "names", but we should probably be swallowing this case and not presenting it as an error (since it's completely acceptable to not be able to read from comm, e.g., in versions of procfs on Linux older than 2.6.33). > It was also using the strvec_push() API the wrong way. That API always > does an xstrdup(), so by detaching the strbuf here we'd leak > memory. Let's instead pass in our pointer for strvec_push() to > xstrdup(), and then free our own strbuf. Right, you could make strvec_push_nodup() non-static, but I don't think that kind of churn is worthwhile. > Furthermore we need to free that "procfs_path" strbuf whether or not > strbuf_read_file() succeeds, which was another source of memory leaks > in 2f732bf15e6, i.e. we'd leak that memory as well if we weren't on a > system where we could read the file from procfs. Hmm. It's completely correct to only call strbuf_release on names inside of the body of the if statement, but it's extra work for the reader to figure that out. Why not put both strbuf_release() calls next to each other (immediately before the return) to make it more obvious (since freeing the STRBUF_INIT value is a noop)? Thanks, Taylor
On Thu, Aug 26, 2021 at 01:19:23AM +0200, Ævar Arnfjörð Bjarmason wrote:
> Change code added in 2f732bf15e6 (tr2: log parent process name,
> 2021-07-21) to use a switch statement without a "default" branch to
> have the compiler error if this code ever drifts out of sync with the
> members of the "enum trace2_process_info_reason".
OK... arguably this code should have been written with a switch
statement to begin with, but this looks like unnecessary churn to me.
Maybe the next patch adds a new value of this enum? Let's see.
Thanks,
Taylor
On Thu, Aug 26, 2021 at 01:19:24AM +0200, Ævar Arnfjörð Bjarmason wrote: > In 2f732bf15e6 (tr2: log parent process name, 2021-07-21) we started > logging parent process names, but only logged all parents on Windows. > on Linux only the name of the immediate parent process was logged. > > Extend the functionality added there to also log full parent chain on > Linux. In 2f732bf15e6 it was claimed that "further ancestry info can > be gathered with procfs, but it's unwieldy to do so.". > > I don't know what the author meant by that, but I think it probably > referred to needing to slurp this up from the FS, as opposed to having > an API. I don't think that this (specifically, "I don't know what the author meant by that") is necessary information to include in a patch message. If you're looking for a replacement (and you may not be, but just my $.02) I would suggest: "2f732bf15e6 does not log the full parent chain on Linux; implement that functionality here." > What is more complex on Linux is getting at the process name, a > simpler approach is to use fscanf(), see [1] for an implementation of > that, but as noted in the comment being added here it would fail in > the face of some weird process names, so we need our own > parse_proc_stat() to parse it out. This is helpful information to have for readers that aren't familiar with procfs (especially the detail about why the naive approach doesn't work). > diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c > index 46a751c9a1d..937084126a6 100644 > --- a/compat/linux/procinfo.c > +++ b/compat/linux/procinfo.c > @@ -4,27 +4,129 @@ > #include "strvec.h" > #include "trace2.h" > > -static void get_ancestry_names(struct strvec *names) > +/* > + * We need more complex parsing instat_parent_pid() and > + * parse_proc_stat() below than a dumb fscanf(). That's because while > + * the statcomm field is surrounded by parentheses, the process itself > + * is free to insert any arbitrary byte sequence its its name. That > + * can include newlines, spaces, closing parentheses etc. See > + * do_task_stat() in fs/proc/array.c in linux.git, this is in contrast > + * with the escaped version of the name found in /proc/%d/status. > + * > + * So instead of using fscanf() we'll read N bytes from it, look for > + * the first "(", and then the last ")", anything in-between is our > + * process name. > + * > + * How much N do we need? On Linux /proc/sys/kernel/pid_max is 2^15 by > + * default, but it can be raised set to values of up to 2^22. So > + * that's 7 digits for a PID. We have 2 PIDs in the first four fields > + * we're interested in, so 2 * 7 = 14. > + * > + * We then have 4 spaces between those four values, which brings us up > + * to 18. Add the two parentheses and it's 20. The "state" is then one > + * character (now at 21). Hmm, aren't there three spaces, not four? > + * Finally the maximum length of the "comm" name itself is 15 > + * characters, e.g. a setting of "123456789abcdefg" will be truncated > + * to "123456789abcdef". See PR_SET_NAME in prctl(2). So all in all > + * we'd need to read 21 + 15 = 36 bytes. Ah, 36 is the right number even though you and I counted a different number of spaces, since the name is truncated when it goes over *16* characters, but that includes the NUL byte. So we both arrive at the same number in the end ;). But I agree it's safer to just read more (but not too much more) than what we need. > + * Let's just read 2^6 (64) instead for good measure. If PID_MAX ever > + * grows past 2^22 we'll be future-proof. We'll then anchor at the > + * last ")" we find to locate the parent PID. > + */ > +#define STAT_PARENT_PID_READ_N 64 > + > +static int parse_proc_stat(struct strbuf *sb, struct strbuf *name, > + int *statppid) Going to think aloud to make sure that this parsing looks right. > { > + const char *lhs = strchr(sb->buf, '('); > + const char *rhs = strrchr(sb->buf, ')'); lhs and rhs are going to be on either side of the comm field (which may be helpful to indicate by calling these comm_lhs and comm_rhs). And strrchr makes sure to handle process names that have a ')' in them. Looks right. > + const char *ppid_lhs, *ppid_rhs; > + char *p; > + pid_t ppid; > + > + if (!lhs || !rhs) > + goto bad_kernel; > + OK. > /* > - * NEEDSWORK: We could gather the entire pstree into an array to match > - * functionality with compat/win32/trace2_win32_process_info.c. > - * To do so, we may want to examine /proc/<pid>/stat. For now, just > - * gather the immediate parent name which is readily accessible from > - * /proc/$(getppid())/comm. > + * We're at the ")", that's followed by " X ", where X is a > + * single "state" character. So advance by 4 bytes. > */ > + ppid_lhs = rhs + 4; > + > + ppid_rhs = strchr(ppid_lhs, ' '); Skipping over the state field gives us the first character of ppid as you say, good. And reading until the first space character will point us right after the end, good. > + if (!ppid_rhs) > + goto bad_kernel; > + > + ppid = strtol(ppid_lhs, &p, 10); > + if (ppid_rhs == p) { Then parse the ppid and make sure we stopped at the right-hand side where we should have. Good. > + const char *comm = lhs + 1; Skipping past the '(', but now I feel like we really should s/lhs/comm_&/. > + int commlen = rhs - lhs - 1; This is right, but you could simplify the expression to be "rhs - comm", since you just took into account the left-hand parenthesis in the previous line. Also recommend a size_t here: it's obvious we're not going to overflow int here, but it saves future readers of having to wonder the same thing. > + > + strbuf_addf(name, "%.*s", commlen, comm); > + *statppid = ppid; > + > + return 0; > + } > + > +bad_kernel: > + /* > + * We were able to read our STAT_PARENT_PID_READ_N bytes from > + * /proc/%d/stat, but the content is bad. Broken kernel? > + * Should not happen, but handle it gracefully. > + */ > + return -1; > +} Phew, all seems good. Thanks for bearing with me while I read through all of that ;). > +static int stat_parent_pid(pid_t pid, struct strbuf *name, int *statppid) > +{ > struct strbuf procfs_path = STRBUF_INIT; > - struct strbuf name = STRBUF_INIT; > + struct strbuf sb = STRBUF_INIT; > + size_t n; > + FILE *fp = NULL; fopen() will return NULL, and you call it unconditionally, so no need to initialize here. > + int ret = -1; > > /* try to use procfs if it's present. */ > - strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); > - if (strbuf_read_file(&name, procfs_path.buf, 0) > 0) { > - strbuf_trim_trailing_newline(&name); > - strvec_push(names, name.buf); > - strbuf_release(&name); > - } > + strbuf_addf(&procfs_path, "/proc/%d/stat", pid); > + fp = fopen(procfs_path.buf, "r"); > + if (!fp) > + goto cleanup; > + > + n = strbuf_fread(&sb, STAT_PARENT_PID_READ_N, fp); > + if (n != STAT_PARENT_PID_READ_N) > + goto cleanup; Hmm. Wouldn't we always goto cleanup here, since STAT_PARENT_PID_READ_N is deliberately oversized (and not constant anyways, since process ids could be anywhere from 1-7 digits long)? I think we could probably drop 'n' entirely here, and instead: if (strbuf_fread(...) < 0) goto cleanup; > + if (parse_proc_stat(&sb, name, statppid) < 0) > + goto cleanup; > > + ret = 0; > +cleanup: > + if (fp) > + fclose(fp); > strbuf_release(&procfs_path); > + strbuf_release(&sb); > + > + return ret; > +} > + > +static void push_ancestry_name(struct strvec *names, pid_t pid) > +{ > + struct strbuf name = STRBUF_INIT; > + int ppid; > + > + if (stat_parent_pid(pid, &name, &ppid) < 0) > + goto cleanup; > + > + strvec_push(names, name.buf); > + > + /* > + * Both errors and reaching the end of the process chain are > + * reported as fields of 0 by proc(5) > + */ > + if (ppid) > + push_ancestry_name(names, ppid); > +cleanup: > + strbuf_release(&name); > return; > } The rest looks good to me, but it looks like you overwrote all of the work that you did in patch 4/6. I guess separating them out makes sense if this patch wasn't taken, but I probably would have gone right to this patch instead of fixing leaks that you were going to get rid of anyway. Thanks, Taylor
An early re-roll due to some very good & early review by Taylor Blau, this also fixes the grammar/typo pointed out by Eric Sunshine. Thanks both: This should address all the comments Taylor brought up, and hopefully a bit more. The only thing I left outstanding was the inclusion of the "while we're at it" enum refactoring in 5/6. It's not necessary for the end-state here, but I think since we're already reviewing most of this file it makes sense to change it while we're at it to future-proof the code vis-as-vis getting stricted checks from the compiler. Ævar Arnfjörð Bjarmason (6): tr2: remove NEEDSWORK comment for "non-procfs" implementations tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux tr2: stop leaking "thread_name" memory tr2: fix memory leak & logic error in 2f732bf15e6 tr2: do compiler enum check in trace2_collect_process_info() tr2: log N parent process names on Linux compat/linux/procinfo.c | 169 ++++++++++++++++++++++++++++++++++------ trace2/tr2_tls.c | 1 + 2 files changed, 146 insertions(+), 24 deletions(-) Range-diff against v1: 1: 8c649ce3b49 = 1: 8c649ce3b49 tr2: remove NEEDSWORK comment for "non-procfs" implementations 2: 0150e3402a7 = 2: 0150e3402a7 tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux 3: 1fa1bbb6743 ! 3: 1d835d6767e tr2: stop leaking "thread_name" memory @@ Commit message Fix a memory leak introduced in ee4512ed481 (trace2: create new combined trace facility, 2019-02-22), we were doing a free() of other memory allocated in tr2tls_create_self(), but not the "thread_name" - "strbuf strbuf". + "struct strbuf". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> 4: 73e7d4eb6ac ! 4: 1aa0dbc394e tr2: fix memory leak & logic error in 2f732bf15e6 @@ Commit message It was also using the strvec_push() API the wrong way. That API always does an xstrdup(), so by detaching the strbuf here we'd leak memory. Let's instead pass in our pointer for strvec_push() to - xstrdup(), and then free our own strbuf. + xstrdup(), and then free our own strbuf. I do have some WIP changes to + make strvec_push_nodup() non-static, which makes this and some other + callsites nicer, but let's just follow the prevailing pattern of using + strvec_push() for now. - Furthermore we need to free that "procfs_path" strbuf whether or not + We'll also need to free that "procfs_path" strbuf whether or not strbuf_read_file() succeeds, which was another source of memory leaks in 2f732bf15e6, i.e. we'd leak that memory as well if we weren't on a system where we could read the file from procfs. + Let's move all the freeing of the memory to the end of the + function. If we're still at STRBUF_INIT with "name" due to not haven + taken the branch where the strbuf_read_file() succeeds freeing it is + redundant, so we could move it into the body of the "if", but just + handling freeing the same way for all branches of the function makes + it more readable. + In combination with the preceding commit this makes all of t[0-9]*trace2*.sh pass under SANITIZE=leak on Linux. @@ compat/linux/procinfo.c: static void get_ancestry_names(struct strvec *names) strbuf_trim_trailing_newline(&name); - strvec_push(names, strbuf_detach(&name, NULL)); + strvec_push(names, name.buf); -+ strbuf_release(&name); } + strbuf_release(&procfs_path); ++ strbuf_release(&name); ++ return; } 5: 4e378da2cce = 5: 70fef093d8d tr2: do compiler enum check in trace2_collect_process_info() 6: da003330800 ! 6: f6aac902484 tr2: log N parent process names on Linux @@ Commit message on Linux only the name of the immediate parent process was logged. Extend the functionality added there to also log full parent chain on - Linux. In 2f732bf15e6 it was claimed that "further ancestry info can - be gathered with procfs, but it's unwieldy to do so.". - - I don't know what the author meant by that, but I think it probably - referred to needing to slurp this up from the FS, as opposed to having - an API. The underlying semantics on Linux are easier to deal with than - on Windows though, at least as far as finding the parent PIDs - goes. See the get_processes() function used on Windows. As shown in - 353d3d77f4f (trace2: collect Windows-specific process information, - 2019-02-22) it needs to deal with cycles. - - What is more complex on Linux is getting at the process name, a - simpler approach is to use fscanf(), see [1] for an implementation of - that, but as noted in the comment being added here it would fail in - the face of some weird process names, so we need our own - parse_proc_stat() to parse it out. + Linux. + + This requires us to lookup "/proc/<getppid()>/stat" instead of + "/proc/<getppid()>/comm". The "comm" file just contains the name of the + process, but the "stat" file has both that information, and the parent + PID of that process, see procfs(5). We parse out the parent PID of our + own parent, and recursively walk the chain of "/proc/*/stat" files all + the way up the chain. A parent PID of 0 indicates the end of the + chain. + + It's possible given the semantics of Linux's PID files that we end up + getting an entirely nonsensical chain of processes. It could happen if + e.g. we have a chain of processes like: + + 1 (init) => 321 (bash) => 123 (git) + + Let's assume that "bash" was started a while ago, and that as shown + the OS has already cycled back to using a lower PID for us than our + parent process. In the time it takes us to start up and get to + trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP) our parent + process might exit, and be replaced by an entirely different process! + + We'd racily look up our own getppid(), but in the meantime our parent + would exit, and Linux would have cycled all the way back to starting + an entirely unrelated process as PID 321. + + If that happens we'll just silently log incorrect data in our ancestry + chain. Luckily we don't need to worry about this except in this + specific cycling scenario, as Linux does not have PID + randomization. It appears it once did through a third-party feature, + but that it was removed around 2006[1]. For anyone worried about this + edge case raising PID_MAX via "/proc/sys/kernel/pid_max" will mitigate + it, but not eliminate it. + + One thing we don't need to worry about is getting into an infinite + loop when walking "/proc/*/stat". See 353d3d77f4f (trace2: collect + Windows-specific process information, 2019-02-22) for the related + Windows code that needs to deal with that, and [2] for an explanation + of that edge case. + + Aside from potential race conditions it's also a bit painful to + correctly parse the process name out of "/proc/*/stat". A simpler + approach is to use fscanf(), see [3] for an implementation of that, + but as noted in the comment being added here it would fail in the face + of some weird process names, so we need our own parse_proc_stat() to + parse it out. With this patch the "ancestry" chain for a trace2 event might look like this: @@ Commit message "systemd" ] - And in the case of naughty process names. This uses perl's ability to - use prctl(PR_SET_NAME, ...). See Perl/perl5@7636ea95c5 (Set the legacy - process name with prctl() on assignment to $0 on Linux, 2010-04-15)[2]: + And in the case of naughty process names like the following. This uses + perl's ability to use prctl(PR_SET_NAME, ...). See + Perl/perl5@7636ea95c5 (Set the legacy process name with prctl() on + assignment to $0 on Linux, 2010-04-15)[4]: $ perl -e '$0 = "(naughty\nname)"; system "GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version"' | grep ancestry | jq -r .ancestry [ @@ Commit message "systemd" ] - 1. https://lore.kernel.org/git/87o8agp29o.fsf@evledraar.gmail.com/ - 2. https://github.com/Perl/perl5/commit/7636ea95c57762930accf4358f7c0c2dec086b5e + 1. https://grsecurity.net/news#grsec2110 + 2. https://lore.kernel.org/git/48a62d5e-28e2-7103-a5bb-5db7e197a4b9@jeffhostetler.com/ + 3. https://lore.kernel.org/git/87o8agp29o.fsf@evledraar.gmail.com/ + 4. https://github.com/Perl/perl5/commit/7636ea95c57762930accf4358f7c0c2dec086b5e Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> @@ compat/linux/procinfo.c -static void get_ancestry_names(struct strvec *names) +/* -+ * We need more complex parsing instat_parent_pid() and ++ * We need more complex parsing in stat_parent_pid() and + * parse_proc_stat() below than a dumb fscanf(). That's because while + * the statcomm field is surrounded by parentheses, the process itself + * is free to insert any arbitrary byte sequence its its name. That -+ * can include newlines, spaces, closing parentheses etc. See -+ * do_task_stat() in fs/proc/array.c in linux.git, this is in contrast -+ * with the escaped version of the name found in /proc/%d/status. ++ * can include newlines, spaces, closing parentheses etc. ++ * ++ * See do_task_stat() in fs/proc/array.c in linux.git, this is in ++ * contrast with the escaped version of the name found in ++ * /proc/%d/status. + * + * So instead of using fscanf() we'll read N bytes from it, look for + * the first "(", and then the last ")", anything in-between is our @@ compat/linux/procinfo.c + * that's 7 digits for a PID. We have 2 PIDs in the first four fields + * we're interested in, so 2 * 7 = 14. + * -+ * We then have 4 spaces between those four values, which brings us up -+ * to 18. Add the two parentheses and it's 20. The "state" is then one -+ * character (now at 21). ++ * We then have 3 spaces between those four values, and we'd like to ++ * get to the space between the 4th and the 5th (the "pgrp" field) to ++ * make sure we read the entire "ppid" field. So that brings us up to ++ * 14 + 3 + 1 = 18. Add the two parentheses around the "comm" value ++ * and it's 20. The "state" value itself is then one character (now at ++ * 21). + * + * Finally the maximum length of the "comm" name itself is 15 + * characters, e.g. a setting of "123456789abcdefg" will be truncated @@ compat/linux/procinfo.c +static int parse_proc_stat(struct strbuf *sb, struct strbuf *name, + int *statppid) { -+ const char *lhs = strchr(sb->buf, '('); -+ const char *rhs = strrchr(sb->buf, ')'); ++ const char *comm_lhs = strchr(sb->buf, '('); ++ const char *comm_rhs = strrchr(sb->buf, ')'); + const char *ppid_lhs, *ppid_rhs; + char *p; + pid_t ppid; + -+ if (!lhs || !rhs) ++ if (!comm_lhs || !comm_rhs) + goto bad_kernel; + /* @@ compat/linux/procinfo.c + * We're at the ")", that's followed by " X ", where X is a + * single "state" character. So advance by 4 bytes. */ -+ ppid_lhs = rhs + 4; ++ ppid_lhs = comm_rhs + 4; + ++ /* ++ * Read until the space between the "ppid" and "pgrp" fields ++ * to make sure we're anchored after the untruncated "ppid" ++ * field.. ++ */ + ppid_rhs = strchr(ppid_lhs, ' '); + if (!ppid_rhs) + goto bad_kernel; + + ppid = strtol(ppid_lhs, &p, 10); + if (ppid_rhs == p) { -+ const char *comm = lhs + 1; -+ int commlen = rhs - lhs - 1; ++ const char *comm = comm_lhs + 1; ++ size_t commlen = comm_rhs - comm; + -+ strbuf_addf(name, "%.*s", commlen, comm); ++ strbuf_add(name, comm, commlen); + *statppid = ppid; + + return 0; @@ compat/linux/procinfo.c struct strbuf procfs_path = STRBUF_INIT; - struct strbuf name = STRBUF_INIT; + struct strbuf sb = STRBUF_INIT; -+ size_t n; -+ FILE *fp = NULL; ++ FILE *fp; + int ret = -1; /* try to use procfs if it's present. */ @@ compat/linux/procinfo.c - if (strbuf_read_file(&name, procfs_path.buf, 0) > 0) { - strbuf_trim_trailing_newline(&name); - strvec_push(names, name.buf); -- strbuf_release(&name); - } + strbuf_addf(&procfs_path, "/proc/%d/stat", pid); + fp = fopen(procfs_path.buf, "r"); + if (!fp) + goto cleanup; + -+ n = strbuf_fread(&sb, STAT_PARENT_PID_READ_N, fp); -+ if (n != STAT_PARENT_PID_READ_N) ++ /* ++ * We could be more strict here and assert that we read at ++ * least STAT_PARENT_PID_READ_N. My reading of procfs(5) is ++ * that on any modern kernel (at least since 2.6.0 released in ++ * 2003) even if all the mandatory numeric fields were zero'd ++ * out we'd get at least 100 bytes, but let's just check that ++ * we got anything at all and trust the parse_proc_stat() ++ * function to handle its "Bad Kernel?" error checking. ++ */ ++ if (!strbuf_fread(&sb, STAT_PARENT_PID_READ_N, fp)) + goto cleanup; + if (parse_proc_stat(&sb, name, statppid) < 0) + goto cleanup; @@ compat/linux/procinfo.c + if (ppid) + push_ancestry_name(names, ppid); +cleanup: -+ strbuf_release(&name); - return; - } + strbuf_release(&name); + return; @@ compat/linux/procinfo.c: void trace2_collect_process_info(enum trace2_process_info_reason reason) */ break; -- 2.33.0.733.ga72a4f1c2e1
I'm fairly sure that there is no way on Linux to inspect the process tree without using procfs, any tool such as ps(1), top(1) etc. that shows this sort of information ultimately looks the information up in procfs. So let's remove this comment added in 2f732bf15e6 (tr2: log parent process name, 2021-07-21), it's setting us up for an impossible task. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- compat/linux/procinfo.c | 1 - 1 file changed, 1 deletion(-) diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c index 578fed4cd31..15a89676c7a 100644 --- a/compat/linux/procinfo.c +++ b/compat/linux/procinfo.c @@ -25,7 +25,6 @@ static void get_ancestry_names(struct strvec *names) } return; - /* NEEDSWORK: add non-procfs-linux implementations here */ } void trace2_collect_process_info(enum trace2_process_info_reason reason) -- 2.33.0.733.ga72a4f1c2e1
Rewrite a comment added in 2f732bf15e6 (tr2: log parent process name, 2021-07-21) to describe what we might do under TRACE2_PROCESS_INFO_EXIT in the future, instead of vaguely referring to "something extra". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- compat/linux/procinfo.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c index 15a89676c7a..62f8aaed4cc 100644 --- a/compat/linux/procinfo.c +++ b/compat/linux/procinfo.c @@ -32,8 +32,12 @@ void trace2_collect_process_info(enum trace2_process_info_reason reason) if (!trace2_is_enabled()) return; - /* someday we may want to write something extra here, but not today */ if (reason == TRACE2_PROCESS_INFO_EXIT) + /* + * The Windows version of this calls its + * get_peak_memory_info() here. We may want to insert + * similar process-end statistics here in the future. + */ return; if (reason == TRACE2_PROCESS_INFO_STARTUP) { -- 2.33.0.733.ga72a4f1c2e1
Fix a memory leak introduced in ee4512ed481 (trace2: create new combined trace facility, 2019-02-22), we were doing a free() of other memory allocated in tr2tls_create_self(), but not the "thread_name" "struct strbuf". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- trace2/tr2_tls.c | 1 + 1 file changed, 1 insertion(+) diff --git a/trace2/tr2_tls.c b/trace2/tr2_tls.c index 067c23755fb..7da94aba522 100644 --- a/trace2/tr2_tls.c +++ b/trace2/tr2_tls.c @@ -95,6 +95,7 @@ void tr2tls_unset_self(void) pthread_setspecific(tr2tls_key, NULL); + strbuf_release(&ctx->thread_name); free(ctx->array_us_start); free(ctx); } -- 2.33.0.733.ga72a4f1c2e1
In a subsequent commit I'll be replacing most of this code to log N parents, but let's first fix bugs introduced in the recent 2f732bf15e6 (tr2: log parent process name, 2021-07-21). It was using the strbuf_read_file() in the wrong way, its return value is either a length or a negative value on error. If we didn't have a procfs, or otherwise couldn't access it we'd end up pushing an empty string to the trace2 ancestry array. It was also using the strvec_push() API the wrong way. That API always does an xstrdup(), so by detaching the strbuf here we'd leak memory. Let's instead pass in our pointer for strvec_push() to xstrdup(), and then free our own strbuf. I do have some WIP changes to make strvec_push_nodup() non-static, which makes this and some other callsites nicer, but let's just follow the prevailing pattern of using strvec_push() for now. We'll also need to free that "procfs_path" strbuf whether or not strbuf_read_file() succeeds, which was another source of memory leaks in 2f732bf15e6, i.e. we'd leak that memory as well if we weren't on a system where we could read the file from procfs. Let's move all the freeing of the memory to the end of the function. If we're still at STRBUF_INIT with "name" due to not haven taken the branch where the strbuf_read_file() succeeds freeing it is redundant, so we could move it into the body of the "if", but just handling freeing the same way for all branches of the function makes it more readable. In combination with the preceding commit this makes all of t[0-9]*trace2*.sh pass under SANITIZE=leak on Linux. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- compat/linux/procinfo.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c index 62f8aaed4cc..bd01f017bc7 100644 --- a/compat/linux/procinfo.c +++ b/compat/linux/procinfo.c @@ -18,12 +18,14 @@ static void get_ancestry_names(struct strvec *names) /* try to use procfs if it's present. */ strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); - if (strbuf_read_file(&name, procfs_path.buf, 0)) { - strbuf_release(&procfs_path); + if (strbuf_read_file(&name, procfs_path.buf, 0) > 0) { strbuf_trim_trailing_newline(&name); - strvec_push(names, strbuf_detach(&name, NULL)); + strvec_push(names, name.buf); } + strbuf_release(&procfs_path); + strbuf_release(&name); + return; } -- 2.33.0.733.ga72a4f1c2e1
Change code added in 2f732bf15e6 (tr2: log parent process name, 2021-07-21) to use a switch statement without a "default" branch to have the compiler error if this code ever drifts out of sync with the members of the "enum trace2_process_info_reason". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- compat/linux/procinfo.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c index bd01f017bc7..0b47d44990c 100644 --- a/compat/linux/procinfo.c +++ b/compat/linux/procinfo.c @@ -31,29 +31,30 @@ static void get_ancestry_names(struct strvec *names) void trace2_collect_process_info(enum trace2_process_info_reason reason) { + struct strvec names = STRVEC_INIT; + if (!trace2_is_enabled()) return; - if (reason == TRACE2_PROCESS_INFO_EXIT) + switch (reason) { + case TRACE2_PROCESS_INFO_EXIT: /* * The Windows version of this calls its * get_peak_memory_info() here. We may want to insert * similar process-end statistics here in the future. */ - return; - - if (reason == TRACE2_PROCESS_INFO_STARTUP) { + break; + case TRACE2_PROCESS_INFO_STARTUP: /* * NEEDSWORK: we could do the entire ptree in an array instead, * see compat/win32/trace2_win32_process_info.c. */ - struct strvec names = STRVEC_INIT; - get_ancestry_names(&names); if (names.nr) trace2_cmd_ancestry(names.v); strvec_clear(&names); + break; } return; -- 2.33.0.733.ga72a4f1c2e1
In 2f732bf15e6 (tr2: log parent process name, 2021-07-21) we started logging parent process names, but only logged all parents on Windows. on Linux only the name of the immediate parent process was logged. Extend the functionality added there to also log full parent chain on Linux. This requires us to lookup "/proc/<getppid()>/stat" instead of "/proc/<getppid()>/comm". The "comm" file just contains the name of the process, but the "stat" file has both that information, and the parent PID of that process, see procfs(5). We parse out the parent PID of our own parent, and recursively walk the chain of "/proc/*/stat" files all the way up the chain. A parent PID of 0 indicates the end of the chain. It's possible given the semantics of Linux's PID files that we end up getting an entirely nonsensical chain of processes. It could happen if e.g. we have a chain of processes like: 1 (init) => 321 (bash) => 123 (git) Let's assume that "bash" was started a while ago, and that as shown the OS has already cycled back to using a lower PID for us than our parent process. In the time it takes us to start up and get to trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP) our parent process might exit, and be replaced by an entirely different process! We'd racily look up our own getppid(), but in the meantime our parent would exit, and Linux would have cycled all the way back to starting an entirely unrelated process as PID 321. If that happens we'll just silently log incorrect data in our ancestry chain. Luckily we don't need to worry about this except in this specific cycling scenario, as Linux does not have PID randomization. It appears it once did through a third-party feature, but that it was removed around 2006[1]. For anyone worried about this edge case raising PID_MAX via "/proc/sys/kernel/pid_max" will mitigate it, but not eliminate it. One thing we don't need to worry about is getting into an infinite loop when walking "/proc/*/stat". See 353d3d77f4f (trace2: collect Windows-specific process information, 2019-02-22) for the related Windows code that needs to deal with that, and [2] for an explanation of that edge case. Aside from potential race conditions it's also a bit painful to correctly parse the process name out of "/proc/*/stat". A simpler approach is to use fscanf(), see [3] for an implementation of that, but as noted in the comment being added here it would fail in the face of some weird process names, so we need our own parse_proc_stat() to parse it out. With this patch the "ancestry" chain for a trace2 event might look like this: $ GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version | grep ancestry | jq -r .ancestry [ "bash", "screen", "systemd" ] And in the case of naughty process names like the following. This uses perl's ability to use prctl(PR_SET_NAME, ...). See Perl/perl5@7636ea95c5 (Set the legacy process name with prctl() on assignment to $0 on Linux, 2010-04-15)[4]: $ perl -e '$0 = "(naughty\nname)"; system "GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version"' | grep ancestry | jq -r .ancestry [ "sh", "(naughty\nname)", "bash", "screen", "systemd" ] 1. https://grsecurity.net/news#grsec2110 2. https://lore.kernel.org/git/48a62d5e-28e2-7103-a5bb-5db7e197a4b9@jeffhostetler.com/ 3. https://lore.kernel.org/git/87o8agp29o.fsf@evledraar.gmail.com/ 4. https://github.com/Perl/perl5/commit/7636ea95c57762930accf4358f7c0c2dec086b5e Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- compat/linux/procinfo.c | 149 +++++++++++++++++++++++++++++++++++----- 1 file changed, 132 insertions(+), 17 deletions(-) diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c index 0b47d44990c..bc2f9382a17 100644 --- a/compat/linux/procinfo.c +++ b/compat/linux/procinfo.c @@ -4,26 +4,145 @@ #include "strvec.h" #include "trace2.h" -static void get_ancestry_names(struct strvec *names) +/* + * We need more complex parsing in stat_parent_pid() and + * parse_proc_stat() below than a dumb fscanf(). That's because while + * the statcomm field is surrounded by parentheses, the process itself + * is free to insert any arbitrary byte sequence its its name. That + * can include newlines, spaces, closing parentheses etc. + * + * See do_task_stat() in fs/proc/array.c in linux.git, this is in + * contrast with the escaped version of the name found in + * /proc/%d/status. + * + * So instead of using fscanf() we'll read N bytes from it, look for + * the first "(", and then the last ")", anything in-between is our + * process name. + * + * How much N do we need? On Linux /proc/sys/kernel/pid_max is 2^15 by + * default, but it can be raised set to values of up to 2^22. So + * that's 7 digits for a PID. We have 2 PIDs in the first four fields + * we're interested in, so 2 * 7 = 14. + * + * We then have 3 spaces between those four values, and we'd like to + * get to the space between the 4th and the 5th (the "pgrp" field) to + * make sure we read the entire "ppid" field. So that brings us up to + * 14 + 3 + 1 = 18. Add the two parentheses around the "comm" value + * and it's 20. The "state" value itself is then one character (now at + * 21). + * + * Finally the maximum length of the "comm" name itself is 15 + * characters, e.g. a setting of "123456789abcdefg" will be truncated + * to "123456789abcdef". See PR_SET_NAME in prctl(2). So all in all + * we'd need to read 21 + 15 = 36 bytes. + * + * Let's just read 2^6 (64) instead for good measure. If PID_MAX ever + * grows past 2^22 we'll be future-proof. We'll then anchor at the + * last ")" we find to locate the parent PID. + */ +#define STAT_PARENT_PID_READ_N 64 + +static int parse_proc_stat(struct strbuf *sb, struct strbuf *name, + int *statppid) { + const char *comm_lhs = strchr(sb->buf, '('); + const char *comm_rhs = strrchr(sb->buf, ')'); + const char *ppid_lhs, *ppid_rhs; + char *p; + pid_t ppid; + + if (!comm_lhs || !comm_rhs) + goto bad_kernel; + /* - * NEEDSWORK: We could gather the entire pstree into an array to match - * functionality with compat/win32/trace2_win32_process_info.c. - * To do so, we may want to examine /proc/<pid>/stat. For now, just - * gather the immediate parent name which is readily accessible from - * /proc/$(getppid())/comm. + * We're at the ")", that's followed by " X ", where X is a + * single "state" character. So advance by 4 bytes. */ + ppid_lhs = comm_rhs + 4; + + /* + * Read until the space between the "ppid" and "pgrp" fields + * to make sure we're anchored after the untruncated "ppid" + * field.. + */ + ppid_rhs = strchr(ppid_lhs, ' '); + if (!ppid_rhs) + goto bad_kernel; + + ppid = strtol(ppid_lhs, &p, 10); + if (ppid_rhs == p) { + const char *comm = comm_lhs + 1; + size_t commlen = comm_rhs - comm; + + strbuf_add(name, comm, commlen); + *statppid = ppid; + + return 0; + } + +bad_kernel: + /* + * We were able to read our STAT_PARENT_PID_READ_N bytes from + * /proc/%d/stat, but the content is bad. Broken kernel? + * Should not happen, but handle it gracefully. + */ + return -1; +} + +static int stat_parent_pid(pid_t pid, struct strbuf *name, int *statppid) +{ struct strbuf procfs_path = STRBUF_INIT; - struct strbuf name = STRBUF_INIT; + struct strbuf sb = STRBUF_INIT; + FILE *fp; + int ret = -1; /* try to use procfs if it's present. */ - strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); - if (strbuf_read_file(&name, procfs_path.buf, 0) > 0) { - strbuf_trim_trailing_newline(&name); - strvec_push(names, name.buf); - } + strbuf_addf(&procfs_path, "/proc/%d/stat", pid); + fp = fopen(procfs_path.buf, "r"); + if (!fp) + goto cleanup; + + /* + * We could be more strict here and assert that we read at + * least STAT_PARENT_PID_READ_N. My reading of procfs(5) is + * that on any modern kernel (at least since 2.6.0 released in + * 2003) even if all the mandatory numeric fields were zero'd + * out we'd get at least 100 bytes, but let's just check that + * we got anything at all and trust the parse_proc_stat() + * function to handle its "Bad Kernel?" error checking. + */ + if (!strbuf_fread(&sb, STAT_PARENT_PID_READ_N, fp)) + goto cleanup; + if (parse_proc_stat(&sb, name, statppid) < 0) + goto cleanup; + ret = 0; +cleanup: + if (fp) + fclose(fp); strbuf_release(&procfs_path); + strbuf_release(&sb); + + return ret; +} + +static void push_ancestry_name(struct strvec *names, pid_t pid) +{ + struct strbuf name = STRBUF_INIT; + int ppid; + + if (stat_parent_pid(pid, &name, &ppid) < 0) + goto cleanup; + + strvec_push(names, name.buf); + + /* + * Both errors and reaching the end of the process chain are + * reported as fields of 0 by proc(5) + */ + if (ppid) + push_ancestry_name(names, ppid); +cleanup: strbuf_release(&name); return; @@ -45,11 +164,7 @@ void trace2_collect_process_info(enum trace2_process_info_reason reason) */ break; case TRACE2_PROCESS_INFO_STARTUP: - /* - * NEEDSWORK: we could do the entire ptree in an array instead, - * see compat/win32/trace2_win32_process_info.c. - */ - get_ancestry_names(&names); + push_ancestry_name(&names, getppid()); if (names.nr) trace2_cmd_ancestry(names.v); -- 2.33.0.733.ga72a4f1c2e1
On Thu, Aug 26 2021, Taylor Blau wrote: > On Thu, Aug 26, 2021 at 01:19:24AM +0200, Ævar Arnfjörð Bjarmason wrote: >> In 2f732bf15e6 (tr2: log parent process name, 2021-07-21) we started >> logging parent process names, but only logged all parents on Windows. >> on Linux only the name of the immediate parent process was logged. >> >> Extend the functionality added there to also log full parent chain on >> Linux. In 2f732bf15e6 it was claimed that "further ancestry info can >> be gathered with procfs, but it's unwieldy to do so.". >> >> I don't know what the author meant by that, but I think it probably >> referred to needing to slurp this up from the FS, as opposed to having >> an API. > > I don't think that this (specifically, "I don't know what the author > meant by that") is necessary information to include in a patch message. > > If you're looking for a replacement (and you may not be, but just my > $.02) I would suggest: > > "2f732bf15e6 does not log the full parent chain on Linux; implement > that functionality here." I hope Taylor doesn't mind me quoting it, but he sent me this follow-up off-list: For what it's worth, I really struggled to write this. What I was trying to say was something along the lines of "let's give Emily a little more credit for not doing this right off the bat", but I didn't want to write that exactly, since I don't think it was your original intention. At the very least, saying something to the effect of "I don't know what the original author thought was so tough, here's a patch to implement it" isn't helping anybody, so I tried to focus on that. Anyway, just writing to you off-list to say that I do think you had good intentions, but that what you wrote may be read differently by others in a way that you didn't intend it to be. First, thanks to Taylor for pointing this out, and second I'd like to apologize for that comment. More than some "sorry someone read it that way", this really does read even to me, its author, shortly after having written it like something that's way more on the side of sneakiness than a charitable comment. I.e. like some snipe-y paraphrasing of "maybe they found this problem too complex, but look how easy it is!" than something charitable that's meant to barely pass under the radar. Hopefully it helps that I'm honestly just being a bit of an inconsiderate idiot here than actively malicious. What I meant to accomplish here was to guide a reader of these patches through the same mental states I went through when reading the original patch. I.e. I took it from its description that there were some unstated special-cases in reading procfs that made it harder to deal with than not. I think those comments were rather just a way to say that scraping procfs was a bit of a pain, and the MVP in 2f732bf15e6 was good enough for now, which is fair enough. I rephrased those comments in v2 of this patch[1]. The summary starting at the 4th paragraph ("It's possible given the semantics[...]") is an edge case I hadn't considered when I wrote v1. I in turn copied that "unweildy" comment from an even older version[2], where the difficulties of parsing the "comm" field out of "/proc/*/stat" weren't apparent to me. I.e. I think if I had to describe that interface now I would describe it as unwieldy, but wouldn't when I wrote v1[1] or that v0[2]. I.e. I think there's no convenient way to get full atomic snapshot of the process tree, so "give me the names of all my parents as an array" is definitely somewhere between brittle and unwieldy, in particular considering the hassle of parsing "/proc/*/stat" properly. But I digress, which is also a problem I have with being overly verbose sometimes and weaving enough rope to hang myself with. The point isn't the difficulties of the procfs(5) interface, but that particularly with a medium like the text-based communication we mainly use in this project it behooves the sender to think not only about what they're saying, but how what they're saying is going to be interpreted by the recipient and bystanders alike. I think this is a case where I clearly failed at that, sorry Emily, and thanks Taylor for pointing this out to me. 1. https://lore.kernel.org/git/cover-v2-0.6-00000000000-20210826T121820Z-avarab@gmail.com/ 2. https://lore.kernel.org/git/87o8agp29o.fsf@evledraar.gmail.com/
On Thu, Aug 26, 2021 at 8:22 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > [...] > Let's move all the freeing of the memory to the end of the > function. If we're still at STRBUF_INIT with "name" due to not haven s/haven/having/ > taken the branch where the strbuf_read_file() succeeds freeing it is > redundant, so we could move it into the body of the "if", but just > handling freeing the same way for all branches of the function makes > it more readable. > [...] > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > Subject: Re: [PATCH v2 4/6] tr2: fix memory leak & logic error in 2f732bf15e6 Please remember to write your commit titles to help readers of "git shortlog". This commit fixes get_ancestry_names() function. get_ancestry_names(): leave the parent list empty upon failure or something like that. The rest of the proposed log message looks perfect and so do the changes to the code. > In a subsequent commit I'll be replacing most of this code to log N > parents, but let's first fix bugs introduced in the recent > 2f732bf15e6 (tr2: log parent process name, 2021-07-21). > > It was using the strbuf_read_file() in the wrong way, its return value > is either a length or a negative value on error. If we didn't have a > procfs, or otherwise couldn't access it we'd end up pushing an empty > string to the trace2 ancestry array. > > It was also using the strvec_push() API the wrong way. That API always > does an xstrdup(), so by detaching the strbuf here we'd leak > memory. Let's instead pass in our pointer for strvec_push() to > xstrdup(), and then free our own strbuf. I do have some WIP changes to > make strvec_push_nodup() non-static, which makes this and some other > callsites nicer, but let's just follow the prevailing pattern of using > strvec_push() for now. > > We'll also need to free that "procfs_path" strbuf whether or not > strbuf_read_file() succeeds, which was another source of memory leaks > in 2f732bf15e6, i.e. we'd leak that memory as well if we weren't on a > system where we could read the file from procfs. > > Let's move all the freeing of the memory to the end of the > function. If we're still at STRBUF_INIT with "name" due to not haven > taken the branch where the strbuf_read_file() succeeds freeing it is > redundant, so we could move it into the body of the "if", but just > handling freeing the same way for all branches of the function makes > it more readable. > > In combination with the preceding commit this makes all of > t[0-9]*trace2*.sh pass under SANITIZE=leak on Linux. > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > compat/linux/procinfo.c | 8 +++++--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c > index 62f8aaed4cc..bd01f017bc7 100644 > --- a/compat/linux/procinfo.c > +++ b/compat/linux/procinfo.c > @@ -18,12 +18,14 @@ static void get_ancestry_names(struct strvec *names) > > /* try to use procfs if it's present. */ > strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); > - if (strbuf_read_file(&name, procfs_path.buf, 0)) { > - strbuf_release(&procfs_path); > + if (strbuf_read_file(&name, procfs_path.buf, 0) > 0) { > strbuf_trim_trailing_newline(&name); > - strvec_push(names, strbuf_detach(&name, NULL)); > + strvec_push(names, name.buf); > } > > + strbuf_release(&procfs_path); > + strbuf_release(&name); > + > return; > }
On Thu, Aug 26, 2021 at 02:22:18PM +0200, Ævar Arnfjörð Bjarmason wrote:
> An early re-roll due to some very good & early review by Taylor Blau,
> this also fixes the grammar/typo pointed out by Eric Sunshine. Thanks
> both:
Thanks; aside from the cosmetic issues that Junio and Eric already
brought up, this version looks good to me. I probably would have avoided
4/6 and jumped straight to 6/6, but your approach is good, too.
Thanks for being so receptive to my comments (public and private) in the
earlier round of this series.
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Thanks,
Taylor
A hopefully final re-roll with minor commit message changes. Thanks everyone for the reviews! Ævar Arnfjörð Bjarmason (6): tr2: remove NEEDSWORK comment for "non-procfs" implementations tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux tr2: stop leaking "thread_name" memory tr2: leave the parent list empty upon failure & don't leak memory tr2: do compiler enum check in trace2_collect_process_info() tr2: log N parent process names on Linux compat/linux/procinfo.c | 169 ++++++++++++++++++++++++++++++++++------ trace2/tr2_tls.c | 1 + 2 files changed, 146 insertions(+), 24 deletions(-) Range-diff against v2: 1: 8c649ce3b4 = 1: 306f14a0f7 tr2: remove NEEDSWORK comment for "non-procfs" implementations 2: 0150e3402a = 2: a999e016a9 tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux 3: 1d835d6767 = 3: 45769da953 tr2: stop leaking "thread_name" memory 4: 1aa0dbc394 ! 4: 946140691f tr2: fix memory leak & logic error in 2f732bf15e6 @@ Metadata Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## Commit message ## - tr2: fix memory leak & logic error in 2f732bf15e6 + tr2: leave the parent list empty upon failure & don't leak memory In a subsequent commit I'll be replacing most of this code to log N parents, but let's first fix bugs introduced in the recent @@ Commit message system where we could read the file from procfs. Let's move all the freeing of the memory to the end of the - function. If we're still at STRBUF_INIT with "name" due to not haven + function. If we're still at STRBUF_INIT with "name" due to not having taken the branch where the strbuf_read_file() succeeds freeing it is - redundant, so we could move it into the body of the "if", but just + redundant. So we could move it into the body of the "if", but just handling freeing the same way for all branches of the function makes it more readable. 5: 70fef093d8 = 5: 0bea5aa9c9 tr2: do compiler enum check in trace2_collect_process_info() 6: f6aac90248 = 6: 6eac9986c3 tr2: log N parent process names on Linux -- 2.33.0.736.g68690aaec9a
I'm fairly sure that there is no way on Linux to inspect the process tree without using procfs, any tool such as ps(1), top(1) etc. that shows this sort of information ultimately looks the information up in procfs. So let's remove this comment added in 2f732bf15e6 (tr2: log parent process name, 2021-07-21), it's setting us up for an impossible task. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- compat/linux/procinfo.c | 1 - 1 file changed, 1 deletion(-) diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c index 578fed4cd3..15a89676c7 100644 --- a/compat/linux/procinfo.c +++ b/compat/linux/procinfo.c @@ -25,7 +25,6 @@ static void get_ancestry_names(struct strvec *names) } return; - /* NEEDSWORK: add non-procfs-linux implementations here */ } void trace2_collect_process_info(enum trace2_process_info_reason reason) -- 2.33.0.736.g68690aaec9a
Rewrite a comment added in 2f732bf15e6 (tr2: log parent process name, 2021-07-21) to describe what we might do under TRACE2_PROCESS_INFO_EXIT in the future, instead of vaguely referring to "something extra". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- compat/linux/procinfo.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c index 15a89676c7..62f8aaed4c 100644 --- a/compat/linux/procinfo.c +++ b/compat/linux/procinfo.c @@ -32,8 +32,12 @@ void trace2_collect_process_info(enum trace2_process_info_reason reason) if (!trace2_is_enabled()) return; - /* someday we may want to write something extra here, but not today */ if (reason == TRACE2_PROCESS_INFO_EXIT) + /* + * The Windows version of this calls its + * get_peak_memory_info() here. We may want to insert + * similar process-end statistics here in the future. + */ return; if (reason == TRACE2_PROCESS_INFO_STARTUP) { -- 2.33.0.736.g68690aaec9a
Fix a memory leak introduced in ee4512ed481 (trace2: create new combined trace facility, 2019-02-22), we were doing a free() of other memory allocated in tr2tls_create_self(), but not the "thread_name" "struct strbuf". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- trace2/tr2_tls.c | 1 + 1 file changed, 1 insertion(+) diff --git a/trace2/tr2_tls.c b/trace2/tr2_tls.c index 067c23755f..7da94aba52 100644 --- a/trace2/tr2_tls.c +++ b/trace2/tr2_tls.c @@ -95,6 +95,7 @@ void tr2tls_unset_self(void) pthread_setspecific(tr2tls_key, NULL); + strbuf_release(&ctx->thread_name); free(ctx->array_us_start); free(ctx); } -- 2.33.0.736.g68690aaec9a
In a subsequent commit I'll be replacing most of this code to log N parents, but let's first fix bugs introduced in the recent 2f732bf15e6 (tr2: log parent process name, 2021-07-21). It was using the strbuf_read_file() in the wrong way, its return value is either a length or a negative value on error. If we didn't have a procfs, or otherwise couldn't access it we'd end up pushing an empty string to the trace2 ancestry array. It was also using the strvec_push() API the wrong way. That API always does an xstrdup(), so by detaching the strbuf here we'd leak memory. Let's instead pass in our pointer for strvec_push() to xstrdup(), and then free our own strbuf. I do have some WIP changes to make strvec_push_nodup() non-static, which makes this and some other callsites nicer, but let's just follow the prevailing pattern of using strvec_push() for now. We'll also need to free that "procfs_path" strbuf whether or not strbuf_read_file() succeeds, which was another source of memory leaks in 2f732bf15e6, i.e. we'd leak that memory as well if we weren't on a system where we could read the file from procfs. Let's move all the freeing of the memory to the end of the function. If we're still at STRBUF_INIT with "name" due to not having taken the branch where the strbuf_read_file() succeeds freeing it is redundant. So we could move it into the body of the "if", but just handling freeing the same way for all branches of the function makes it more readable. In combination with the preceding commit this makes all of t[0-9]*trace2*.sh pass under SANITIZE=leak on Linux. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- compat/linux/procinfo.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c index 62f8aaed4c..bd01f017bc 100644 --- a/compat/linux/procinfo.c +++ b/compat/linux/procinfo.c @@ -18,12 +18,14 @@ static void get_ancestry_names(struct strvec *names) /* try to use procfs if it's present. */ strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); - if (strbuf_read_file(&name, procfs_path.buf, 0)) { - strbuf_release(&procfs_path); + if (strbuf_read_file(&name, procfs_path.buf, 0) > 0) { strbuf_trim_trailing_newline(&name); - strvec_push(names, strbuf_detach(&name, NULL)); + strvec_push(names, name.buf); } + strbuf_release(&procfs_path); + strbuf_release(&name); + return; } -- 2.33.0.736.g68690aaec9a
Change code added in 2f732bf15e6 (tr2: log parent process name, 2021-07-21) to use a switch statement without a "default" branch to have the compiler error if this code ever drifts out of sync with the members of the "enum trace2_process_info_reason". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- compat/linux/procinfo.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c index bd01f017bc..0b47d44990 100644 --- a/compat/linux/procinfo.c +++ b/compat/linux/procinfo.c @@ -31,29 +31,30 @@ static void get_ancestry_names(struct strvec *names) void trace2_collect_process_info(enum trace2_process_info_reason reason) { + struct strvec names = STRVEC_INIT; + if (!trace2_is_enabled()) return; - if (reason == TRACE2_PROCESS_INFO_EXIT) + switch (reason) { + case TRACE2_PROCESS_INFO_EXIT: /* * The Windows version of this calls its * get_peak_memory_info() here. We may want to insert * similar process-end statistics here in the future. */ - return; - - if (reason == TRACE2_PROCESS_INFO_STARTUP) { + break; + case TRACE2_PROCESS_INFO_STARTUP: /* * NEEDSWORK: we could do the entire ptree in an array instead, * see compat/win32/trace2_win32_process_info.c. */ - struct strvec names = STRVEC_INIT; - get_ancestry_names(&names); if (names.nr) trace2_cmd_ancestry(names.v); strvec_clear(&names); + break; } return; -- 2.33.0.736.g68690aaec9a
In 2f732bf15e6 (tr2: log parent process name, 2021-07-21) we started logging parent process names, but only logged all parents on Windows. on Linux only the name of the immediate parent process was logged. Extend the functionality added there to also log full parent chain on Linux. This requires us to lookup "/proc/<getppid()>/stat" instead of "/proc/<getppid()>/comm". The "comm" file just contains the name of the process, but the "stat" file has both that information, and the parent PID of that process, see procfs(5). We parse out the parent PID of our own parent, and recursively walk the chain of "/proc/*/stat" files all the way up the chain. A parent PID of 0 indicates the end of the chain. It's possible given the semantics of Linux's PID files that we end up getting an entirely nonsensical chain of processes. It could happen if e.g. we have a chain of processes like: 1 (init) => 321 (bash) => 123 (git) Let's assume that "bash" was started a while ago, and that as shown the OS has already cycled back to using a lower PID for us than our parent process. In the time it takes us to start up and get to trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP) our parent process might exit, and be replaced by an entirely different process! We'd racily look up our own getppid(), but in the meantime our parent would exit, and Linux would have cycled all the way back to starting an entirely unrelated process as PID 321. If that happens we'll just silently log incorrect data in our ancestry chain. Luckily we don't need to worry about this except in this specific cycling scenario, as Linux does not have PID randomization. It appears it once did through a third-party feature, but that it was removed around 2006[1]. For anyone worried about this edge case raising PID_MAX via "/proc/sys/kernel/pid_max" will mitigate it, but not eliminate it. One thing we don't need to worry about is getting into an infinite loop when walking "/proc/*/stat". See 353d3d77f4f (trace2: collect Windows-specific process information, 2019-02-22) for the related Windows code that needs to deal with that, and [2] for an explanation of that edge case. Aside from potential race conditions it's also a bit painful to correctly parse the process name out of "/proc/*/stat". A simpler approach is to use fscanf(), see [3] for an implementation of that, but as noted in the comment being added here it would fail in the face of some weird process names, so we need our own parse_proc_stat() to parse it out. With this patch the "ancestry" chain for a trace2 event might look like this: $ GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version | grep ancestry | jq -r .ancestry [ "bash", "screen", "systemd" ] And in the case of naughty process names like the following. This uses perl's ability to use prctl(PR_SET_NAME, ...). See Perl/perl5@7636ea95c5 (Set the legacy process name with prctl() on assignment to $0 on Linux, 2010-04-15)[4]: $ perl -e '$0 = "(naughty\nname)"; system "GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version"' | grep ancestry | jq -r .ancestry [ "sh", "(naughty\nname)", "bash", "screen", "systemd" ] 1. https://grsecurity.net/news#grsec2110 2. https://lore.kernel.org/git/48a62d5e-28e2-7103-a5bb-5db7e197a4b9@jeffhostetler.com/ 3. https://lore.kernel.org/git/87o8agp29o.fsf@evledraar.gmail.com/ 4. https://github.com/Perl/perl5/commit/7636ea95c57762930accf4358f7c0c2dec086b5e Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- compat/linux/procinfo.c | 149 +++++++++++++++++++++++++++++++++++----- 1 file changed, 132 insertions(+), 17 deletions(-) diff --git a/compat/linux/procinfo.c b/compat/linux/procinfo.c index 0b47d44990..bc2f9382a1 100644 --- a/compat/linux/procinfo.c +++ b/compat/linux/procinfo.c @@ -4,26 +4,145 @@ #include "strvec.h" #include "trace2.h" -static void get_ancestry_names(struct strvec *names) +/* + * We need more complex parsing in stat_parent_pid() and + * parse_proc_stat() below than a dumb fscanf(). That's because while + * the statcomm field is surrounded by parentheses, the process itself + * is free to insert any arbitrary byte sequence its its name. That + * can include newlines, spaces, closing parentheses etc. + * + * See do_task_stat() in fs/proc/array.c in linux.git, this is in + * contrast with the escaped version of the name found in + * /proc/%d/status. + * + * So instead of using fscanf() we'll read N bytes from it, look for + * the first "(", and then the last ")", anything in-between is our + * process name. + * + * How much N do we need? On Linux /proc/sys/kernel/pid_max is 2^15 by + * default, but it can be raised set to values of up to 2^22. So + * that's 7 digits for a PID. We have 2 PIDs in the first four fields + * we're interested in, so 2 * 7 = 14. + * + * We then have 3 spaces between those four values, and we'd like to + * get to the space between the 4th and the 5th (the "pgrp" field) to + * make sure we read the entire "ppid" field. So that brings us up to + * 14 + 3 + 1 = 18. Add the two parentheses around the "comm" value + * and it's 20. The "state" value itself is then one character (now at + * 21). + * + * Finally the maximum length of the "comm" name itself is 15 + * characters, e.g. a setting of "123456789abcdefg" will be truncated + * to "123456789abcdef". See PR_SET_NAME in prctl(2). So all in all + * we'd need to read 21 + 15 = 36 bytes. + * + * Let's just read 2^6 (64) instead for good measure. If PID_MAX ever + * grows past 2^22 we'll be future-proof. We'll then anchor at the + * last ")" we find to locate the parent PID. + */ +#define STAT_PARENT_PID_READ_N 64 + +static int parse_proc_stat(struct strbuf *sb, struct strbuf *name, + int *statppid) { + const char *comm_lhs = strchr(sb->buf, '('); + const char *comm_rhs = strrchr(sb->buf, ')'); + const char *ppid_lhs, *ppid_rhs; + char *p; + pid_t ppid; + + if (!comm_lhs || !comm_rhs) + goto bad_kernel; + /* - * NEEDSWORK: We could gather the entire pstree into an array to match - * functionality with compat/win32/trace2_win32_process_info.c. - * To do so, we may want to examine /proc/<pid>/stat. For now, just - * gather the immediate parent name which is readily accessible from - * /proc/$(getppid())/comm. + * We're at the ")", that's followed by " X ", where X is a + * single "state" character. So advance by 4 bytes. */ + ppid_lhs = comm_rhs + 4; + + /* + * Read until the space between the "ppid" and "pgrp" fields + * to make sure we're anchored after the untruncated "ppid" + * field.. + */ + ppid_rhs = strchr(ppid_lhs, ' '); + if (!ppid_rhs) + goto bad_kernel; + + ppid = strtol(ppid_lhs, &p, 10); + if (ppid_rhs == p) { + const char *comm = comm_lhs + 1; + size_t commlen = comm_rhs - comm; + + strbuf_add(name, comm, commlen); + *statppid = ppid; + + return 0; + } + +bad_kernel: + /* + * We were able to read our STAT_PARENT_PID_READ_N bytes from + * /proc/%d/stat, but the content is bad. Broken kernel? + * Should not happen, but handle it gracefully. + */ + return -1; +} + +static int stat_parent_pid(pid_t pid, struct strbuf *name, int *statppid) +{ struct strbuf procfs_path = STRBUF_INIT; - struct strbuf name = STRBUF_INIT; + struct strbuf sb = STRBUF_INIT; + FILE *fp; + int ret = -1; /* try to use procfs if it's present. */ - strbuf_addf(&procfs_path, "/proc/%d/comm", getppid()); - if (strbuf_read_file(&name, procfs_path.buf, 0) > 0) { - strbuf_trim_trailing_newline(&name); - strvec_push(names, name.buf); - } + strbuf_addf(&procfs_path, "/proc/%d/stat", pid); + fp = fopen(procfs_path.buf, "r"); + if (!fp) + goto cleanup; + + /* + * We could be more strict here and assert that we read at + * least STAT_PARENT_PID_READ_N. My reading of procfs(5) is + * that on any modern kernel (at least since 2.6.0 released in + * 2003) even if all the mandatory numeric fields were zero'd + * out we'd get at least 100 bytes, but let's just check that + * we got anything at all and trust the parse_proc_stat() + * function to handle its "Bad Kernel?" error checking. + */ + if (!strbuf_fread(&sb, STAT_PARENT_PID_READ_N, fp)) + goto cleanup; + if (parse_proc_stat(&sb, name, statppid) < 0) + goto cleanup; + ret = 0; +cleanup: + if (fp) + fclose(fp); strbuf_release(&procfs_path); + strbuf_release(&sb); + + return ret; +} + +static void push_ancestry_name(struct strvec *names, pid_t pid) +{ + struct strbuf name = STRBUF_INIT; + int ppid; + + if (stat_parent_pid(pid, &name, &ppid) < 0) + goto cleanup; + + strvec_push(names, name.buf); + + /* + * Both errors and reaching the end of the process chain are + * reported as fields of 0 by proc(5) + */ + if (ppid) + push_ancestry_name(names, ppid); +cleanup: strbuf_release(&name); return; @@ -45,11 +164,7 @@ void trace2_collect_process_info(enum trace2_process_info_reason reason) */ break; case TRACE2_PROCESS_INFO_STARTUP: - /* - * NEEDSWORK: we could do the entire ptree in an array instead, - * see compat/win32/trace2_win32_process_info.c. - */ - get_ancestry_names(&names); + push_ancestry_name(&names, getppid()); if (names.nr) trace2_cmd_ancestry(names.v); -- 2.33.0.736.g68690aaec9a
On Fri, Aug 27, 2021 at 10:02:12AM +0200, Ævar Arnfjörð Bjarmason wrote:
> Range-diff against v2:
> 1: 8c649ce3b4 = 1: 306f14a0f7 tr2: remove NEEDSWORK comment for "non-procfs" implementations
> 2: 0150e3402a = 2: a999e016a9 tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux
> 3: 1d835d6767 = 3: 45769da953 tr2: stop leaking "thread_name" memory
> 4: 1aa0dbc394 ! 4: 946140691f tr2: fix memory leak & logic error in 2f732bf15e6
> @@ Metadata
> Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>
> ## Commit message ##
> - tr2: fix memory leak & logic error in 2f732bf15e6
> + tr2: leave the parent list empty upon failure & don't leak memory
I agree with Junio that the earlier subject line could be improved, but
the replacement is a little verbose for my taste. On the other hand, I'm
not sure how to simplify it without it becoming:
tr2: memory leaks and bug fixes
;-).
So I think that what you wrote here is perfectly good. This version
looks good to me, too.
Thanks,
Taylor