From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff Hostetler <git@jeffhostetler.com>
Cc: Josh Steadmon <steadmon@google.com>, git@vger.kernel.org
Subject: Re: [PATCH 2/2] trace2: randomize/timestamp trace2 targets
Date: Fri, 15 Mar 2019 20:26:42 +0100
Message-ID: <87sgvo9bwt.fsf@evledraar.gmail.com>
In-Reply-To: <1431dc76-1b1c-c581-6355-b796591e99a8@jeffhostetler.com>


On Fri, Mar 15 2019, Jeff Hostetler wrote:

> On 3/13/2019 7:49 PM, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Thu, Mar 14 2019, Josh Steadmon wrote:
>>
>>> When the value of a trace2 environment variable contains instances of
>>> the string "%ISO8601%", expand them into the current UTC timestamp in
>>> ISO 8601 format.
>>
>> Any reason not to just support feeding the path to strbuf_addftime(), to
>> e.g. support a daily/hourly log?
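
A minimal sketch of that idea, in Python rather than git's C: run the configured path through strftime so the target rolls over hourly (or daily). The name expand_trace_target and the example path are made up for illustration, and strbuf_addftime() itself is not called here.

```python
from datetime import datetime, timezone

def expand_trace_target(template: str, now: datetime) -> str:
    """Expand strftime-style placeholders in a trace target path."""
    # Convert to UTC so every process agrees on where an hour/day begins.
    return now.astimezone(timezone.utc).strftime(template)

# Hourly log: all commands started in the same UTC hour share one path.
when = datetime(2019, 3, 15, 19, 26, 42, tzinfo=timezone.utc)
print(expand_trace_target("/var/log/git/trace2-%Y%m%d-%H.log", when))
```

A path without any placeholders comes back unchanged, so existing settings would keep working under this scheme.
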
>>
>>> When the value of a trace2 environment variable is an absolute path
>>> referring to an existing directory, write output to randomly-named
>>> files under the given directory. If the value is an absolute path
>>> referring to a non-existent file and ends with a dash, use the value as
>>> a prefix for randomly named files.
>>>
>>> The random filenames will consist of the value of the environment
>>> variable (after potential timestamp expansion), followed by a 6
>>> character random string such as would be produced by mkstemp(3).
>>>
>>> This makes it more convenient to collect traces for every git
>>> invocation by unconditionally setting the relevant trace2 envvar to a
>>> constant directory name.
>>
>> Hrm, api-trace2.txt already specifies that the "sid" is going to be
>> unique, couldn't we just have some mode where we use that?
>>
>> But then of course when we have nested processes it will contain slashes,
>> so we'd either run into deep nesting or need to munge the slashes, in
>> which case we might bump against a file length limit (although I haven't
>> seen process trees deeper than 3-4).
>
> Using the "sid" would be a good place to start.  Just take the final
> component in the string (after the last slash or the whole sid if there
> are no slashes).  That will give you a filename with microseconds since
> epoch of the command's start time and the PID.
>
> That should be unique, should not require random strings, and not go
> deep in the filesystem.  And it will let you correlate files between
> child and parent commands, if you need to.
>
> So maybe if GIT_TR2_* is set to a directory, we append the final portion
> of the "sid" and create a file inside that directory.
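
Roughly, that rule could look like the sketch below. It is Python for brevity, not a patch; trace_file_for_sid, the directory, and the sample sids are hypothetical, with the sids only imitating the "microseconds-since-epoch plus PID" shape described above.

```python
import os

def trace_file_for_sid(target_dir: str, sid: str) -> str:
    """Build a per-command trace path from the last component of a sid."""
    # rpartition() yields the whole sid as the last item when there is no slash.
    leaf = sid.rpartition("/")[2]
    if not leaf:
        raise ValueError("sid has an empty final component: %r" % sid)
    return os.path.join(target_dir, leaf)

# Hypothetical sids: a top-level command and a child it spawned.
parent = "1552678002000000-4321"
child = parent + "/1552678002500000-4330"
print(trace_file_for_sid("/var/tmp/tr2", parent))
print(trace_file_for_sid("/var/tmp/tr2", child))
```

Both files land directly in the target directory, so nesting depth never shows up in the filesystem. The parent/child relationship is still recoverable as long as each file records the full sid.
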
>
>>
>> Just to pry about the use-case since I'm doing similar collecting, why
>> are you finding this easier to process?
>>
>> With the current O_APPEND semantics you're (unless I've missed
>> something) guaranteed to get a single process tree in nested order,
>> whereas with this they'll all end up in separate files and you'll need
>> to slurp them up, sort the whole thing and stitch it together yourself
>> without the benefit of stream-parsing it where you can cheat a bit
>> knowing that e.g. a "reflog expire" entry is always coming after the
>> corresponding "gc" that invoked it.
>>
>
> Yes, with O_APPEND, you should get a series of events as they happen
> on the system all properly interleaved.  And see concurrent activity.
> This file should let you grep to see individual processes if you want
> to.
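
To make the single-file case concrete, here is a small sketch, assuming a made-up "PID message" line format rather than any real trace2 output. Each event is appended with O_APPEND in a single write, and one process is picked out afterwards the way grep would.

```python
import os
import tempfile

def append_event(path: str, line: str) -> None:
    """Append one event line to a shared trace file."""
    # With O_APPEND the kernel positions every write at the current end of file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        # One write per event, so a short line is not split by this code.
        os.write(fd, (line + "\n").encode("utf-8"))
    finally:
        os.close(fd)

with tempfile.TemporaryDirectory() as tmp:
    log = os.path.join(tmp, "trace.log")
    append_event(log, "4321 start gc")
    append_event(log, "4330 start reflog expire")
    append_event(log, "4321 exit")
    # Equivalent of grepping the shared file for a single process.
    with open(log, encoding="utf-8") as fh:
        gc_only = [l.rstrip("\n") for l in fh if l.startswith("4321 ")]
    print(gc_only)
```
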
>
> Routing each command to a different file is fine if you want, but
> that opens you up to having to manage and delete them.
>
> Whether to have 1 file (with occasional rotation) or 1 file-per-command
> depends, I guess, on how you want to process them.
>
> I'm routing the Trace2 data to a named-pipe/socket and have a daemon
> collecting and filtering, so I have a single pathname for output and
> yet get the per-file stream handling that I think Josh is looking for.

Is the collecting code something you can share & general enough that it
might be useful for others?
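
For reference, the demultiplexing half of such a collector might be as small as the sketch below. It is not Jeff's daemon, only an illustration. It assumes one JSON object per line carrying a "sid" key, and demux_by_sid and the sample events are invented here. The lines argument could be a file object opened on the named pipe.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List

def demux_by_sid(lines: Iterable[str]) -> Dict[str, List[dict]]:
    """Group JSON event lines by their "sid" field, keeping arrival order."""
    streams = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between events
        event = json.loads(line)
        streams[event["sid"]].append(event)
    return dict(streams)

# Invented sample: a parent command interleaved with one child.
sample = [
    '{"sid": "100-1", "event": "start"}',
    '{"sid": "100-1/200-2", "event": "start"}',
    '{"sid": "100-1/200-2", "event": "exit"}',
    '{"sid": "100-1", "event": "exit"}',
]
for sid, events in demux_by_sid(sample).items():
    print(sid, [e["event"] for e in events])
```
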

