From: Jeff Hostetler <git@jeffhostetler.com>
To: Jonathan Nieder <jrnieder@gmail.com>,
	Jeff Hostetler via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, jeffhost@microsoft.com,
	Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 0/8] WIP: trace2: a new trace facility
Date: Tue, 15 Jan 2019 12:03:55 -0500
Message-ID: <5df8db4b-c772-c202-5f58-58e9498f6bc0@jeffhostetler.com>
In-Reply-To: <20190115010528.GJ162110@google.com>



On 1/14/2019 8:05 PM, Jonathan Nieder wrote:
> Hi,
> 
> Jeff Hostetler wrote:
> 
>> This patch series contains a new trace2 facility that hopefully addresses
>> the recent trace- and structured-logging-related discussions. The intent is
>> to eventually replace the existing trace_ routines (or to route them to the
>> new trace2_ routines) as time permits.
> 
> I've been running with these patches since last October.  A few
> thoughts:
> 
> I like the API.

Great, thanks.  Hopefully you're getting some good/actionable data from
it.


> The logs are a bit noisy and especially wide.  For my use, the
> function name is not too important since we can get that from the file
> and line number.  Should we have a way to omit some fields, or is that
> for post-processing?

Yes, the events are a little wide and noisy, at least in this draft.

Part of this is to flesh out the trace2 API (which should be relatively
fixed) and make sure we have enough event types to emit useful
information.  This is independent of some of the detail events (like
region/data events within status or index reading/writing).  Some of
those detail events might be kept if they prove useful; others are
temporary demonstration events, or events you could include in a
private build for a limited period of time.  So some of the noise might
come from those demonstration events (stuff that you'd want for testing
in a perf view, but wouldn't need archived, for example).

Also, for the events that have a "category" field, I'd eventually
like to have a filter setting to include/omit them.  This is something
like the GIT_TRACE_<name> feature we currently have, but limited to
always writing to the same file.  I had this in an earlier version,
but haven't brought it over yet.
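
To make the category idea concrete, here is a minimal sketch of such an
include/omit filter.  It is written in Python only for brevity (the real
trace2 code is C), and everything about it is an assumption made for
illustration: the comma-separated spec syntax, the leading '-' for
"omit", the "category" key, and the function names are hypothetical and
do not come from the patches.

```python
def parse_category_filter(spec):
    """Split a comma-separated spec into (include, exclude) sets.

    A leading '-' marks a category to omit.  This syntax is assumed
    for the sketch; it is not defined anywhere in the patch series.
    """
    include, exclude = set(), set()
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if item.startswith("-"):
            exclude.add(item[1:])
        else:
            include.add(item)
    return include, exclude


def want_event(event, include, exclude):
    """Return True if the event should be written to the trace file."""
    category = event.get("category")
    if category is None:
        return True  # events without a category are never filtered
    if category in exclude:
        return False
    # An empty include set means "everything not explicitly omitted".
    return not include or category in include
```

With a spec such as "index,-status", events in the "index" category are
kept, "status" events are dropped, and events with no category pass
through untouched, all into the same output file.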

And yes, I have a post-processing step that filters fields and
generates a summary record for each process instance.  My previous
draft tried to do that summary inside the git.exe process and it was
suggested that we move that out, so this version emits the raw data
as it occurs and I get the summary after the fact.  This has turned
out nicely, even if the Trace2 stream is a little noisy.
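
As a rough illustration of that kind of after-the-fact step, the sketch
below reads raw events, drops fields the summary does not need, and
builds one summary per process instance.  It is not the actual
post-processing tool.  It assumes one JSON object per line, and the
field names "sid" (process instance id), "event", "file" and "line" are
assumptions chosen for the example.

```python
import json
from collections import Counter, defaultdict

# Fields the summary does not use (names assumed for this sketch).
DROP_FIELDS = {"file", "line"}


def strip_fields(event, drop=DROP_FIELDS):
    """Return a copy of the event without the unwanted fields."""
    return {key: value for key, value in event.items() if key not in drop}


def summarize(lines):
    """Count events by type for each process instance.

    `lines` is any iterable of strings, one JSON event per line.
    Returns {process id: {event name: count}}.
    """
    per_process = defaultdict(Counter)
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines in the stream
        event = strip_fields(json.loads(raw))
        sid = event.get("sid", "unknown")
        per_process[sid][event.get("event", "unknown")] += 1
    return {sid: dict(counts) for sid, counts in per_process.items()}
```

Because the summary is computed outside the traced process, the stream
itself can stay raw and a little noisy; the filter decides what is worth
keeping.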

There are some fields that I'd like to omit from my JSON stream that
I'm not using in my summary, such as the filename and line number.
These got carried along since the PERF view needed them.  I think they
make sense in the PERF view, but not so much in the EVENT view.
I'm filtering them out in my post-processing, but I think we could
just omit them.


> We don't find the JSON easy to parse and would prefer a binary format.

I'm going to have to push back a little on this one.  JSON is easy to
process in Perl, C#, various databases, etc.  Processing a non-text
format in bash is just asking for pain and suffering.

Can you elaborate on the problems you're having with JSON?

When you say "binary" what kind of binary do you mean?  Is this BSON?
Or are you suggesting protocol buffers?  If the latter, is there a C
binding for that? (Every example I've seen talks about C++.)

In my gvfs-trace2-v4 branch, I've refactored the code and now have
a vtable-like mechanism that allows multiple Trace2 "targets" to be
defined.  See trace2/tr2_tgt_perf.c vs trace2/tr2_tgt_events.c.  The
former generates the GIT_TR2_PERF view and the latter generates the
JSON event view.

You could add a self-contained target vtable that generates a binary
view if you wanted.  (Just let it key off of a different GIT_TR2_
environment variable.)
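
The shape of that idea, sketched loosely: each target supplies its own
handlers for the same set of events and is switched on by its own
environment variable.  The sketch is in Python rather than C and does
not mirror the code in trace2/.  The class names, the handler names,
the EXAMPLE_TR2_* variables and the binary record layout are all
hypothetical.

```python
import struct


class Target:
    """Base 'vtable': one no-op handler per event type."""

    env_var = ""  # environment variable that switches this target on

    def __init__(self):
        self.out = []  # records emitted so far (stand-in for a file)

    def on_start(self, argv):
        pass

    def on_exit(self, code):
        pass


class TextTarget(Target):
    env_var = "EXAMPLE_TR2_TEXT"  # hypothetical name

    def on_start(self, argv):
        self.out.append("start " + " ".join(argv))

    def on_exit(self, code):
        self.out.append("exit %d" % code)


class BinaryTarget(Target):
    env_var = "EXAMPLE_TR2_BINARY"  # hypothetical name

    def on_start(self, argv):
        payload = "\0".join(argv).encode("utf-8")
        # tag byte 1, little-endian payload length, then the payload
        self.out.append(b"\x01" + struct.pack("<I", len(payload)) + payload)

    def on_exit(self, code):
        # tag byte 2, then the exit code as a little-endian int32
        self.out.append(b"\x02" + struct.pack("<i", code))


REGISTRY = [TextTarget, BinaryTarget]


def enabled_targets(environ):
    """Instantiate every registered target whose variable is set."""
    return [cls() for cls in REGISTRY if environ.get(cls.env_var)]


def dispatch(targets, handler, *args):
    """Send one event to every enabled target."""
    for target in targets:
        getattr(target, handler)(*args)
```

A new view only has to add one self-contained class to the registry;
the call sites that raise events do not change.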


> When I apply the patches, Git complains about whitespace problems
> (trailing whitespace, etc).
> 
> Aside from that kind of easily correctible issue (trailing
> whitespace), I'd be in favor of taking these patches pretty much as-is
> and making improvements in tree.  Any objections to that, or do you
> have other thoughts on where this should go?
> 
> If that sounds reasonable to you, I can send a clean version of these
> based against current "master".  If I understand correctly, then
> 
>   https://github.com/jeffhostetler/git
> 
> branch
> 
>   gvfs-trace2-v4
> 
> contains some improvements, so as a next step I'd try to extract those
> as incremental patches on top.  What do you think?
> 
> Thanks,
> Jonathan

The gvfs-trace2-v4 version has lots of improvements over the version
I last posted on the mailing list.  We should go with it.

I'm not surprised that there are merge conflicts, since mine is based
upon the recent GVFS release and has some gvfs-specific commits in it.

Let me rebase that branch onto upstream/master, clean up the mess, and
send out another patch set.

Hopefully, I can get that out tomorrow.

Jeff
