From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff Hostetler <git@jeffhostetler.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Jeff Hostetler <jeffhost@microsoft.com>,
	Jeff Hostetler via GitGitGadget <gitgitgadget@gmail.com>
Subject: Re: [RFC PATCH] trace2 API: don't save a copy of constant "thread_name"
Date: Tue, 11 Oct 2022 14:52:11 +0200
Message-ID: <221011.86lepmo5dn.gmgdl@evledraar.gmail.com>
In-Reply-To: <cb9f8321-d9e6-6f80-a590-a9ad49c7f557@jeffhostetler.com>


On Mon, Oct 10 2022, Jeff Hostetler wrote:

> On 10/6/22 9:10 PM, Ævar Arnfjörð Bjarmason wrote:
>> Since ee4512ed481 (trace2: create new combined trace facility,
>> 2019-02-22) the "thread_name" member of "struct tr2tls_thread_ctx" has
>> been copied from the caller, but those callers have always passed a
>> constant string:
>> 	$ git -P grep '^\s*trace2_thread_start\('
>> 	Documentation/technical/api-trace2.txt: trace2_thread_start("preload_thread");
>> 	builtin/fsmonitor--daemon.c:    trace2_thread_start("fsm-health");
>> 	builtin/fsmonitor--daemon.c:    trace2_thread_start("fsm-listen");
>> 	compat/simple-ipc/ipc-unix-socket.c:    trace2_thread_start("ipc-worker");
>> 	compat/simple-ipc/ipc-unix-socket.c:    trace2_thread_start("ipc-accept");
>> 	compat/simple-ipc/ipc-win32.c:  trace2_thread_start("ipc-server");
>> 	t/helper/test-fsmonitor-client.c:       trace2_thread_start("hammer");
>> 	t/helper/test-simple-ipc.c:     trace2_thread_start("multiple");
>> This isn't needed for optimization, but apparently[1] there's been
>> some confusion about the non-const-ness of the previous "struct
>> strbuf".
>> Using the caller's string here makes this more straightforward, as
>> it's now clear that we're not dynamically constructing these. It's
>> also what the progress API does with its "title" string.
>> Since we know we're hardcoding these thread names let's BUG() out when
>> we see that the length of the name plus the length of the prefix would
>> exceed the maximum length for the "perf" format.
>> 1. https://lore.kernel.org/git/82f1672e180afcd876505a4354bd9952f70db49e.1664900407.git.gitgitgadget@gmail.com/
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>
> PLEASE DON'T DO THIS.
>
> If you don't like my patch, fine.  Let's discuss it.  But DON'T submit
> a new one to replace it.  Or worse, try to inject it into the middle
> of an existing series.

I'm not seeking to replace your series, or to tick you off, sorry if it
came across like that.

I just thought (and still think) that we were at a point in the
discussion where it seemed clear that I wasn't quite managing to get
across to you what I meant, so sending that in the form of working code
should clarify things.

Per Junio's "That's nice to learn, indeed." in
<xmqqo7uoh1q0.fsf@gitster.g> it seems to have had that intended effect
on him. It's marked as an RFC, so not-a-thing-to-pick-up, but just for
discussion.

> Yes, current callers are passing a string literal and thread-start
> could take a "const char*" to it, but there is no way to guarantee
> that that is safe if someone decides to dynamically construct their
> thread-name and pass it in (since we don't know the lifetime of that
> pointer).  So it is safer to copy it into the thread context so that
> it can be used by later trace messages.

I think that's a defensible opinion, but I also think it's fair to say
that:

 * This seems to be *the* motivation for why you're doing things the way
   you're doing them, and at least to this reviewer that wasn't really
   coming across...

 * ...nor the context of why we'd need that sort of guarded API in this
   case, but not e.g. for another widely-used API like start_progress().

   See 791afae2924 (progress.c tests: make start/stop commands on stdin,
   2022-02-03) for a case where we're using that where we need to work
   around its behavior (and no, I didn't make the underlying API that
   way, it's just a commit of mine where I'm having to work with it).

I think designing our internal APIs to not be quite so guarded is fine,
and we do that in various other contexts (progress, etc.). We control
both the API and its users, so just leaving a "this must be a constant"
should be enough.

But even if you want to be paranoid about it there are much easier ways
to do that, which give you more of the safety you seem to want. E.g. this
on top of master (and easily adjusted on top of this RFC patch):
	
	diff --git a/trace2.h b/trace2.h
	index 88d906ea830..1c3a98fb30f 100644
	--- a/trace2.h
	+++ b/trace2.h
	@@ -306,12 +306,18 @@ void trace2_exec_result_fl(const char *file, int line, int exec_id, int code);
	  *
	  * Thread names should be descriptive, like "preload_index".
	  * Thread names will be decorated with an instance number automatically.
	+ * Thread names must point to data that won't change after it's passed
	+ * into this function. Once trace2_thread_exit() is called it can be
	+ * free'd.
	  */
	 void trace2_thread_start_fl(const char *file, int line,
	 			    const char *thread_name);
	 
	+/*
	+ * The "" is to assure us that API users pass only constant strings
	+ */
	 #define trace2_thread_start(thread_name) \
	-	trace2_thread_start_fl(__FILE__, __LINE__, (thread_name))
	+	trace2_thread_start_fl(__FILE__, __LINE__, (thread_name ""))
	 
	 /*
	  * Emit a 'thread_exit' event.  This must be called from inside the

This will compile, as we only pass it string literals, but if someone
were to pass a variable it'll fail to compile, at which point we could
provide some inline macro/function that would do the required xstrdup().
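
To show the trick in isolation, here's a minimal standalone sketch. The
demo_* names are made up for this example and aren't part of trace2; the
stand-in function only records the pointer it's given:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for trace2_thread_start_fl(): records the name. */
static const char *last_thread_name;

static void demo_thread_start_fl(const char *file, int line,
				 const char *thread_name)
{
	assert(thread_name);
	(void)file;
	(void)line;
	last_thread_name = thread_name; /* kept as-is, no copy */
}

/*
 * Adjacent string literals are concatenated at translation time, so
 * ("ipc-worker" "") is just "ipc-worker". An argument that isn't a
 * string literal, e.g. (buf.buf ""), is a syntax error.
 */
#define demo_thread_start(thread_name) \
	demo_thread_start_fl(__FILE__, __LINE__, (thread_name ""))

static const char *demo_start_worker(void)
{
	demo_thread_start("ipc-worker"); /* compiles: a string literal */
	/* demo_thread_start(some_variable); would not compile */
	return last_thread_name;
}
```

It isn't airtight, e.g. a macro that expands to a string literal is
accepted too, but that's still a constant string, so the lifetime concern
doesn't apply there.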

All of which I think is *still* being too paranoid, but which I think
*if* you want the paranoia is much more explicit about what we're trying
to accomplish with said paranoia, and where the compiler will help you.

> [...]
>> +void jw_strbuf_add_thread_name(struct strbuf *buf, const char *thread_name,
>> +			       int thread_id);
>> +void jw_object_string_thread(struct json_writer *jw, const char *thread_name,
>> +			     int thread_id);
>
> This violates a separation of concerns.  json-writer is ONLY concerned
> with formatting valid JSON from basic data types.  It does not know
> about threads or thread contexts.
>
> `js_strbuf_add_thread_name()` also violates the json-writer conventions
> -- that it takes a "struct json_writer *" pointer.  There is nothing
> about JSON here.
>
> You might write a helper (inside of tr2_tgt_event.c) that formats a
> thread-name from the id and hint, but that is specific to the Event
> target -- not to JSON, nor the JSON writer.

That's fair, more on that below.

> But then again, why make every trace message from every target format
> that "th%0d:%s" when we could save some time and format it in the
> thread-start and just USE it.

If you actually care about this being faster -- and the only reason for
posting this RFC patch is to try to tease out *why* that is -- then this
part of your concern can be trivially mitigated by having a struct
member like:

	char thread_id_str[3];

We'd then just snprintf() into that in tr2tls_create_self(). Then when
we print the thread to the JSON or log you'd do so without any
strbuf_addf(), just a strbuf_addstr() or strbuf_add().

I think that micro-optimization isn't needed in this case, but it *is*
easy to do.
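
Roughly like this, as a standalone sketch rather than a patch. The demo_*
names and the "th<id>:<name>" layout are assumptions for illustration, and
plain char buffers stand in for "struct strbuf". I've also sized the member
generously here, since a char[3] only has room for two digits plus the NUL:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for "struct tr2tls_thread_ctx". */
struct demo_thread_ctx {
	const char *thread_name; /* caller's constant string, not copied */
	int thread_id;
	char thread_id_str[16];  /* formatted once, at creation time */
};

/* Do the only formatted write up front, when the context is created. */
static void demo_ctx_init(struct demo_thread_ctx *ctx,
			  const char *thread_name, int thread_id)
{
	int n;

	ctx->thread_name = thread_name;
	ctx->thread_id = thread_id;
	n = snprintf(ctx->thread_id_str, sizeof(ctx->thread_id_str),
		     "%02d", thread_id);
	assert(n > 0 && (size_t)n < sizeof(ctx->thread_id_str));
}

/* Append "s" at "pos" with a plain copy; returns the new length. */
static size_t demo_append(char *out, size_t outsz, size_t pos, const char *s)
{
	size_t len = strlen(s);

	assert(pos + len < outsz);
	memcpy(out + pos, s, len + 1); /* includes the NUL */
	return pos + len;
}

/* Emit "th<id>:<name>" per message without any printf-style formatting. */
static size_t demo_ctx_emit(const struct demo_thread_ctx *ctx,
			    char *out, size_t outsz)
{
	size_t pos = 0;

	pos = demo_append(out, outsz, pos, "th");
	pos = demo_append(out, outsz, pos, ctx->thread_id_str);
	pos = demo_append(out, outsz, pos, ":");
	return demo_append(out, outsz, pos, ctx->thread_name);
}
```

The per-message cost is then a handful of memcpy() calls, and the name
itself is still just the caller's pointer.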

> [...]
>> @@ -107,9 +109,11 @@ static void perf_fmt_prepare(const char *event_name,
>>   	}
>>     	strbuf_addf(buf, "d%d | ", tr2_sid_depth());
>> -	strbuf_addf(buf, "%-*s | %-*s | ", TR2_MAX_THREAD_NAME,
>> -		    ctx->thread_name.buf, TR2FMT_PERF_MAX_EVENT_NAME,
>> -		    event_name);
>> +	oldlen = buf->len;
>> +	jw_strbuf_add_thread_name(buf, ctx->thread_name, ctx->thread_id);
>
> This stands out as very wrong.  The _Perf target does not use JSON
> at all, yet here we are calling a jw_ routine.  Again, that code is
> in the wrong place.
>
> I'm going to clip the rest of this commit, since the above invalidates
> it.

A helper function being in the wrong place invalidates the whole commit?

I think you're right that this jw_strbuf_add_thread_name() helper should
live somewhere else, probably in thread-utils.c.

So, pretending that it's in whatever place you'd be comfortable with,
and using whatever naming convention you'd prefer, what do you think
about the rest of the commit?

You snipped it just as you were getting to the meaty part of it, namely:

 * With this approach we can BUG() out as soon as we try to construct
   the main thread if its name is bad, we don't need to wait until
   runtime when a child thread runs into the limit.

 * We no longer need the whole thread-creation-time string duplication,
   associated storage in the struct etc.

 * That struct member is "const", addressing your initial concern of
   (from the upthread commit message):

	Using a (non-const) `strbuf` structure for it caused some
	confusion in the past because it implied that someone could
	rename a thread after it was created.  That usage was not
	intended.

   Although I think (and I'm possibly misreading it) that your
   commentary here is saying that even that's not enough, i.e. we can't
   just leave it at a "const" here, but must assume that an API user
   will disregard that and modify it after it's passed to us anyway.
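
For the first point, the check itself amounts to something like the
following sketch. The DEMO_* constants and their values are assumptions
for illustration (not the real trace2 definitions), and the caller would
BUG() out when this returns false:

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_THREAD_NAME 24     /* assumed "perf" column width */
#define DEMO_THREAD_PREFIX "th00:"  /* assumed shape of the prefix */

/*
 * Returns 1 if the prefix plus the hardcoded name fits in the column,
 * 0 otherwise. Only lengths are inspected, nothing is copied.
 */
static int demo_thread_name_fits(const char *thread_name)
{
	assert(thread_name);
	return strlen(DEMO_THREAD_PREFIX) + strlen(thread_name) <=
	       DEMO_MAX_THREAD_NAME;
}
```

Since the names are hardcoded, a name that's too long would trip this on
every run rather than only when a particular thread happens to start.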
