git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff Hostetler <git@jeffhostetler.com>
Cc: Jeff Hostetler via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: [PATCH 2/9] trace2: convert tr2tls_thread_ctx.thread_name from strbuf to char*
Date: Mon, 20 Dec 2021 20:35:43 +0100	[thread overview]
Message-ID: <211220.86ilvjt5y8.gmgdl@evledraar.gmail.com> (raw)
In-Reply-To: <289f525b-93b8-6f33-e0fc-3b24fa8a54ea@jeffhostetler.com>


On Mon, Dec 20 2021, Jeff Hostetler wrote:

> On 12/20/21 11:31 AM, Ævar Arnfjörð Bjarmason wrote:
>> On Mon, Dec 20 2021, Jeff Hostetler via GitGitGadget wrote:
>> 
>>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>>
>>> Use a 'char *' to hold the thread name rather than a 'struct strbuf'.
>>> The thread name is set when the thread is created and should not be
>>> be modified afterwards.  Replace the strbuf with an allocated pointer
>>> to make that more clear.
>>>
>>> This was discussed in: https://lore.kernel.org/all/xmqqa6kdwo24.fsf@gitster.g/
>>>...
>>> diff --git a/trace2/tr2_tls.c b/trace2/tr2_tls.c
>>> index 7da94aba522..cd8b9f2f0a0 100644
>>> --- a/trace2/tr2_tls.c
>>> +++ b/trace2/tr2_tls.c
>>> @@ -35,6 +35,7 @@ struct tr2tls_thread_ctx *tr2tls_create_self(const char *thread_name,
>>>   					     uint64_t us_thread_start)
>>>   {
>>>   	struct tr2tls_thread_ctx *ctx = xcalloc(1, sizeof(*ctx));
>>> +	struct strbuf buf_name = STRBUF_INIT;
>>>     	/*
>>>   	 * Implicitly "tr2tls_push_self()" to capture the thread's start
>>> @@ -47,12 +48,13 @@ struct tr2tls_thread_ctx *tr2tls_create_self(const char *thread_name,
>>>     	ctx->thread_id = tr2tls_locked_increment(&tr2_next_thread_id);
>>>   -	strbuf_init(&ctx->thread_name, 0);
>>>   	if (ctx->thread_id)
>>> -		strbuf_addf(&ctx->thread_name, "th%02d:", ctx->thread_id);
>>> -	strbuf_addstr(&ctx->thread_name, thread_name);
>>> -	if (ctx->thread_name.len > TR2_MAX_THREAD_NAME)
>>> -		strbuf_setlen(&ctx->thread_name, TR2_MAX_THREAD_NAME);
>>> +		strbuf_addf(&buf_name, "th%02d:", ctx->thread_id);
>>> +	strbuf_addstr(&buf_name, thread_name);
>>> +	if (buf_name.len > TR2_MAX_THREAD_NAME)
>>> +		strbuf_setlen(&buf_name, TR2_MAX_THREAD_NAME);
>>> +
>>> +	ctx->thread_name = strbuf_detach(&buf_name, NULL);
>>>     	pthread_setspecific(tr2tls_key, ctx);
>>>   
>>>..
>>> diff --git a/trace2/tr2_tls.h b/trace2/tr2_tls.h
>>> index a90bd639d48..d968da6a679 100644
>>> --- a/trace2/tr2_tls.h
>>> +++ b/trace2/tr2_tls.h
>>> @@ -9,7 +9,7 @@
>>>   #define TR2_MAX_THREAD_NAME (24)
>>>     struct tr2tls_thread_ctx {
>>> -	struct strbuf thread_name;
>>> +	char *thread_name;
>>>   	uint64_t *array_us_start;
>>>   	size_t alloc;
>>>   	size_t nr_open_regions; /* plays role of "nr" in ALLOC_GROW */
>> Junio's suggestion in the linked E-Mail was to make this a "const
>> char *".
>
> Yes, it was.  To me a "const char *" in a structure means that
> the structure does not own the pointer and must not free it.
> Whereas as "char *" means that the structure might own it and
> should maybe free it when the structure is freed.  My usage here
> is that the structure does own it (because it took it from the
> temporary strbuf using strbuf_detach()) and so it must free it.
> Therefore it should not be "const".  This has nothing to do with
> whether or not we allow the thread name to be changed after the
> fact.  (We don't, but that is a different issue).

We use the pattern of having a "const char *" that's really a "char *"
with a cast to free() in many existing APIs for this scenario.

Maybe the cast for free would be more correct here, see my recent
9081a421a6d (checkout: fix "branch info" memory leaks, 2021-11-16) & the
discussion it referencese. I.e. in that case we didn't go for the
"free((char *)ptr)" cast as it was a private API.

>> Narrowly, I don't see why not just add a "const" to the "struct
>> strbuf
>> *" instead.
>
> Adding "const" to a strbuf would be wrong in this case, since the
> structure owns the strbuf and needs to strbuf_release the contained
> buffer and (now) free the strbuf pointer, right?
>
> This also makes things confusing -- all callers of tr2tls_create_self()
> would now be responsible for allocating a strbuf to pass in -- and who
> would own those.  This would also create opportunities for mistakes if
> they pass in the address of a stack-based strbuf, right?
>
> This is being used to initialize thread-based data, so the caller
> can't just use a "function local static" or a "global static" strbuf.

Right, I meant that in the context of who/where you'd have your casts.

>> But less narrowly if we're not going to change it why malloc a new
>> one
>> at all? Can't we just use the "const char *" passed into
>> tr2tls_create_self(), and for the "th%02d:" case have the code that's
>> formatting it handle that case?
>> I.e. have the things that use it as a "%s" now call a function that
>> formats things as a function of the "ctx->thread_id" (which may be 0)
>> and limit it by TR2_MAX_THREAD_NAME?
>> 
>
> This would be less efficient, right?  That thread name is included in
> *EVERY* _perf and _event message emitted.  If we were to change the
> design to have basically a callback to get the formatted value based
> on the `ctx` or `cts->thread_id` and dynamically formatting the name,
> then we would have to hit that callback once (or twice) for every Trace2
> message, right?  That would be much slower than just having a fixed
> string (formatted when the thread is created) that we can just use.
> And even if we said that the callback could cache the result (like
> we do when we lookup env vars), where would it cache it?  It would have
> to cache it in the `ctx`, which is where it currently is and without
> any of the unnecessary overhead, right?

Aren't we per
https://lore.kernel.org/git/211220.86czlrurm6.gmgdl@evledraar.gmail.com/
doing a lot of that formatting (and sometimes allocation) anyway in a
way that's easily avoidable for the "perf" backend?

And for tr2_tgt_event.c we'll call jw_object_string(), which calls
append_quoted_string() for each event. That'll be re-quoting (presumably
always needlessly) the thread_name every time.

So just deferring a single strbuf_addf() doesn't seem like it would slow
things down.

> I think you're assuming that callers of `tr2tls_create_self()` always
> pass a literal string such that that string value is always safe to
> reference later.  Nothing would prevent a caller from passing the
> address of a stack buffer.  It is not safe to assume that that string
> pointer will always be valid, such as after the thread exits.  It is
> better for _create_self() to copy the given string (whether we format
> it immediately or not) than to assume that the pointer will always be
> valid, right?

Sure, if that's the API we can xstrdup() it, and/or xstrfmt() it etc. as
we're doing now.

> So I don't think we should deviate from the patch that I submitted.

I'm not saying anything needs to change here, these were really just
read-through suggestion, but I think per the above (about the casts &
optimization) that some of your assumptions here may not hold.

  reply	other threads:[~2021-12-20 19:46 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-20 15:01 [PATCH 0/9] Trace2 stopwatch timers and global counters Jeff Hostetler via GitGitGadget
2021-12-20 15:01 ` [PATCH 1/9] trace2: use size_t alloc,nr_open_regions in tr2tls_thread_ctx Jeff Hostetler via GitGitGadget
2021-12-20 15:01 ` [PATCH 2/9] trace2: convert tr2tls_thread_ctx.thread_name from strbuf to char* Jeff Hostetler via GitGitGadget
2021-12-20 16:31   ` Ævar Arnfjörð Bjarmason
2021-12-20 19:07     ` Jeff Hostetler
2021-12-20 19:35       ` Ævar Arnfjörð Bjarmason [this message]
2021-12-22 16:32         ` Jeff Hostetler
2021-12-21  7:33     ` Junio C Hamano
2021-12-21  7:22   ` Junio C Hamano
2021-12-22 16:28     ` Jeff Hostetler
2021-12-22 19:57       ` Junio C Hamano
2021-12-20 15:01 ` [PATCH 3/9] trace2: defer free of TLS CTX until program exit Jeff Hostetler via GitGitGadget
2021-12-21  7:30   ` Junio C Hamano
2021-12-22 21:59     ` Jeff Hostetler
2021-12-22 22:56       ` Junio C Hamano
2021-12-22 23:04         ` Jeff Hostetler
2021-12-23  7:38         ` Johannes Sixt
2021-12-23 18:18           ` Junio C Hamano
2021-12-27 18:51             ` Jeff Hostetler
2021-12-20 15:01 ` [PATCH 4/9] trace2: add thread-name override to event target Jeff Hostetler via GitGitGadget
2021-12-20 15:01 ` [PATCH 5/9] trace2: add thread-name override to perf target Jeff Hostetler via GitGitGadget
2021-12-20 15:01 ` [PATCH 6/9] trace2: add timer events to perf and event target formats Jeff Hostetler via GitGitGadget
2021-12-20 16:39   ` Ævar Arnfjörð Bjarmason
2021-12-20 19:44     ` Jeff Hostetler
2021-12-21 14:20   ` Derrick Stolee
2021-12-20 15:01 ` [PATCH 7/9] trace2: add stopwatch timers Jeff Hostetler via GitGitGadget
2021-12-20 16:42   ` Ævar Arnfjörð Bjarmason
2021-12-22 21:38     ` Jeff Hostetler
2021-12-21 14:45   ` Derrick Stolee
2021-12-22 21:57     ` Jeff Hostetler
2021-12-20 15:01 ` [PATCH 8/9] trace2: add counter events to perf and event target formats Jeff Hostetler via GitGitGadget
2021-12-20 16:51   ` Ævar Arnfjörð Bjarmason
2021-12-22 22:56     ` Jeff Hostetler
2021-12-20 15:01 ` [PATCH 9/9] trace2: add global counters Jeff Hostetler via GitGitGadget
2021-12-20 17:14   ` Ævar Arnfjörð Bjarmason
2021-12-22 22:18     ` Jeff Hostetler
2021-12-21 14:51 ` [PATCH 0/9] Trace2 stopwatch timers and " Derrick Stolee
2021-12-21 23:27   ` Matheus Tavares
2021-12-28 19:36 ` [PATCH v2 " Jeff Hostetler via GitGitGadget
2021-12-28 19:36   ` [PATCH v2 1/9] trace2: use size_t alloc,nr_open_regions in tr2tls_thread_ctx Jeff Hostetler via GitGitGadget
2021-12-29  0:48     ` Ævar Arnfjörð Bjarmason
2021-12-28 19:36   ` [PATCH v2 2/9] trace2: convert tr2tls_thread_ctx.thread_name from strbuf to flex array Jeff Hostetler via GitGitGadget
2021-12-29  1:11     ` Ævar Arnfjörð Bjarmason
2021-12-29 16:46       ` Jeff Hostetler
2021-12-28 19:36   ` [PATCH v2 3/9] trace2: defer free of thread local storage until program exit Jeff Hostetler via GitGitGadget
2021-12-28 19:36   ` [PATCH v2 4/9] trace2: add thread-name override to event target Jeff Hostetler via GitGitGadget
2021-12-28 19:36   ` [PATCH v2 5/9] trace2: add thread-name override to perf target Jeff Hostetler via GitGitGadget
2021-12-29  1:48     ` Ævar Arnfjörð Bjarmason
2021-12-29 17:15       ` Jeff Hostetler
2021-12-28 19:36   ` [PATCH v2 6/9] trace2: add timer events to perf and event target formats Jeff Hostetler via GitGitGadget
2021-12-28 19:36   ` [PATCH v2 7/9] trace2: add stopwatch timers Jeff Hostetler via GitGitGadget
2021-12-28 19:36   ` [PATCH v2 8/9] trace2: add counter events to perf and event target formats Jeff Hostetler via GitGitGadget
2021-12-28 19:36   ` [PATCH v2 9/9] trace2: add global counters Jeff Hostetler via GitGitGadget
2021-12-29  1:54   ` [PATCH v2 0/9] Trace2 stopwatch timers and " Ævar Arnfjörð Bjarmason
2021-12-30 16:42     ` Jeff Hostetler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=211220.86ilvjt5y8.gmgdl@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=jeffhost@microsoft.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).