All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Cai <johncai86@gmail.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: "John Cai via GitGitGadget" <gitgitgadget@gmail.com>,
	git@vger.kernel.org, "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: Re: [PATCH v3] cat-file: skip expanding default format
Date: Tue, 08 Mar 2022 18:09:29 -0500	[thread overview]
Message-ID: <5A50CFA8-AD78-4774-9340-DF18C493FEC9@gmail.com> (raw)
In-Reply-To: <YifZBEAEqUvFwiEV@nand.local>

Hi Taylor,

On 8 Mar 2022, at 17:30, Taylor Blau wrote:

> On Tue, Mar 08, 2022 at 10:08:46PM +0000, John Cai via GitGitGadget wrote:
>> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
>> index 7b3f42950ec..e2edba70b41 100644
>> --- a/builtin/cat-file.c
>> +++ b/builtin/cat-file.c
>> @@ -351,6 +351,13 @@ static void print_object_or_die(struct batch_options *opt, struct expand_data *d
>>  	}
>>  }
>>
>> +static int print_default_format(char *buf, int len, struct expand_data *data)
>> +{
>> +	return xsnprintf(buf, len, "%s %s %"PRIuMAX"\n", oid_to_hex(&data->oid),
>> +			 type_name(data->type),
>> +			 (uintmax_t)*data->info.sizep);
>> +}
>
> Two small nits here. It looks like the indentation on the second and
> third lines is off a little bit, since we'd typically expect those to be
> indented to the same margin as the first argument to xsnprintf().

Thanks for bringing this up. I did have a question about indentation in this
case. for the second line, I did try to indent it to align with buf. I attempted
to do the same with the third line, but it's the ( that lines up with buf so
optically it looks a little off.

>
> The other is that you're reading data->info.sizep by dereferencing it,
> but we know that it points to data->size. So I think there it makes
> sense to just read the value directly out of data->size, though note
> that you'll still need the cast to uintmax_t since you're formatting it
> with PRIuMAX.

good point, I'll adjust this in the next version.

>
>> +
>>  /*
>>   * If "pack" is non-NULL, then "offset" is the byte offset within the pack from
>>   * which the object may be accessed (though note that we may also rely on
>> @@ -381,10 +388,16 @@ static void batch_object_write(const char *obj_name,
>>  		}
>>  	}
>>
>> -	strbuf_reset(scratch);
>> -	strbuf_expand(scratch, opt->format, expand_format, data);
>> -	strbuf_addch(scratch, '\n');
>> -	batch_write(opt, scratch->buf, scratch->len);
>> +	if (!opt->format) {
>> +		char buf[1024];
>> +		int len = print_default_format(buf, 1024, data);
>> +		batch_write(opt, buf, len);
>
> Just curious (and apologies if this was discussed earlier and I missed
> it), but: is there a reason that we have to use a scratch buffer here
> that is separate from the strbuf we already have allocated?
>
> That would avoid a large-ish stack variable, but it means that the two
> paths are a little more similar, and can share the batch_write call
> outside of the if/else statement.

This was holdover code from before. Looks like the scratch buffer gets passed
in. Do you mean we don't need to allocate char buf[1024] and instead we can just
use scratch and pass it into print_default_format?

>
> The rest of the changes in this file all look good to me.
>
>> diff --git a/t/perf/p1006-cat-file.sh b/t/perf/p1006-cat-file.sh
>> new file mode 100755
>> index 00000000000..e463623f5a3
>> --- /dev/null
>> +++ b/t/perf/p1006-cat-file.sh
>> @@ -0,0 +1,16 @@
>> +#!/bin/sh
>> +
>> +test_description='Basic sort performance tests'
>
> Is this description a hold-over from p0071? If so, it may be worth
> updating here.
>
>> +test_expect_success 'setup' '
>> +	git rev-list --all >rla
>> +'
>> +
>> +test_perf 'cat-file --batch-check' '
>> +	git cat-file --batch-check <rla
>> +'
>
> We could probably get away with dropping the setup test and using
> `--batch-all-objects` here. Note that right now you're only printing
> commit objects, so there would be a slight behavior change from the way
> the patch is currently written, but it should demonstrate the same
> performance improvement.

This sounds good to me!

>
> Thanks,
> Taylor

  reply	other threads:[~2022-03-08 23:09 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-04 21:37 [PATCH] cat-file: skip expanding default format John Cai via GitGitGadget
2022-03-07  5:56 ` Junio C Hamano
2022-03-07  6:11   ` Junio C Hamano
2022-03-07 17:41     ` John Cai
2022-03-07 12:15 ` Ævar Arnfjörð Bjarmason
2022-03-08  2:54 ` [PATCH v2] " John Cai via GitGitGadget
2022-03-08 16:59   ` Junio C Hamano
2022-03-08 19:01     ` John Cai
2022-03-08 22:00   ` Taylor Blau
2022-03-08 22:06     ` John Cai
2022-03-08 22:24     ` Taylor Blau
2022-03-08 22:45       ` John Cai
2022-03-08 22:08   ` [PATCH v3] " John Cai via GitGitGadget
2022-03-08 22:30     ` Taylor Blau
2022-03-08 23:09       ` John Cai [this message]
2022-03-08 23:34         ` John Cai
2022-03-15  2:40     ` [PATCH v4] " John Cai via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5A50CFA8-AD78-4774-9340-DF18C493FEC9@gmail.com \
    --to=johncai86@gmail.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=me@ttaylorr.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.