git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] Add mailmap mechanism in --batch-check options
@ 2022-09-16 20:59 Siddharth Asthana
  2022-09-16 20:59 ` [PATCH 1/3] doc/cat-file: allow --use-mailmap for --batch options Siddharth Asthana
                   ` (8 more replies)
  0 siblings, 9 replies; 44+ messages in thread
From: Siddharth Asthana @ 2022-09-16 20:59 UTC (permalink / raw)
  To: git; +Cc: christian.couder, gitster, johncai86, Siddharth Asthana

Hi Everyone!

I am working as a GSoC mentee with GitLab organization. My projects aims
to add mailmap support in GitLab. The patch for adding mailmap support
in git cat-file is already merged. So, using git cat-file
`--use-mailmap` with either `-s` and `--batch-check option`, like the
following is allowed:

* git cat-file --use-mailmap -s <commit/tag object sha>
* git cat-file --use-mailmap --batch-check

The current implementation will return the same object size irrespective
of the mailmap option, which is not as useful as it could be. When we
use the mailmap mechanism to replace idents, the size of the object can
change `-s` and `--batch-check` options would be more useful if they
show the size of the changed object. This patch series implemets that.

In this patch series we didn't want to change that how '%(objectsize)`
always show the size of the original object even when `--use-mailmap` is
set because first we are going to unify how the formats for git cat-file
and other commands work. And second existing formats like the "pretty
formats" used by `git log` have different options for fields respecting
mailmap or not respecting it (%an is for author name while %aN is for
author name respecting mailmap).

I would like to thank my mentors, Christian Couder and John Cai, for all
of their help!
Looking forward to the reviews!

= Patch Organization

- The first patch improves the documentation  `--batch`, `--batch-check` and
  `--batch-command` options by adding they can be combined with
  `--use-mailmap` options.
- The second patch makes -s option to return updated size of the <commit/tag>
  object, when combined with `--use-mailmap` options, after replacing the idents
  using the mailmap mechanism.
- The third patch makes --batch-check option to return updated size of
  the <commit/tag> object, when combined with `--use-mailmap` options,
  after replacing the idents using the mailmap mechanism.


Siddharth Asthana (3):
  doc/cat-file: allow --use-mailmap for --batch options
  cat-file: add mailmap support to -s option
  cat-file: add mailmap support to --batch-check option

 Documentation/git-cat-file.txt | 26 ++++++++++++++------------
 builtin/cat-file.c             | 26 ++++++++++++++++++++++++++
 t/t4203-mailmap.sh             | 32 ++++++++++++++++++++++++++++++++
 3 files changed, 72 insertions(+), 12 deletions(-)

-- 
2.38.0.rc0.3.g53c2677cac


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH 1/3] doc/cat-file: allow --use-mailmap for --batch options
  2022-09-16 20:59 [PATCH 0/3] Add mailmap mechanism in --batch-check options Siddharth Asthana
@ 2022-09-16 20:59 ` Siddharth Asthana
  2022-09-16 22:02   ` Junio C Hamano
  2022-09-16 20:59 ` [PATCH 2/3] cat-file: add mailmap support to -s option Siddharth Asthana
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 44+ messages in thread
From: Siddharth Asthana @ 2022-09-16 20:59 UTC (permalink / raw)
  To: git; +Cc: christian.couder, gitster, johncai86, Siddharth Asthana

The command git cat-file can now use the mailmap mechanism to replace
idents with their canonical versions for commit and tag objects. There
are several options like `--batch`, `--batch-check` and
`--batch-command` that can be combined with `--use-mailmap`. But, the
documentation for `--batch`, `--batch-check` and `--batch-command`
doesn't say so. This patch fixes that documentation.

Mentored-by: Christian Couder's avatarChristian Couder <christian.couder@gmail.com>
Mentored-by: John Cai's avatarJohn Cai <johncai86@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
---
 Documentation/git-cat-file.txt | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index ec30b5c574..5792f21a72 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -89,24 +89,22 @@ OPTIONS
 --batch::
 --batch=<format>::
 	Print object information and contents for each object provided
-	on stdin.  May not be combined with any other options or arguments
-	except `--textconv` or `--filters`, in which case the input lines
-	also need to specify the path, separated by whitespace.  See the
-	section `BATCH OUTPUT` below for details.
+	on stdin. May only be combined with `--use-mailmap`, `--textconv` or `--filters`.
+	In the case of `--textconv` or `--filters` the input lines also need to specify
+	the path, separated by whitespace. See the `BATCH OUTPUT` section below for details.
 
 --batch-check::
 --batch-check=<format>::
-	Print object information for each object provided on stdin.  May
-	not be combined with any other options or arguments except
-	`--textconv` or `--filters`, in which case the input lines also
-	need to specify the path, separated by whitespace.  See the
-	section `BATCH OUTPUT` below for details.
+	Print object information for each object provided on stdin.  May only be combined
+	with `--use-mailmap`, `--textconv` or `--filters`. In the case of `--textconv` or
+	`--filters` the input lines also need to specify the path, separated by whitespace.
+	See the `BATCH OUTPUT` section below for details.
 
 --batch-command::
 --batch-command=<format>::
 	Enter a command mode that reads commands and arguments from stdin. May
-	only be combined with `--buffer`, `--textconv` or `--filters`. In the
-	case of `--textconv` or `--filters`, the input lines also need to specify
+	only be combined with `--buffer`, `--textconv`, `--filters` or `--use-mailmap`.
+	In the case of `--textconv` or `--filters`, the input lines also need to specify
 	the path, separated by whitespace. See the section `BATCH OUTPUT` below
 	for details.
 +
-- 
2.38.0.rc0.3.g53c2677cac


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 2/3] cat-file: add mailmap support to -s option
  2022-09-16 20:59 [PATCH 0/3] Add mailmap mechanism in --batch-check options Siddharth Asthana
  2022-09-16 20:59 ` [PATCH 1/3] doc/cat-file: allow --use-mailmap for --batch options Siddharth Asthana
@ 2022-09-16 20:59 ` Siddharth Asthana
  2022-09-16 22:22   ` Junio C Hamano
  2022-09-16 20:59 ` [PATCH 3/3] cat-file: add mailmap support to --batch-check option Siddharth Asthana
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 44+ messages in thread
From: Siddharth Asthana @ 2022-09-16 20:59 UTC (permalink / raw)
  To: git; +Cc: christian.couder, gitster, johncai86, Siddharth Asthana

Using `git cat-file --use-mailmap` with `-s` option, like the following is
allowed:

 git cat-file --use-mailmap -s <commit/tag object sha>

The current implementation will return the same object size irrespective
of the mailmap option, which is not as useful as it could be. When we
use the mailmap mechanism to replace the idents, the size of the object
can change and `-s` option would be more useful if it shows the size of
the changed object. This patch implements that.

Mentored-by: Christian Couder's avatarChristian Couder <christian.couder@gmail.com>
Mentored-by: John Cai's avatarJohn Cai <johncai86@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
---
 Documentation/git-cat-file.txt |  4 +++-
 builtin/cat-file.c             | 13 +++++++++++++
 t/t4203-mailmap.sh             | 10 ++++++++++
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index 5792f21a72..708d094db4 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -45,7 +45,9 @@ OPTIONS
 
 -s::
 	Instead of the content, show the object size identified by
-	`<object>`.
+	`<object>`. If used with `--use-mailmap` option, will show the
+	size of updated object after replacing idents using the mailmap
+	mechanism.
 
 -e::
 	Exit with zero status if `<object>` exists and is a valid
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 989eee0bb4..9942b93867 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -132,8 +132,21 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
 
 	case 's':
 		oi.sizep = &size;
+
+		if (use_mailmap) {
+			oi.typep = &type;
+			oi.contentp = (void**)&buf;
+		}
+
 		if (oid_object_info_extended(the_repository, &oid, &oi, flags) < 0)
 			die("git cat-file: could not get object info");
+
+		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG)) {
+			size_t s = size;
+			buf = replace_idents_using_mailmap(buf, &s);
+			size = cast_size_t_to_ulong(s);
+		}
+
 		printf("%"PRIuMAX"\n", (uintmax_t)size);
 		ret = 0;
 		goto cleanup;
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index cd1cab3e54..59513e7c57 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -1022,4 +1022,14 @@ test_expect_success '--mailmap enables mailmap in cat-file for annotated tag obj
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	echo "220" >expect &&
+	git cat-file --use-mailmap -s HEAD >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.38.0.rc0.3.g53c2677cac


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 3/3] cat-file: add mailmap support to --batch-check option
  2022-09-16 20:59 [PATCH 0/3] Add mailmap mechanism in --batch-check options Siddharth Asthana
  2022-09-16 20:59 ` [PATCH 1/3] doc/cat-file: allow --use-mailmap for --batch options Siddharth Asthana
  2022-09-16 20:59 ` [PATCH 2/3] cat-file: add mailmap support to -s option Siddharth Asthana
@ 2022-09-16 20:59 ` Siddharth Asthana
  2022-09-16 22:35   ` Junio C Hamano
  2022-09-26 10:53 ` [PATCH v2 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 44+ messages in thread
From: Siddharth Asthana @ 2022-09-16 20:59 UTC (permalink / raw)
  To: git; +Cc: christian.couder, gitster, johncai86, Siddharth Asthana

Using `git cat-file --use-mailmap` with --batch-check option, like the
following is allowed:

 git cat-file --use-mailmap -batch-check

The current implementation will return the same object size irrespective
of the mailmap option, which is not as useful as it could be. When we
use the mailmap mechanism to replace the idents, the size of the object
can change and --batch-check option would be more useful if it shows the
size of the changed object. This patch implements that.

Mentored-by: Christian Couder's avatarChristian Couder <christian.couder@gmail.com>
Mentored-by: John Cai's avatarJohn Cai <johncai86@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
---
 Documentation/git-cat-file.txt |  2 ++
 builtin/cat-file.c             | 13 +++++++++++++
 t/t4203-mailmap.sh             | 22 ++++++++++++++++++++++
 3 files changed, 37 insertions(+)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index 708d094db4..0d5cc9335f 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -101,6 +101,8 @@ OPTIONS
 	with `--use-mailmap`, `--textconv` or `--filters`. In the case of `--textconv` or
 	`--filters` the input lines also need to specify the path, separated by whitespace.
 	See the `BATCH OUTPUT` section below for details.
+	If used with `--use-mailmap` option, will show the size of updated object after
+	replacing idents using the mailmap mechanism.
 
 --batch-command::
 --batch-command=<format>::
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 9942b93867..93d127d687 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -424,6 +424,12 @@ static void print_object_or_die(struct batch_options *opt, struct expand_data *d
 
 static void print_default_format(struct strbuf *scratch, struct expand_data *data)
 {
+	if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
+		size_t s = data->size;
+		*data->info.contentp = replace_idents_using_mailmap((char*)*data->info.contentp, &s);
+		data->size = cast_size_t_to_ulong(s);
+	}
+
 	strbuf_addf(scratch, "%s %s %"PRIuMAX"\n", oid_to_hex(&data->oid),
 		    type_name(data->type),
 		    (uintmax_t)data->size);
@@ -441,9 +447,14 @@ static void batch_object_write(const char *obj_name,
 			       struct packed_git *pack,
 			       off_t offset)
 {
+	void *buf = NULL;
+
 	if (!data->skip_object_info) {
 		int ret;
 
+		if (use_mailmap)
+			data->info.contentp = &buf;
+
 		if (pack)
 			ret = packed_object_info(the_repository, pack, offset,
 						 &data->info);
@@ -474,6 +485,8 @@ static void batch_object_write(const char *obj_name,
 		print_object_or_die(opt, data);
 		batch_write(opt, "\n", 1);
 	}
+
+	free(buf);
 }
 
 static void batch_one_object(const char *obj_name,
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index 59513e7c57..4b236c68aa 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -1032,4 +1032,26 @@ test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file --batch-check returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	echo "92d99959b011b1cd69fe7be5315d6732fe302575 commit 220" >expect &&
+	echo "HEAD" >in &&
+	git cat-file --use-mailmap --batch-check <in >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git cat-file --batch-command returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	echo "92d99959b011b1cd69fe7be5315d6732fe302575 commit 220" >expect &&
+	echo "info HEAD" >in &&
+	git cat-file --use-mailmap --batch-command <in >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.38.0.rc0.3.g53c2677cac


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH 1/3] doc/cat-file: allow --use-mailmap for --batch options
  2022-09-16 20:59 ` [PATCH 1/3] doc/cat-file: allow --use-mailmap for --batch options Siddharth Asthana
@ 2022-09-16 22:02   ` Junio C Hamano
  0 siblings, 0 replies; 44+ messages in thread
From: Junio C Hamano @ 2022-09-16 22:02 UTC (permalink / raw)
  To: Siddharth Asthana; +Cc: git, christian.couder, johncai86

Siddharth Asthana <siddharthasthana31@gmail.com> writes:

> The command git cat-file can now use the mailmap mechanism to replace
> idents with their canonical versions for commit and tag objects. There
> are several options like `--batch`, `--batch-check` and
> `--batch-command` that can be combined with `--use-mailmap`. But, the
> documentation for `--batch`, `--batch-check` and `--batch-command`
> doesn't say so. This patch fixes that documentation.
>
> Mentored-by: Christian Couder's avatarChristian Couder <christian.couder@gmail.com>
> Mentored-by: John Cai's avatarJohn Cai <johncai86@gmail.com>
> Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
> ---
>  Documentation/git-cat-file.txt | 20 +++++++++-----------
>  1 file changed, 9 insertions(+), 11 deletions(-)
>
> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index ec30b5c574..5792f21a72 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -89,24 +89,22 @@ OPTIONS
>  --batch::
>  --batch=<format>::
>  	Print object information and contents for each object provided
> -	on stdin.  May not be combined with any other options or arguments
> -	except `--textconv` or `--filters`, in which case the input lines
> -	also need to specify the path, separated by whitespace.  See the
> -	section `BATCH OUTPUT` below for details.
> +	on stdin. May only be combined with `--use-mailmap`, `--textconv` or `--filters`.
> +	In the case of `--textconv` or `--filters` the input lines also need to specify
> +	the path, separated by whitespace. See the `BATCH OUTPUT` section below for details.

The above is not wrong per-se, but I suspect that phrasing it like
so

    * When used with `--textconv` or `--filters`, the input lines must
      specify the path, separated by whitespace.

    * When used with `--use-mailmap`, THIS HAPPENS.

    Cannot be used with any other options.

would be easier to extend.  The same comment applies to the other
two changes in this patch.

As you do not have any code that implements the behaviour of
`--use-mailmap` at this point in the series yet, it might be nicer
to restructure the existing text for existing options in a
preliminary preparation patch, and then add the explanation and the
implementation of `--use-mailmap` option in a separate patch.

Thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 2/3] cat-file: add mailmap support to -s option
  2022-09-16 20:59 ` [PATCH 2/3] cat-file: add mailmap support to -s option Siddharth Asthana
@ 2022-09-16 22:22   ` Junio C Hamano
  0 siblings, 0 replies; 44+ messages in thread
From: Junio C Hamano @ 2022-09-16 22:22 UTC (permalink / raw)
  To: Siddharth Asthana; +Cc: git, christian.couder, johncai86

Siddharth Asthana <siddharthasthana31@gmail.com> writes:

> Using `git cat-file --use-mailmap` with `-s` option, like the following is
> allowed:
>
>  git cat-file --use-mailmap -s <commit/tag object sha>
>
> The current implementation will return the same object size irrespective
> of the mailmap option, which is not as useful as it could be.
> When we
> use the mailmap mechanism to replace the idents, the size of the object
> can change and `-s` option would be more useful if it shows the size of
> the changed object. This patch implements that.

Simply put, the current implementation is BUGGY, in other words, no?

The above sounds like a quite roundabout way to say that.  "the
following is allowed" (but it does not work, really).  "is not as
useful as it could be" (it is pretty much useless, in fact).

If I were doing this step, I'd summarize it into a single three-line
paragraph, like so:

    Even though the cat-file command with -s option does not complain
    when --use-mailmap option is given, it is ignored.  Compute the size
    of the object after replacing the idents and report it instead.

>  -s::
>  	Instead of the content, show the object size identified by
> -	`<object>`.
> +	`<object>`. If used with `--use-mailmap` option, will show the
> +	size of updated object after replacing idents using the mailmap
> +	mechanism.

We are not modifying the object in the object database, so it would
be preferrable to avoid a misleading phrasing "updated object", if
we can.  The size that the object would have had, if the idents
recorded in it were the ones "corrected" by the mailmap, is what we
report.

> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 989eee0bb4..9942b93867 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -132,8 +132,21 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
>  
>  	case 's':
>  		oi.sizep = &size;
> +
> +		if (use_mailmap) {
> +			oi.typep = &type;
> +			oi.contentp = (void**)&buf;
> +		}
> +
>  		if (oid_object_info_extended(the_repository, &oid, &oi, flags) < 0)
>  			die("git cat-file: could not get object info");
> +
> +		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG)) {
> +			size_t s = size;
> +			buf = replace_idents_using_mailmap(buf, &s);
> +			size = cast_size_t_to_ulong(s);
> +		}
> +
>  		printf("%"PRIuMAX"\n", (uintmax_t)size);
>  		ret = 0;
>  		goto cleanup;

We did not grab the object contents but just asked the API to fill oi.sizep
but now we grab the contents in oi.contentp, and possibly munge it
via mailmap.

Are we merely borrowing the object contents from somebody so we do
not have to release anything before jumping to "cleanup" here?
What's the memory ownership around replace_idents_using_mailmap()?
You pass "buf" to it, and overwrite "buf" with what it returns.  If
you were responsible for releasing the original "buf" you obtained
from oid_object_info_extended(), have you lost the only pointer to
it that you may have wanted to use to release it by doing so?

    ... goes and looks ...

OK, replace_idents_using_mailmap() assumes that object_buf is owned
by the caller and the caller is willing to let it go.  That is why
it attaches it to a strbuf, calls apply_mailmap_to_header() to munge
the strbuf, and detaches.  The net effect is that the original "buf"
we took from oid_object_info_extended() may be freed inside it, and
the returned value may be a newly allocated one to replace it, so
overwriting buf is OK.

We still need to free "buf", but the function assumes that all cases
"buf" would be populated (or left to NULL) and it is OK to free it
at cleanup: label, so we are not leaking anything here.

Good.

> diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
> index cd1cab3e54..59513e7c57 100755
> --- a/t/t4203-mailmap.sh
> +++ b/t/t4203-mailmap.sh
> @@ -1022,4 +1022,14 @@ test_expect_success '--mailmap enables mailmap in cat-file for annotated tag obj
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
> +	test_when_finished "rm .mailmap" &&
> +	cat >.mailmap <<-EOF &&
> +	C O Mitter <committer@example.com> Orig <orig@example.com>
> +	EOF
> +	echo "220" >expect &&
> +	git cat-file --use-mailmap -s HEAD >actual &&
> +	test_cmp expect actual
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/3] cat-file: add mailmap support to --batch-check option
  2022-09-16 20:59 ` [PATCH 3/3] cat-file: add mailmap support to --batch-check option Siddharth Asthana
@ 2022-09-16 22:35   ` Junio C Hamano
  0 siblings, 0 replies; 44+ messages in thread
From: Junio C Hamano @ 2022-09-16 22:35 UTC (permalink / raw)
  To: Siddharth Asthana; +Cc: git, christian.couder, johncai86

Siddharth Asthana <siddharthasthana31@gmail.com> writes:

> Using `git cat-file --use-mailmap` with --batch-check option, like the
> following is allowed:
>
>  git cat-file --use-mailmap -batch-check
>
> The current implementation will return the same object size irrespective
> of the mailmap option, which is not as useful as it could be. When we
> use the mailmap mechanism to replace the idents, the size of the object
> can change and --batch-check option would be more useful if it shows the
> size of the changed object. This patch implements that.

Almost the same comment on the proposed log message as [2/3].

> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 9942b93867..93d127d687 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -424,6 +424,12 @@ static void print_object_or_die(struct batch_options *opt, struct expand_data *d
>  
>  static void print_default_format(struct strbuf *scratch, struct expand_data *data)
>  {
> +	if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
> +		size_t s = data->size;
> +		*data->info.contentp = replace_idents_using_mailmap((char*)*data->info.contentp, &s);
> +		data->size = cast_size_t_to_ulong(s);
> +	}
> +
>  	strbuf_addf(scratch, "%s %s %"PRIuMAX"\n", oid_to_hex(&data->oid),
>  		    type_name(data->type),
>  		    (uintmax_t)data->size);
> @@ -441,9 +447,14 @@ static void batch_object_write(const char *obj_name,
>  			       struct packed_git *pack,
>  			       off_t offset)
>  {
> +	void *buf = NULL;
> +
>  	if (!data->skip_object_info) {
>  		int ret;
>  
> +		if (use_mailmap)
> +			data->info.contentp = &buf;
> +
>  		if (pack)
>  			ret = packed_object_info(the_repository, pack, offset,
>  						 &data->info);
> @@ -474,6 +485,8 @@ static void batch_object_write(const char *obj_name,
>  		print_object_or_die(opt, data);
>  		batch_write(opt, "\n", 1);
>  	}
> +
> +	free(buf);

OK.  Do we have _any_ idea what kind of object this is upon entry to
this function so that we can avoid populating .contentp for say a
huge blob object?  Of course, we could probe for type without
loading the contents, something like the attached sketch.  Usually
the blobs and trees are far larger than commits and tags and more
expensive to materialize in core (especially because trees delta so
well), so avoiding the cost to do so may worth it.  I dunno.

diff --git i/builtin/cat-file.c w/builtin/cat-file.c
index 989eee0bb4..562691eb1e 100644
--- i/builtin/cat-file.c
+++ w/builtin/cat-file.c
@@ -431,6 +431,9 @@ static void batch_object_write(const char *obj_name,
 	if (!data->skip_object_info) {
 		int ret;
 
+		if (use_mailmap && !data->info.typep)
+			data->info.typep = &data.type;
+
 		if (pack)
 			ret = packed_object_info(the_repository, pack, offset,
 						 &data->info);
@@ -444,8 +447,14 @@ static void batch_object_write(const char *obj_name,
 			fflush(stdout);
 			return;
 		}
-	}
 
+		if (use_mailmap && 
+		    (*(data->info.typep) == OBJ_COMMIT ||
+		    (*data->info.typep) == OBJ_TAG)) {
+			... load the contents here ...;
+			... replace idents with mailmap ...;
+		}
+	}
 	strbuf_reset(scratch);
 
 	if (!opt->format) {


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 0/2] Add mailmap mechanism in cat-file options
  2022-09-16 20:59 [PATCH 0/3] Add mailmap mechanism in --batch-check options Siddharth Asthana
                   ` (2 preceding siblings ...)
  2022-09-16 20:59 ` [PATCH 3/3] cat-file: add mailmap support to --batch-check option Siddharth Asthana
@ 2022-09-26 10:53 ` Siddharth Asthana
  2022-09-26 10:53   ` [PATCH v2 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
  2022-09-26 10:53   ` [PATCH v2 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
  2022-10-29 10:24 ` [PATCH v3 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 44+ messages in thread
From: Siddharth Asthana @ 2022-09-26 10:53 UTC (permalink / raw)
  To: git; +Cc: christian.couder, gitster, johncai86, Siddharth Asthana

Thanks a lot Junio for the review :) I have made the suggested changes.

= Description

At present, `git-cat-file` command with `--batch-check` and `-s` options
does not complain when `--use-mailmap` option is given. The latter
option is just ignored. Instead, for commit/tag objects, the command
should compute the size of the object after replacing the idents and
report it. So, this patch series makes `-s` and `--batch-check` options
of `git-cat-file` honor mailmap when used with `--use-mailmap` option.

In this patch series we didn't want to change that '%(objectsize)'
always shows the size of the original object even when `--use-mailmap`
is set because first we have the long term plan to unify how the formats
for `git cat-file` and other commands works. And second existing formats
like the "pretty formats" used bt `git log` have different options for
fields respecting mailmap or not respecting it (%an is for author name
while %aN for author name respecting mailmap).

I would like to thank my mentors, Christian Couder and John Cai, for all
of their help!
Looking forward to the reviews!

= Patch Organization

- The first patch makes `-s` option to return updated size of the
  <commit/tag> object, when combined with `--use-mailmap` option, after
  replacing the idents using the mailmap mechanism.
- The second patch makes `--batch-check` option to return updated size of
  the <commit/tag> object, when combined with `--use-mailmap` option,
  after replacing the idents using the mailmap mechanism.

= Changes in v2:

- The commit messages of both the patches have been improved.
- In the second patch, we were populating the `contentp` field of the
  `object_info` structure when `--batch-check` was combined with
  `--use-mailmap`. Which made us read the contents of tree and blob
  object types as well, which affected the performance. We should only
  be reading the contents for commit or tag object types. The second
  patch has been updated to do just that.

Siddharth Asthana (2):
  cat-file: add mailmap support to -s option
  cat-file: add mailmap support to --batch-check option

 Documentation/git-cat-file.txt |  6 +++++-
 builtin/cat-file.c             | 27 +++++++++++++++++++++++++++
 t/t4203-mailmap.sh             | 32 ++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 1 deletion(-)

Range-diff against v1:
1:  513ad3b5f7 < -:  ---------- doc/cat-file: allow --use-mailmap for --batch options
2:  6f3dcce9e3 ! 1:  60cf7bc28c cat-file: add mailmap support to -s option
    @@ Metadata
      ## Commit message ##
         cat-file: add mailmap support to -s option
     
    -    Using `git cat-file --use-mailmap` with `-s` option, like the following is
    -    allowed:
    +    Even though the cat-file command with `-s` option does not complain when
    +    `--use-mailmap` option is given, the latter option is ignored. Compute
    +    the size of the object after replacing the idents and report it instead.
     
    -     git cat-file --use-mailmap -s <commit/tag object sha>
    +    In order to make `-s` option honour the mailmap mechanism we have to
    +    read the contents of the commit/tag object. Make use of the call to
    +    `oid_object_info_extended()` to get the contents of the object and store
    +    in `buf`. `buf` is later freed in the function.
     
    -    The current implementation will return the same object size irrespective
    -    of the mailmap option, which is not as useful as it could be. When we
    -    use the mailmap mechanism to replace the idents, the size of the object
    -    can change and `-s` option would be more useful if it shows the size of
    -    the changed object. This patch implements that.
    -
    -    Mentored-by: Christian Couder's avatarChristian Couder <christian.couder@gmail.com>
    -    Mentored-by: John Cai's avatarJohn Cai <johncai86@gmail.com>
    +    Mentored-by: Christian Couder <christian.couder@gmail.com>
    +    Mentored-by: John Cai <johncai86@gmail.com>
         Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
     
      ## Documentation/git-cat-file.txt ##
3:  af90241d32 ! 2:  06c74dd017 cat-file: add mailmap support to --batch-check option
    @@ Metadata
      ## Commit message ##
         cat-file: add mailmap support to --batch-check option
     
    -    Using `git cat-file --use-mailmap` with --batch-check option, like the
    -    following is allowed:
    +    Even though the cat-file command with `--batch-check` option does not
    +    complain when `--use-mailmap` option is given, the latter option is
    +    ignored. Compute the size of the object after replacing the idents and
    +    report it instead.
     
    -     git cat-file --use-mailmap -batch-check
    +    In order to make `--batch-check` option honour the mailmap mechanism we
    +    have to read the contents of the commit/tag object.
     
    -    The current implementation will return the same object size irrespective
    -    of the mailmap option, which is not as useful as it could be. When we
    -    use the mailmap mechanism to replace the idents, the size of the object
    -    can change and --batch-check option would be more useful if it shows the
    -    size of the changed object. This patch implements that.
    +    There were two ways to do it:
     
    -    Mentored-by: Christian Couder's avatarChristian Couder <christian.couder@gmail.com>
    -    Mentored-by: John Cai's avatarJohn Cai <johncai86@gmail.com>
    +    1. Make two calls to `oid_object_info_extended()`. If `--use-mailmap`
    +       option is given, the first call will get us the type of the object
    +       and second call will only be made if the object type is either a
    +       commit or tag to get the contents of the object.
    +
    +    2. Make one call to `oid_object_info_extended()` to get the type of the
    +       object. Then, if the object type is either of commit or tag, make a
    +       call to `read_object_file()` to read the contents of the object.
    +
    +    I benchmarked the following command with both the above approaches and
    +    compared against the current implementation where `--use-mailmap`
    +    option is ignored:
    +
    +    `git cat-file --use-mailmap --batch-all-objects --batch-check --buffer
    +    --unordered`
    +
    +    The results can be summarized as follows:
    +                           Time (mean ± σ)
    +    default               827.7 ms ± 104.8 ms
    +    first approach        6.197 s ± 0.093 s
    +    second approach       1.975 s ± 0.217 s
    +
    +    Since, the second approach is faster than the first one, I implemented
    +    it in this patch.
    +
    +    Mentored-by: Christian Couder <christian.couder@gmail.com>
    +    Mentored-by: John Cai <johncai86@gmail.com>
         Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
     
      ## Documentation/git-cat-file.txt ##
     @@ Documentation/git-cat-file.txt: OPTIONS
    - 	with `--use-mailmap`, `--textconv` or `--filters`. In the case of `--textconv` or
    - 	`--filters` the input lines also need to specify the path, separated by whitespace.
    - 	See the `BATCH OUTPUT` section below for details.
    -+	If used with `--use-mailmap` option, will show the size of updated object after
    -+	replacing idents using the mailmap mechanism.
    + 	`--textconv` or `--filters`, in which case the input lines also
    + 	need to specify the path, separated by whitespace.  See the
    + 	section `BATCH OUTPUT` below for details.
    ++	If used with `--use-mailmap` option, will show the size of
    ++	updated object after replacing idents using the mailmap mechanism.
      
      --batch-command::
      --batch-command=<format>::
     
      ## builtin/cat-file.c ##
    -@@ builtin/cat-file.c: static void print_object_or_die(struct batch_options *opt, struct expand_data *d
    - 
    - static void print_default_format(struct strbuf *scratch, struct expand_data *data)
    - {
    -+	if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
    -+		size_t s = data->size;
    -+		*data->info.contentp = replace_idents_using_mailmap((char*)*data->info.contentp, &s);
    -+		data->size = cast_size_t_to_ulong(s);
    -+	}
    -+
    - 	strbuf_addf(scratch, "%s %s %"PRIuMAX"\n", oid_to_hex(&data->oid),
    - 		    type_name(data->type),
    - 		    (uintmax_t)data->size);
     @@ builtin/cat-file.c: static void batch_object_write(const char *obj_name,
    - 			       struct packed_git *pack,
    - 			       off_t offset)
    - {
    -+	void *buf = NULL;
    -+
      	if (!data->skip_object_info) {
      		int ret;
      
     +		if (use_mailmap)
    -+			data->info.contentp = &buf;
    ++			data->info.typep = &data->type;
     +
      		if (pack)
      			ret = packed_object_info(the_repository, pack, offset,
      						 &data->info);
     @@ builtin/cat-file.c: static void batch_object_write(const char *obj_name,
    - 		print_object_or_die(opt, data);
    - 		batch_write(opt, "\n", 1);
    - 	}
    + 			fflush(stdout);
    + 			return;
    + 		}
     +
    -+	free(buf);
    - }
    ++		if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
    ++			size_t s = data->size;
    ++			char *buf = NULL;
    ++
    ++			buf = read_object_file(&data->oid, &data->type, &data->size);
    ++			buf = replace_idents_using_mailmap(buf, &s);
    ++			data->size = cast_size_t_to_ulong(s);
    ++
    ++			free(buf);
    ++		}
    + 	}
      
    - static void batch_one_object(const char *obj_name,
    + 	strbuf_reset(scratch);
     
      ## t/t4203-mailmap.sh ##
     @@ t/t4203-mailmap.sh: test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
-- 
2.38.0.rc1.8.g9592ff2ba4


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v2 1/2] cat-file: add mailmap support to -s option
  2022-09-26 10:53 ` [PATCH v2 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
@ 2022-09-26 10:53   ` Siddharth Asthana
  2022-09-26 13:16     ` Ævar Arnfjörð Bjarmason
  2022-09-26 13:25     ` Ævar Arnfjörð Bjarmason
  2022-09-26 10:53   ` [PATCH v2 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
  1 sibling, 2 replies; 44+ messages in thread
From: Siddharth Asthana @ 2022-09-26 10:53 UTC (permalink / raw)
  To: git; +Cc: christian.couder, gitster, johncai86, Siddharth Asthana

Even though the cat-file command with `-s` option does not complain when
`--use-mailmap` option is given, the latter option is ignored. Compute
the size of the object after replacing the idents and report it instead.

In order to make `-s` option honour the mailmap mechanism we have to
read the contents of the commit/tag object. Make use of the call to
`oid_object_info_extended()` to get the contents of the object and store
in `buf`. `buf` is later freed in the function.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
---
 Documentation/git-cat-file.txt |  4 +++-
 builtin/cat-file.c             | 13 +++++++++++++
 t/t4203-mailmap.sh             | 10 ++++++++++
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index ec30b5c574..594b6f2dfd 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -45,7 +45,9 @@ OPTIONS
 
 -s::
 	Instead of the content, show the object size identified by
-	`<object>`.
+	`<object>`. If used with `--use-mailmap` option, will show the
+	size of updated object after replacing idents using the mailmap
+	mechanism.
 
 -e::
 	Exit with zero status if `<object>` exists and is a valid
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 989eee0bb4..9942b93867 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -132,8 +132,21 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
 
 	case 's':
 		oi.sizep = &size;
+
+		if (use_mailmap) {
+			oi.typep = &type;
+			oi.contentp = (void**)&buf;
+		}
+
 		if (oid_object_info_extended(the_repository, &oid, &oi, flags) < 0)
 			die("git cat-file: could not get object info");
+
+		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG)) {
+			size_t s = size;
+			buf = replace_idents_using_mailmap(buf, &s);
+			size = cast_size_t_to_ulong(s);
+		}
+
 		printf("%"PRIuMAX"\n", (uintmax_t)size);
 		ret = 0;
 		goto cleanup;
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index cd1cab3e54..59513e7c57 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -1022,4 +1022,14 @@ test_expect_success '--mailmap enables mailmap in cat-file for annotated tag obj
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	echo "220" >expect &&
+	git cat-file --use-mailmap -s HEAD >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.38.0.rc1.8.g9592ff2ba4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 2/2] cat-file: add mailmap support to --batch-check option
  2022-09-26 10:53 ` [PATCH v2 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
  2022-09-26 10:53   ` [PATCH v2 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
@ 2022-09-26 10:53   ` Siddharth Asthana
  1 sibling, 0 replies; 44+ messages in thread
From: Siddharth Asthana @ 2022-09-26 10:53 UTC (permalink / raw)
  To: git; +Cc: christian.couder, gitster, johncai86, Siddharth Asthana

Even though the cat-file command with `--batch-check` option does not
complain when `--use-mailmap` option is given, the latter option is
ignored. Compute the size of the object after replacing the idents and
report it instead.

In order to make `--batch-check` option honour the mailmap mechanism we
have to read the contents of the commit/tag object.

There were two ways to do it:

1. Make two calls to `oid_object_info_extended()`. If `--use-mailmap`
   option is given, the first call will get us the type of the object
   and second call will only be made if the object type is either a
   commit or tag to get the contents of the object.

2. Make one call to `oid_object_info_extended()` to get the type of the
   object. Then, if the object type is either of commit or tag, make a
   call to `read_object_file()` to read the contents of the object.

I benchmarked the following command with both the above approaches and
compared against the current implementation where `--use-mailmap`
option is ignored:

`git cat-file --use-mailmap --batch-all-objects --batch-check --buffer
--unordered`

The results can be summarized as follows:
                       Time (mean ± σ)
default               827.7 ms ± 104.8 ms
first approach        6.197 s ± 0.093 s
second approach       1.975 s ± 0.217 s

Since, the second approach is faster than the first one, I implemented
it in this patch.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
---
 Documentation/git-cat-file.txt |  2 ++
 builtin/cat-file.c             | 14 ++++++++++++++
 t/t4203-mailmap.sh             | 22 ++++++++++++++++++++++
 3 files changed, 38 insertions(+)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index 594b6f2dfd..a8e7906b3d 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -103,6 +103,8 @@ OPTIONS
 	`--textconv` or `--filters`, in which case the input lines also
 	need to specify the path, separated by whitespace.  See the
 	section `BATCH OUTPUT` below for details.
+	If used with `--use-mailmap` option, will show the size of
+	updated object after replacing idents using the mailmap mechanism.
 
 --batch-command::
 --batch-command=<format>::
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 9942b93867..7ee8cb453f 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -444,6 +444,9 @@ static void batch_object_write(const char *obj_name,
 	if (!data->skip_object_info) {
 		int ret;
 
+		if (use_mailmap)
+			data->info.typep = &data->type;
+
 		if (pack)
 			ret = packed_object_info(the_repository, pack, offset,
 						 &data->info);
@@ -457,6 +460,17 @@ static void batch_object_write(const char *obj_name,
 			fflush(stdout);
 			return;
 		}
+
+		if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
+			size_t s = data->size;
+			char *buf = NULL;
+
+			buf = read_object_file(&data->oid, &data->type, &data->size);
+			buf = replace_idents_using_mailmap(buf, &s);
+			data->size = cast_size_t_to_ulong(s);
+
+			free(buf);
+		}
 	}
 
 	strbuf_reset(scratch);
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index 59513e7c57..4b236c68aa 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -1032,4 +1032,26 @@ test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file --batch-check returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	echo "92d99959b011b1cd69fe7be5315d6732fe302575 commit 220" >expect &&
+	echo "HEAD" >in &&
+	git cat-file --use-mailmap --batch-check <in >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git cat-file --batch-command returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	echo "92d99959b011b1cd69fe7be5315d6732fe302575 commit 220" >expect &&
+	echo "info HEAD" >in &&
+	git cat-file --use-mailmap --batch-command <in >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.38.0.rc1.8.g9592ff2ba4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 1/2] cat-file: add mailmap support to -s option
  2022-09-26 10:53   ` [PATCH v2 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
@ 2022-09-26 13:16     ` Ævar Arnfjörð Bjarmason
  2022-09-26 13:25     ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 44+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-09-26 13:16 UTC (permalink / raw)
  To: Siddharth Asthana; +Cc: git, christian.couder, gitster, johncai86


On Mon, Sep 26 2022, Siddharth Asthana wrote:

> Even though the cat-file command with `-s` option does not complain when
> `--use-mailmap` option is given, the latter option is ignored. Compute
> the size of the object after replacing the idents and report it instead.
>
> In order to make `-s` option honour the mailmap mechanism we have to
> read the contents of the commit/tag object. Make use of the call to
> `oid_object_info_extended()` to get the contents of the object and store
> in `buf`. `buf` is later freed in the function.
>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: John Cai <johncai86@gmail.com>
> Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
> ---
>  Documentation/git-cat-file.txt |  4 +++-
>  builtin/cat-file.c             | 13 +++++++++++++
>  t/t4203-mailmap.sh             | 10 ++++++++++
>  3 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index ec30b5c574..594b6f2dfd 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -45,7 +45,9 @@ OPTIONS
>  
>  -s::
>  	Instead of the content, show the object size identified by
> -	`<object>`.
> +	`<object>`. If used with `--use-mailmap` option, will show the
> +	size of updated object after replacing idents using the mailmap
> +	mechanism.
>  
>  -e::
>  	Exit with zero status if `<object>` exists and is a valid
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 989eee0bb4..9942b93867 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -132,8 +132,21 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
>  
>  	case 's':
>  		oi.sizep = &size;
> +
> +		if (use_mailmap) {
> +			oi.typep = &type;
> +			oi.contentp = (void**)&buf;
> +		}
> +
>  		if (oid_object_info_extended(the_repository, &oid, &oi, flags) < 0)
>  			die("git cat-file: could not get object info");
> +
> +		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG)) {
> +			size_t s = size;
> +			buf = replace_idents_using_mailmap(buf, &s);

This is partially commentary on your already-landed series for cat-file
--mailmap support. I wondered why we needed this temporary variable, and
why we needed the cast_size_t_to_ulong() at all. On "master" we have a
size, but e.g. for cat_one_file()'s *current* codpaths we just pass it
to write_or_die().

So the net effect is that we refuse to use write_or_die() if the number
in size_t doesn't fit an unsigned long, even though we never need an
unsigned long in that case.

We have *other* things in the object code that need unsigned long, so it
probably amounts to no practical limitation, but it's confusing & I
think per [1] below we could do away with it.

There's also a subtle gotcha on "master", we
replace_idents_using_mailmap() with a possibly NULL "contents", which is
a misuse of the strbuf API (the "buf" member should never be NULL), but
we're about to die anyway...

> +			size = cast_size_t_to_ulong(s);
> +		}
> +
>  		printf("%"PRIuMAX"\n", (uintmax_t)size);

...but expanding on "master", here we have seemingly the first use of
cast_size_t_to_ulong() thaht's actually needed in this file. I.e. we are
about to use PRIuMAX.

But why not skip the cast(s) and make this more obvious by having the
printf() argument be cast_size_t_to_ulong(size)?

In your 2/2 you then have another use of cast_size_t_to_ulong() which
*is* needed in that case (we're about to stick it in a "unsigned long"
member, and the "size_t s" temporary variable is also needed in that
case.


1.

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 989eee0bb4c..676c34cba4b 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -178,11 +178,8 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
 		if (!buf)
 			die("Cannot read object %s", obj_name);
 
-		if (use_mailmap) {
-			size_t s = size;
-			buf = replace_idents_using_mailmap(buf, &s);
-			size = cast_size_t_to_ulong(s);
-		}
+		if (use_mailmap)
+			buf = replace_idents_using_mailmap(buf, &size);
 
 		/* otherwise just spit out the data */
 		break;
@@ -218,11 +215,8 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
 		buf = read_object_with_reference(the_repository, &oid,
 						 exp_type_id, &size, NULL);
 
-		if (use_mailmap) {
-			size_t s = size;
-			buf = replace_idents_using_mailmap(buf, &s);
-			size = cast_size_t_to_ulong(s);
-		}
+		if (use_mailmap)
+			buf = replace_idents_using_mailmap(buf, &size);
 		break;
 	}
 	default:
@@ -391,12 +385,8 @@ static void print_object_or_die(struct batch_options *opt, struct expand_data *d
 
 		contents = read_object_file(oid, &type, &size);
 
-		if (use_mailmap) {
-			size_t s = size;
-			contents = replace_idents_using_mailmap(contents, &s);
-			size = cast_size_t_to_ulong(s);
-		}
-
+		if (use_mailmap)
+			contents = replace_idents_using_mailmap(contents, &size);
 		if (!contents)
 			die("object %s disappeared", oid_to_hex(oid));
 		if (type != data->type)

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 1/2] cat-file: add mailmap support to -s option
  2022-09-26 10:53   ` [PATCH v2 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
  2022-09-26 13:16     ` Ævar Arnfjörð Bjarmason
@ 2022-09-26 13:25     ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 44+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-09-26 13:25 UTC (permalink / raw)
  To: Siddharth Asthana; +Cc: git, christian.couder, gitster, johncai86


On Mon, Sep 26 2022, Siddharth Asthana wrote:

> Even though the cat-file command with `-s` option does not complain when
> `--use-mailmap` option is given, the latter option is ignored. Compute
> the size of the object after replacing the idents and report it instead.
>
> In order to make `-s` option honour the mailmap mechanism we have to
> read the contents of the commit/tag object. Make use of the call to
> `oid_object_info_extended()` to get the contents of the object and store
> in `buf`. `buf` is later freed in the function.
>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: John Cai <johncai86@gmail.com>
> Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
> ---
>  Documentation/git-cat-file.txt |  4 +++-
>  builtin/cat-file.c             | 13 +++++++++++++
>  t/t4203-mailmap.sh             | 10 ++++++++++
>  3 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index ec30b5c574..594b6f2dfd 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -45,7 +45,9 @@ OPTIONS
>  
>  -s::
>  	Instead of the content, show the object size identified by
> -	`<object>`.
> +	`<object>`. If used with `--use-mailmap` option, will show the
> +	size of updated object after replacing idents using the mailmap
> +	mechanism.
>  
>  -e::
>  	Exit with zero status if `<object>` exists and is a valid
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 989eee0bb4..9942b93867 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -132,8 +132,21 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
>  
>  	case 's':
>  		oi.sizep = &size;
> +
> +		if (use_mailmap) {
> +			oi.typep = &type;
> +			oi.contentp = (void**)&buf;
> +		}
> +
>  		if (oid_object_info_extended(the_repository, &oid, &oi, flags) < 0)
>  			die("git cat-file: could not get object info");
> +
> +		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG)) {

Just following along here: We want to handle both tag printing and size
computations. I.e. we happily search-replace the author in tag objects:
	
	$ git -P diff -- .mailmap
	diff --git a/.mailmap b/.mailmap
	index 07db36a9bb9..cace49e462b 100644
	--- a/.mailmap
	+++ b/.mailmap
	@@ -125,7 +125,7 @@ Jonathan del Strother <jon.delStrother@bestbefore.tv> <maillist@steelskies.com>
	 Josh Triplett <josh@joshtriplett.org> <josh@freedesktop.org>
	 Josh Triplett <josh@joshtriplett.org> <josht@us.ibm.com>
	 Julian Phillips <julian@quantumfyre.co.uk> <jp3@quantumfyre.co.uk>
	-Junio C Hamano <gitster@pobox.com> <gitster@pobox.com>
	+Foo <bar@baz.blah> Junio C Hamano <gitster@pobox.com>
	 Junio C Hamano <gitster@pobox.com> <junio@hera.kernel.org>
	 Junio C Hamano <gitster@pobox.com> <junio@kernel.org>
	 Junio C Hamano <gitster@pobox.com> <junio@pobox.com>
	$ ./git cat-file --use-mailmap tag v2.37.0 | head -n 4
	object e4a4b31577c7419497ac30cebe30d755b97752c5
	type commit
	tag v2.37.0
	tagger Foo <bar@baz.blah> 1656346695 -0700

And we want the "-s" to match, okey, but... (continued below)

> +			size_t s = size;
> +			buf = replace_idents_using_mailmap(buf, &s);
> +			size = cast_size_t_to_ulong(s);
> +		}
> +
>  		printf("%"PRIuMAX"\n", (uintmax_t)size);
>  		ret = 0;
>  		goto cleanup;
> diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
> index cd1cab3e54..59513e7c57 100755
> --- a/t/t4203-mailmap.sh
> +++ b/t/t4203-mailmap.sh
> @@ -1022,4 +1022,14 @@ test_expect_success '--mailmap enables mailmap in cat-file for annotated tag obj
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
> +	test_when_finished "rm .mailmap" &&
> +	cat >.mailmap <<-EOF &&

nit: use \ before EOF, no variables here.

> +	C O Mitter <committer@example.com> Orig <orig@example.com>
> +	EOF
> +	echo "220" >expect &&

nit: no need for "" quotes.

> +	git cat-file --use-mailmap -s HEAD >actual &&

I'd find this a bit easier to follow if acter setting up .mailmap we did
something like (I didn't look up what the actual "234" value is):

	>actual &&
	git cat-file -s HEAD >actual &&
	git cat-file -s --use-mailmap HEAD >>actual &&
	cat >expect <<-\EOF
        234
        220
	EOF

We surely test that somewhere else, but it would be a bit more
self-documenting, as the difference in sizes would correspond to the
size of the address (or a multiple thereof, if it's used replaced N
times).

> +	test_cmp expect actual
> +'

...our test only checks the commit handling. Let's be a bit more
defensive here & test both potential paths through that new "if".

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v3 0/2] Add mailmap mechanism in cat-file options
  2022-09-16 20:59 [PATCH 0/3] Add mailmap mechanism in --batch-check options Siddharth Asthana
                   ` (3 preceding siblings ...)
  2022-09-26 10:53 ` [PATCH v2 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
@ 2022-10-29 10:24 ` Siddharth Asthana
  2022-10-29 10:24   ` [PATCH v3 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
                     ` (2 more replies)
  2022-11-13 21:28 ` [PATCH v4 0/3] " Siddharth Asthana
                   ` (3 subsequent siblings)
  8 siblings, 3 replies; 44+ messages in thread
From: Siddharth Asthana @ 2022-10-29 10:24 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, johncai86, Johannes.Schindelin,
	avarab, Siddharth Asthana

Thanks a lot Avar for the review :) I have made the suggested changes.

= Description

At present, `git-cat-file` command with `--batch-check` and `-s` options
does not complain when `--use-mailmap` option is given. The latter
option is just ignored. Instead, for commit/tag objects, the command
should compute the size of the object after replacing the idents and
report it. So, this patch series makes `-s` and `--batch-check` options
of `git-cat-file` honor mailmap when used with `--use-mailmap` option.

In this patch series we didn't want to change that '%(objectsize)'
always shows the size of the original object even when `--use-mailmap`
is set because first we have the long term plan to unify how the formats
for `git cat-file` and other commands works. And second existing formats
like the "pretty formats" used bt `git log` have different options for
fields respecting mailmap or not respecting it (%an is for author name
while %aN for author name respecting mailmap).

I would like to thank my mentors, Christian Couder and John Cai, for all
of their help!
Looking forward to the reviews!

= Patch Organization

- The first patch makes `-s` option to return updated size of the
  <commit/tag> object, when combined with `--use-mailmap` option, after
  replacing the idents using the mailmap mechanism.
- The second patch makes `--batch-check` option to return updated size of
  the <commit/tag> object, when combined with `--use-mailmap` option,
  after replacing the idents using the mailmap mechanism.

= Changes in v2:

- The commit messages of both the patches have been improved.
- In the second patch, we were populating the `contentp` field of the
  `object_info` structure when `--batch-check` was combined with
  `--use-mailmap`. Which made us read the contents of tree and blob
  object types as well, which affected the performance. We should only
  be reading the contents for commit or tag object types. The second
  patch has been updated to do just that.

= Changes in v3:

- Make the tests a bit more self documenting by running commands in test
  without using mailmap flag.
- In the first patch added the test for tag objects as well.

Siddharth Asthana (2):
  cat-file: add mailmap support to -s option
  cat-file: add mailmap support to --batch-check option

 Documentation/git-cat-file.txt |  6 +++-
 builtin/cat-file.c             | 27 ++++++++++++++++
 t/t4203-mailmap.sh             | 59 ++++++++++++++++++++++++++++++++++
 3 files changed, 91 insertions(+), 1 deletion(-)

Range-diff against v2:
-:  ---------- > 1:  4b0231504b cat-file: add mailmap support to -s option
1:  0e142beb2b ! 2:  2e87897fbb cat-file: add mailmap support to --batch-check option
    @@ builtin/cat-file.c: static void batch_object_write(const char *obj_name,
      	strbuf_reset(scratch);
     
      ## t/t4203-mailmap.sh ##
    -@@ t/t4203-mailmap.sh: test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
    +@@ t/t4203-mailmap.sh: test_expect_success 'git cat-file -s returns correct size with --use-mailmap for
      	test_cmp expect actual
      '
      
     +test_expect_success 'git cat-file --batch-check returns correct size with --use-mailmap' '
     +	test_when_finished "rm .mailmap" &&
    -+	cat >.mailmap <<-EOF &&
    ++	cat >.mailmap <<-\EOF &&
     +	C O Mitter <committer@example.com> Orig <orig@example.com>
     +	EOF
    -+	echo "92d99959b011b1cd69fe7be5315d6732fe302575 commit 220" >expect &&
    ++	cat >expect <<-\EOF &&
    ++	92d99959b011b1cd69fe7be5315d6732fe302575 commit 209
    ++	92d99959b011b1cd69fe7be5315d6732fe302575 commit 220
    ++	EOF
     +	echo "HEAD" >in &&
    -+	git cat-file --use-mailmap --batch-check <in >actual &&
    ++	git cat-file --batch-check <in >actual &&
    ++	git cat-file --use-mailmap --batch-check <in >>actual &&
     +	test_cmp expect actual
     +'
     +
     +test_expect_success 'git cat-file --batch-command returns correct size with --use-mailmap' '
     +	test_when_finished "rm .mailmap" &&
    -+	cat >.mailmap <<-EOF &&
    ++	cat >.mailmap <<-\EOF &&
     +	C O Mitter <committer@example.com> Orig <orig@example.com>
     +	EOF
    -+	echo "92d99959b011b1cd69fe7be5315d6732fe302575 commit 220" >expect &&
    ++	cat >expect <<-\EOF &&
    ++	92d99959b011b1cd69fe7be5315d6732fe302575 commit 209
    ++	92d99959b011b1cd69fe7be5315d6732fe302575 commit 220
    ++	EOF
     +	echo "info HEAD" >in &&
    -+	git cat-file --use-mailmap --batch-command <in >actual &&
    ++	git cat-file --batch-command <in >actual &&
    ++	git cat-file --use-mailmap --batch-command <in >>actual &&
     +	test_cmp expect actual
     +'
     +
2:  e95cbd7932 < -:  ---------- Add Test
-- 
2.38.1.282.g2e87897fbb


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v3 1/2] cat-file: add mailmap support to -s option
  2022-10-29 10:24 ` [PATCH v3 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
@ 2022-10-29 10:24   ` Siddharth Asthana
  2022-10-31 11:49     ` Christian Couder
  2022-10-29 10:24   ` [PATCH v3 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
  2022-10-29 18:00   ` [PATCH v3 0/2] Add mailmap mechanism in cat-file options Taylor Blau
  2 siblings, 1 reply; 44+ messages in thread
From: Siddharth Asthana @ 2022-10-29 10:24 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, johncai86, Johannes.Schindelin,
	avarab, Siddharth Asthana

Even though the cat-file command with `-s` option does not complain when
`--use-mailmap` option is given, the latter option is ignored. Compute
the size of the object after replacing the idents and report it instead.

In order to make `-s` option honour the mailmap mechanism we have to
read the contents of the commit/tag object. Make use of the call to
`oid_object_info_extended()` to get the contents of the object and store
in `buf`. `buf` is later freed in the function.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
---
 Documentation/git-cat-file.txt |  4 +++-
 builtin/cat-file.c             | 13 +++++++++++++
 t/t4203-mailmap.sh             | 29 +++++++++++++++++++++++++++++
 3 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index ec30b5c574..594b6f2dfd 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -45,7 +45,9 @@ OPTIONS
 
 -s::
 	Instead of the content, show the object size identified by
-	`<object>`.
+	`<object>`. If used with `--use-mailmap` option, will show the
+	size of updated object after replacing idents using the mailmap
+	mechanism.
 
 -e::
 	Exit with zero status if `<object>` exists and is a valid
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index fa7bd89169..8a6e2343ec 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -132,8 +132,21 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
 
 	case 's':
 		oi.sizep = &size;
+
+		if (use_mailmap) {
+			oi.typep = &type;
+			oi.contentp = (void**)&buf;
+		}
+
 		if (oid_object_info_extended(the_repository, &oid, &oi, flags) < 0)
 			die("git cat-file: could not get object info");
+
+		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG)) {
+			size_t s = size;
+			buf = replace_idents_using_mailmap(buf, &s);
+			size = cast_size_t_to_ulong(s);
+		}
+
 		printf("%"PRIuMAX"\n", (uintmax_t)size);
 		ret = 0;
 		goto cleanup;
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index cd1cab3e54..b500b31c92 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -1022,4 +1022,33 @@ test_expect_success '--mailmap enables mailmap in cat-file for annotated tag obj
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	cat >expect <<-\EOF &&
+	209
+	220
+	EOF
+	git cat-file -s HEAD >actual &&
+	git cat-file --use-mailmap -s HEAD >>actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git cat-file -s returns correct size with --use-mailmap for tag objects' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	Orig <orig@example.com> C O Mitter <committer@example.com>
+	EOF
+	git tag -a -m "annotated tag" v3 &&
+	cat >expect <<-\EOF &&
+	141
+	130
+	EOF
+	git cat-file -s v3 >actual &&
+	git cat-file --use-mailmap -s v3 >>actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.38.1.282.g2e87897fbb


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 2/2] cat-file: add mailmap support to --batch-check option
  2022-10-29 10:24 ` [PATCH v3 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
  2022-10-29 10:24   ` [PATCH v3 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
@ 2022-10-29 10:24   ` Siddharth Asthana
  2022-10-31 11:43     ` Christian Couder
  2022-10-29 18:00   ` [PATCH v3 0/2] Add mailmap mechanism in cat-file options Taylor Blau
  2 siblings, 1 reply; 44+ messages in thread
From: Siddharth Asthana @ 2022-10-29 10:24 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, johncai86, Johannes.Schindelin,
	avarab, Siddharth Asthana

Even though the cat-file command with `--batch-check` option does not
complain when `--use-mailmap` option is given, the latter option is
ignored. Compute the size of the object after replacing the idents and
report it instead.

In order to make `--batch-check` option honour the mailmap mechanism we
have to read the contents of the commit/tag object.

There were two ways to do it:

1. Make two calls to `oid_object_info_extended()`. If `--use-mailmap`
   option is given, the first call will get us the type of the object
   and second call will only be made if the object type is either a
   commit or tag to get the contents of the object.

2. Make one call to `oid_object_info_extended()` to get the type of the
   object. Then, if the object type is either of commit or tag, make a
   call to `read_object_file()` to read the contents of the object.

I benchmarked the following command with both the above approaches and
compared against the current implementation where `--use-mailmap`
option is ignored:

`git cat-file --use-mailmap --batch-all-objects --batch-check --buffer
--unordered`

The results can be summarized as follows:
                       Time (mean ± σ)
default               827.7 ms ± 104.8 ms
first approach        6.197 s ± 0.093 s
second approach       1.975 s ± 0.217 s

Since, the second approach is faster than the first one, I implemented
it in this patch.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
---
 Documentation/git-cat-file.txt |  2 ++
 builtin/cat-file.c             | 14 ++++++++++++++
 t/t4203-mailmap.sh             | 30 ++++++++++++++++++++++++++++++
 3 files changed, 46 insertions(+)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index 594b6f2dfd..a8e7906b3d 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -103,6 +103,8 @@ OPTIONS
 	`--textconv` or `--filters`, in which case the input lines also
 	need to specify the path, separated by whitespace.  See the
 	section `BATCH OUTPUT` below for details.
+	If used with `--use-mailmap` option, will show the size of
+	updated object after replacing idents using the mailmap mechanism.
 
 --batch-command::
 --batch-command=<format>::
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 8a6e2343ec..39f2a2483f 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -444,6 +444,9 @@ static void batch_object_write(const char *obj_name,
 	if (!data->skip_object_info) {
 		int ret;
 
+		if (use_mailmap)
+			data->info.typep = &data->type;
+
 		if (pack)
 			ret = packed_object_info(the_repository, pack, offset,
 						 &data->info);
@@ -457,6 +460,17 @@ static void batch_object_write(const char *obj_name,
 			fflush(stdout);
 			return;
 		}
+
+		if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
+			size_t s = data->size;
+			char *buf = NULL;
+
+			buf = read_object_file(&data->oid, &data->type, &data->size);
+			buf = replace_idents_using_mailmap(buf, &s);
+			data->size = cast_size_t_to_ulong(s);
+
+			free(buf);
+		}
 	}
 
 	strbuf_reset(scratch);
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index b500b31c92..b4355c3e51 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -1051,4 +1051,34 @@ test_expect_success 'git cat-file -s returns correct size with --use-mailmap for
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file --batch-check returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	cat >expect <<-\EOF &&
+	92d99959b011b1cd69fe7be5315d6732fe302575 commit 209
+	92d99959b011b1cd69fe7be5315d6732fe302575 commit 220
+	EOF
+	echo "HEAD" >in &&
+	git cat-file --batch-check <in >actual &&
+	git cat-file --use-mailmap --batch-check <in >>actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git cat-file --batch-command returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	cat >expect <<-\EOF &&
+	92d99959b011b1cd69fe7be5315d6732fe302575 commit 209
+	92d99959b011b1cd69fe7be5315d6732fe302575 commit 220
+	EOF
+	echo "info HEAD" >in &&
+	git cat-file --batch-command <in >actual &&
+	git cat-file --use-mailmap --batch-command <in >>actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.38.1.282.g2e87897fbb


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 0/2] Add mailmap mechanism in cat-file options
  2022-10-29 10:24 ` [PATCH v3 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
  2022-10-29 10:24   ` [PATCH v3 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
  2022-10-29 10:24   ` [PATCH v3 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
@ 2022-10-29 18:00   ` Taylor Blau
  2 siblings, 0 replies; 44+ messages in thread
From: Taylor Blau @ 2022-10-29 18:00 UTC (permalink / raw)
  To: Siddharth Asthana
  Cc: git, christian.couder, gitster, johncai86, Johannes.Schindelin, avarab

On Sat, Oct 29, 2022 at 03:54:57PM +0530, Siddharth Asthana wrote:
> Siddharth Asthana (2):
>   cat-file: add mailmap support to -s option
>   cat-file: add mailmap support to --batch-check option
>
>  Documentation/git-cat-file.txt |  6 +++-
>  builtin/cat-file.c             | 27 ++++++++++++++++
>  t/t4203-mailmap.sh             | 59 ++++++++++++++++++++++++++++++++++
>  3 files changed, 91 insertions(+), 1 deletion(-)

This approach in this round looks OK to my eyes. Would some of the
other reviewers (perhaps Ævar or Christian) chime in and see if they
agree?

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 2/2] cat-file: add mailmap support to --batch-check option
  2022-10-29 10:24   ` [PATCH v3 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
@ 2022-10-31 11:43     ` Christian Couder
  0 siblings, 0 replies; 44+ messages in thread
From: Christian Couder @ 2022-10-31 11:43 UTC (permalink / raw)
  To: Siddharth Asthana; +Cc: git, gitster, johncai86, Johannes.Schindelin, avarab

On Sat, Oct 29, 2022 at 12:25 PM Siddharth Asthana
<siddharthasthana31@gmail.com> wrote:

> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index 594b6f2dfd..a8e7906b3d 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -103,6 +103,8 @@ OPTIONS
>         `--textconv` or `--filters`, in which case the input lines also
>         need to specify the path, separated by whitespace.  See the
>         section `BATCH OUTPUT` below for details.
> +       If used with `--use-mailmap` option, will show the size of
> +       updated object after replacing idents using the mailmap mechanism.

I think all documentation changes should probably go to the
documentation patch that was split off this series, instead of in this
series.

Whether the other patch gets merged before or after this series, I
think it will simplify things.

Otherwise, this LGTM, thanks!

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 1/2] cat-file: add mailmap support to -s option
  2022-10-29 10:24   ` [PATCH v3 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
@ 2022-10-31 11:49     ` Christian Couder
  0 siblings, 0 replies; 44+ messages in thread
From: Christian Couder @ 2022-10-31 11:49 UTC (permalink / raw)
  To: Siddharth Asthana; +Cc: git, gitster, johncai86, Johannes.Schindelin, avarab

On Sat, Oct 29, 2022 at 12:25 PM Siddharth Asthana
<siddharthasthana31@gmail.com> wrote:

> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index ec30b5c574..594b6f2dfd 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -45,7 +45,9 @@ OPTIONS
>
>  -s::
>         Instead of the content, show the object size identified by
> -       `<object>`.
> +       `<object>`. If used with `--use-mailmap` option, will show the
> +       size of updated object after replacing idents using the mailmap
> +       mechanism.

I think this should also go in the other documentation patch that was
split off from this series, but it is less clear cut here as there are
probably no conflicts with the other patch.

Otherwise this patch also LGTM, thanks!

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v4 0/3] Add mailmap mechanism in cat-file options
  2022-09-16 20:59 [PATCH 0/3] Add mailmap mechanism in --batch-check options Siddharth Asthana
                   ` (4 preceding siblings ...)
  2022-10-29 10:24 ` [PATCH v3 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
@ 2022-11-13 21:28 ` Siddharth Asthana
  2022-11-13 21:28   ` [PATCH v4 1/3] cat-file: add mailmap support to -s option Siddharth Asthana
                     ` (3 more replies)
  2022-11-20  7:48 ` [PATCH v5 0/2] " Siddharth Asthana
                   ` (2 subsequent siblings)
  8 siblings, 4 replies; 44+ messages in thread
From: Siddharth Asthana @ 2022-11-13 21:28 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, johncai86, Johannes.Schindelin,
	avarab, me, Siddharth Asthana

Thanks a lot Junio, Taylor and Christian for the review :) I have made
the suggested changes.

= Description

At present, `git-cat-file` command with `--batch-check` and `-s` options
does not complain when `--use-mailmap` option is given. The latter
option is just ignored. Instead, for commit/tag objects, the command
should compute the size of the object after replacing the idents and
report it. So, this patch series makes `-s` and `--batch-check` options
of `git-cat-file` honor mailmap when used with `--use-mailmap` option.

In this patch series we didn't want to change that '%(objectsize)'
always shows the size of the original object even when `--use-mailmap`
is set because first we have the long term plan to unify how the formats
for `git cat-file` and other commands works. And second existing formats
like the "pretty formats" used bt `git log` have different options for
fields respecting mailmap or not respecting it (%an is for author name
while %aN for author name respecting mailmap).

I would like to thank my mentors, Christian Couder and John Cai, for all
of their help!
Looking forward to the reviews!

= Patch Organization

- The first patch makes `-s` option to return updated size of the
  <commit/tag> object, when combined with `--use-mailmap` option, after
  replacing the idents using the mailmap mechanism.
- The second patch makes `--batch-check` option to return updated size of
  the <commit/tag> object, when combined with `--use-mailmap` option,
  after replacing the idents using the mailmap mechanism.
- The third patch improves the documentation of `-s`, `--batch`,
  `--batch-check` and `--batch-command` options by adding they can be
  combined with `--use-mailmap` options.

= Changes in v4

- Improve the documentation patch to clearly state that the `-s`,
  `--batch-check`, `--batch-command` and `--batch` options can be only
  be used with `--textconv`, `--filters` or `--use-mailmap`.

Siddharth Asthana (3):
  cat-file: add mailmap support to -s option
  cat-file: add mailmap support to --batch-check option
  doc/cat-file: allow --use-mailmap for --batch options

 Documentation/git-cat-file.txt | 53 ++++++++++++++++++++++--------
 builtin/cat-file.c             | 27 ++++++++++++++++
 t/t4203-mailmap.sh             | 59 ++++++++++++++++++++++++++++++++++
 3 files changed, 125 insertions(+), 14 deletions(-)

Range-diff against v3:
1:  38bb89d350 ! 1:  4ae3af37d2 cat-file: add mailmap support to -s option
    @@ Commit message
     
         Mentored-by: Christian Couder <christian.couder@gmail.com>
         Mentored-by: John Cai <johncai86@gmail.com>
    +    Helped-by: Taylor Blau <me@ttaylorr.com>
    +    Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
     
    - ## Documentation/git-cat-file.txt ##
    -@@ Documentation/git-cat-file.txt: OPTIONS
    - 
    - -s::
    - 	Instead of the content, show the object size identified by
    --	`<object>`.
    -+	`<object>`. If used with `--use-mailmap` option, will show the
    -+	size of updated object after replacing idents using the mailmap
    -+	mechanism.
    - 
    - -e::
    - 	Exit with zero status if `<object>` exists and is a valid
    -
      ## builtin/cat-file.c ##
     @@ builtin/cat-file.c: static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
      
2:  4d49cfde73 ! 2:  a692646228 cat-file: add mailmap support to --batch-check option
    @@ Commit message
     
         Mentored-by: Christian Couder <christian.couder@gmail.com>
         Mentored-by: John Cai <johncai86@gmail.com>
    +    Helped-by: Taylor Blau <me@ttaylorr.com>
    +    Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
     
    - ## Documentation/git-cat-file.txt ##
    -@@ Documentation/git-cat-file.txt: OPTIONS
    - 	except `--textconv` or `--filters`, in which case the input lines
    - 	also need to specify the path, separated by whitespace.  See the
    - 	section `BATCH OUTPUT` below for details.
    -+	If used with `--use-mailmap` option, will show the size of
    -+	updated object after replacing idents using the mailmap mechanism.
    - 
    - --batch-check::
    - --batch-check=<format>::
    -
      ## builtin/cat-file.c ##
     @@ builtin/cat-file.c: static void batch_object_write(const char *obj_name,
      	if (!data->skip_object_info) {
-:  ---------- > 3:  41b4650b24 doc/cat-file: allow --use-mailmap for --batch options
-- 
2.38.1.420.g319605f8f0


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v4 1/3] cat-file: add mailmap support to -s option
  2022-11-13 21:28 ` [PATCH v4 0/3] " Siddharth Asthana
@ 2022-11-13 21:28   ` Siddharth Asthana
  2022-11-13 21:28   ` [PATCH v4 2/3] cat-file: add mailmap support to --batch-check option Siddharth Asthana
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 44+ messages in thread
From: Siddharth Asthana @ 2022-11-13 21:28 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, johncai86, Johannes.Schindelin,
	avarab, me, Siddharth Asthana

Even though the cat-file command with `-s` option does not complain when
`--use-mailmap` option is given, the latter option is ignored. Compute
the size of the object after replacing the idents and report it instead.

In order to make `-s` option honour the mailmap mechanism we have to
read the contents of the commit/tag object. Make use of the call to
`oid_object_info_extended()` to get the contents of the object and store
in `buf`. `buf` is later freed in the function.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
---
 builtin/cat-file.c | 13 +++++++++++++
 t/t4203-mailmap.sh | 29 +++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index fa7bd89169..8a6e2343ec 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -132,8 +132,21 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
 
 	case 's':
 		oi.sizep = &size;
+
+		if (use_mailmap) {
+			oi.typep = &type;
+			oi.contentp = (void**)&buf;
+		}
+
 		if (oid_object_info_extended(the_repository, &oid, &oi, flags) < 0)
 			die("git cat-file: could not get object info");
+
+		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG)) {
+			size_t s = size;
+			buf = replace_idents_using_mailmap(buf, &s);
+			size = cast_size_t_to_ulong(s);
+		}
+
 		printf("%"PRIuMAX"\n", (uintmax_t)size);
 		ret = 0;
 		goto cleanup;
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index cd1cab3e54..b500b31c92 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -1022,4 +1022,33 @@ test_expect_success '--mailmap enables mailmap in cat-file for annotated tag obj
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	cat >expect <<-\EOF &&
+	209
+	220
+	EOF
+	git cat-file -s HEAD >actual &&
+	git cat-file --use-mailmap -s HEAD >>actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git cat-file -s returns correct size with --use-mailmap for tag objects' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	Orig <orig@example.com> C O Mitter <committer@example.com>
+	EOF
+	git tag -a -m "annotated tag" v3 &&
+	cat >expect <<-\EOF &&
+	141
+	130
+	EOF
+	git cat-file -s v3 >actual &&
+	git cat-file --use-mailmap -s v3 >>actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.38.1.420.g319605f8f0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v4 2/3] cat-file: add mailmap support to --batch-check option
  2022-11-13 21:28 ` [PATCH v4 0/3] " Siddharth Asthana
  2022-11-13 21:28   ` [PATCH v4 1/3] cat-file: add mailmap support to -s option Siddharth Asthana
@ 2022-11-13 21:28   ` Siddharth Asthana
  2022-11-15 21:40     ` Taylor Blau
  2022-11-13 21:28   ` [PATCH v4 3/3] doc/cat-file: allow --use-mailmap for --batch options Siddharth Asthana
  2022-11-14 17:48   ` [PATCH v4 0/3] Add mailmap mechanism in cat-file options Christian Couder
  3 siblings, 1 reply; 44+ messages in thread
From: Siddharth Asthana @ 2022-11-13 21:28 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, johncai86, Johannes.Schindelin,
	avarab, me, Siddharth Asthana

Even though the cat-file command with `--batch-check` option does not
complain when `--use-mailmap` option is given, the latter option is
ignored. Compute the size of the object after replacing the idents and
report it instead.

In order to make `--batch-check` option honour the mailmap mechanism we
have to read the contents of the commit/tag object.

There were two ways to do it:

1. Make two calls to `oid_object_info_extended()`. If `--use-mailmap`
   option is given, the first call will get us the type of the object
   and second call will only be made if the object type is either a
   commit or tag to get the contents of the object.

2. Make one call to `oid_object_info_extended()` to get the type of the
   object. Then, if the object type is either of commit or tag, make a
   call to `read_object_file()` to read the contents of the object.

I benchmarked the following command with both the above approaches and
compared against the current implementation where `--use-mailmap`
option is ignored:

`git cat-file --use-mailmap --batch-all-objects --batch-check --buffer
--unordered`

The results can be summarized as follows:
                       Time (mean ± σ)
default               827.7 ms ± 104.8 ms
first approach        6.197 s ± 0.093 s
second approach       1.975 s ± 0.217 s

Since, the second approach is faster than the first one, I implemented
it in this patch.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
---
 builtin/cat-file.c | 14 ++++++++++++++
 t/t4203-mailmap.sh | 30 ++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 8a6e2343ec..39f2a2483f 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -444,6 +444,9 @@ static void batch_object_write(const char *obj_name,
 	if (!data->skip_object_info) {
 		int ret;
 
+		if (use_mailmap)
+			data->info.typep = &data->type;
+
 		if (pack)
 			ret = packed_object_info(the_repository, pack, offset,
 						 &data->info);
@@ -457,6 +460,17 @@ static void batch_object_write(const char *obj_name,
 			fflush(stdout);
 			return;
 		}
+
+		if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
+			size_t s = data->size;
+			char *buf = NULL;
+
+			buf = read_object_file(&data->oid, &data->type, &data->size);
+			buf = replace_idents_using_mailmap(buf, &s);
+			data->size = cast_size_t_to_ulong(s);
+
+			free(buf);
+		}
 	}
 
 	strbuf_reset(scratch);
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index b500b31c92..b4355c3e51 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -1051,4 +1051,34 @@ test_expect_success 'git cat-file -s returns correct size with --use-mailmap for
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file --batch-check returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	cat >expect <<-\EOF &&
+	92d99959b011b1cd69fe7be5315d6732fe302575 commit 209
+	92d99959b011b1cd69fe7be5315d6732fe302575 commit 220
+	EOF
+	echo "HEAD" >in &&
+	git cat-file --batch-check <in >actual &&
+	git cat-file --use-mailmap --batch-check <in >>actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git cat-file --batch-command returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	cat >expect <<-\EOF &&
+	92d99959b011b1cd69fe7be5315d6732fe302575 commit 209
+	92d99959b011b1cd69fe7be5315d6732fe302575 commit 220
+	EOF
+	echo "info HEAD" >in &&
+	git cat-file --batch-command <in >actual &&
+	git cat-file --use-mailmap --batch-command <in >>actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.38.1.420.g319605f8f0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v4 3/3] doc/cat-file: allow --use-mailmap for --batch options
  2022-11-13 21:28 ` [PATCH v4 0/3] " Siddharth Asthana
  2022-11-13 21:28   ` [PATCH v4 1/3] cat-file: add mailmap support to -s option Siddharth Asthana
  2022-11-13 21:28   ` [PATCH v4 2/3] cat-file: add mailmap support to --batch-check option Siddharth Asthana
@ 2022-11-13 21:28   ` Siddharth Asthana
  2022-11-14 17:48   ` [PATCH v4 0/3] Add mailmap mechanism in cat-file options Christian Couder
  3 siblings, 0 replies; 44+ messages in thread
From: Siddharth Asthana @ 2022-11-13 21:28 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, johncai86, Johannes.Schindelin,
	avarab, me, Siddharth Asthana

The command git cat-file can now use the mailmap mechanism to replace
idents with their canonical versions for commit and tag objects. There
are several options like `-s`, `--batch`, `--batch-check` and
`--batch-command` that can be combined with `--use-mailmap`. But the
documentation for `-s`, `--batch`, `--batch-check` and `--batch-command`
doesn't say so. This patch fixes that documentation.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
---
 Documentation/git-cat-file.txt | 53 +++++++++++++++++++++++++---------
 1 file changed, 39 insertions(+), 14 deletions(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index ec30b5c574..81235c60a3 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -45,7 +45,9 @@ OPTIONS
 
 -s::
 	Instead of the content, show the object size identified by
-	`<object>`.
+	`<object>`. If used with `--use-mailmap` option, will show
+	the size of updated object after replacing idents using the
+	mailmap mechanism.
 
 -e::
 	Exit with zero status if `<object>` exists and is a valid
@@ -89,26 +91,49 @@ OPTIONS
 --batch::
 --batch=<format>::
 	Print object information and contents for each object provided
-	on stdin.  May not be combined with any other options or arguments
-	except `--textconv` or `--filters`, in which case the input lines
-	also need to specify the path, separated by whitespace.  See the
-	section `BATCH OUTPUT` below for details.
+	on stdin. May not be combined with any other options or arguments
+	except --textconv, --filters, or --use-mailmap.
+	+
+	* When used with `--textconv` or `--filters`, the input lines
+	  must specify the path, separated by whitespace. See the section
+	  `BATCH OUTPUT` below for details.
+	+
+	* When used with `--use-mailmap`, for commit and tag objects, the
+	  contents part of the output shows the identities replaced using the
+	  mailmap mechanism, while the information part of the output shows
+	  the size of the object as if it actually recorded the replacement
+	  identities.
 
 --batch-check::
 --batch-check=<format>::
-	Print object information for each object provided on stdin.  May
-	not be combined with any other options or arguments except
-	`--textconv` or `--filters`, in which case the input lines also
-	need to specify the path, separated by whitespace.  See the
-	section `BATCH OUTPUT` below for details.
+	Print object information for each object provided on stdin. May not be
+	combined with any other options or arguments except --textconv, --filters
+	or --use-mailmap.
+	+
+	* When used with `--textconv` or `--filters`, the input lines must
+	 specify the path, separated by whitespace. See the section
+	 `BATCH OUTPUT` below for details.
+	+
+	* When used with `--use-mailmap`, for commit and tag objects, the
+	  printed object information shows the size of the object as if the
+	  identities recorded in it were replaced by the mailmap mechanism.
 
 --batch-command::
 --batch-command=<format>::
 	Enter a command mode that reads commands and arguments from stdin. May
-	only be combined with `--buffer`, `--textconv` or `--filters`. In the
-	case of `--textconv` or `--filters`, the input lines also need to specify
-	the path, separated by whitespace. See the section `BATCH OUTPUT` below
-	for details.
+	only be combined with `--buffer`, `--textconv`, `--use-mailmap` or
+	`--filters`.
+	+
+	* When used with `--textconv` or `--filters`, the input lines must
+	  specify the path, separated by whitespace. See the section
+	  `BATCH OUTPUT` below for details.
+	+
+	* When used with `--use-mailmap`, for commit and tag objects, the
+	  `contents` command shows the identities replaced using the
+	  mailmap mechanism, while the `info` command shows the size
+	  of the object as if it actually recorded the replacement
+	  identities.
+
 +
 `--batch-command` recognizes the following commands:
 +
-- 
2.38.1.420.g319605f8f0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v4 0/3] Add mailmap mechanism in cat-file options
  2022-11-13 21:28 ` [PATCH v4 0/3] " Siddharth Asthana
                     ` (2 preceding siblings ...)
  2022-11-13 21:28   ` [PATCH v4 3/3] doc/cat-file: allow --use-mailmap for --batch options Siddharth Asthana
@ 2022-11-14 17:48   ` Christian Couder
  2022-11-14 22:30     ` Taylor Blau
  2022-11-20  7:42     ` Siddharth Asthana
  3 siblings, 2 replies; 44+ messages in thread
From: Christian Couder @ 2022-11-14 17:48 UTC (permalink / raw)
  To: Siddharth Asthana
  Cc: git, gitster, johncai86, Johannes.Schindelin, avarab, me

On Sun, Nov 13, 2022 at 10:28 PM Siddharth Asthana
<siddharthasthana31@gmail.com> wrote:

> In this patch series we didn't want to change that '%(objectsize)'
> always shows the size of the original object even when `--use-mailmap`
> is set because first we have the long term plan to unify how the formats
> for `git cat-file` and other commands works. And second existing formats
> like the "pretty formats" used bt `git log` have different options for

s/used bt/used by/

> fields respecting mailmap or not respecting it (%an is for author name
> while %aN for author name respecting mailmap).

[...]

> = Patch Organization
>
> - The first patch makes `-s` option to return updated size of the
>   <commit/tag> object, when combined with `--use-mailmap` option, after
>   replacing the idents using the mailmap mechanism.
> - The second patch makes `--batch-check` option to return updated size of
>   the <commit/tag> object, when combined with `--use-mailmap` option,
>   after replacing the idents using the mailmap mechanism.
> - The third patch improves the documentation of `-s`, `--batch`,
>   `--batch-check` and `--batch-command` options by adding they can be
>   combined with `--use-mailmap` options.

So the documentation patch is now part of this small series again.

Even if this documentation patch is a bug fix, it might be better at
this point to squash this patch into the patches 1/3 and 2/3. At least
I think that would better follow Junio's last comments about this. If
you go this way, you might want to squash the documentation parts
about -s from patch 3/3 into patch 1/3 and the rest of patch 3/3 into
patch 2/3.

> = Changes in v4
>
> - Improve the documentation patch to clearly state that the `-s`,
>   `--batch-check`, `--batch-command` and `--batch` options can be only
>   be used with `--textconv`, `--filters` or `--use-mailmap`.

Here I think you should say that the documentation patch is part of
this series again and explain a bit why.

Anyway I took a look at the actual patches in this series and they
look good to me now. So I would be Ok to merge it either as is or with
patch 3/3 squashed into patches 1/3 and 2/3 as discussed above.

Thanks!

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v4 0/3] Add mailmap mechanism in cat-file options
  2022-11-14 17:48   ` [PATCH v4 0/3] Add mailmap mechanism in cat-file options Christian Couder
@ 2022-11-14 22:30     ` Taylor Blau
  2022-11-20  7:42     ` Siddharth Asthana
  1 sibling, 0 replies; 44+ messages in thread
From: Taylor Blau @ 2022-11-14 22:30 UTC (permalink / raw)
  To: Christian Couder
  Cc: Siddharth Asthana, git, gitster, johncai86, Johannes.Schindelin,
	avarab, me

On Mon, Nov 14, 2022 at 06:48:54PM +0100, Christian Couder wrote:
> > = Patch Organization
> >
> > - The first patch makes `-s` option to return updated size of the
> >   <commit/tag> object, when combined with `--use-mailmap` option, after
> >   replacing the idents using the mailmap mechanism.
> > - The second patch makes `--batch-check` option to return updated size of
> >   the <commit/tag> object, when combined with `--use-mailmap` option,
> >   after replacing the idents using the mailmap mechanism.
> > - The third patch improves the documentation of `-s`, `--batch`,
> >   `--batch-check` and `--batch-command` options by adding they can be
> >   combined with `--use-mailmap` options.
>
> So the documentation patch is now part of this small series again.

For what it's worth, I'm fine to include the documentation updates in
this series. It makes it vaguely easier to queue, but I think if we're
close to merging this one down, then there's no reason to separate the
two.

> Even if this documentation patch is a bug fix, it might be better at
> this point to squash this patch into the patches 1/3 and 2/3. At least
> I think that would better follow Junio's last comments about this. If
> you go this way, you might want to squash the documentation parts
> about -s from patch 3/3 into patch 1/3 and the rest of patch 3/3 into
> patch 2/3.

Thanks. I agree with you that following Junio's advice in this instance
is good to do. Let's wait for one hopefully-final reroll and then start
merging this down.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v4 2/3] cat-file: add mailmap support to --batch-check option
  2022-11-13 21:28   ` [PATCH v4 2/3] cat-file: add mailmap support to --batch-check option Siddharth Asthana
@ 2022-11-15 21:40     ` Taylor Blau
  0 siblings, 0 replies; 44+ messages in thread
From: Taylor Blau @ 2022-11-15 21:40 UTC (permalink / raw)
  To: Siddharth Asthana
  Cc: git, christian.couder, gitster, johncai86, Johannes.Schindelin, avarab

On Mon, Nov 14, 2022 at 02:58:29AM +0530, Siddharth Asthana wrote:
> @@ -1051,4 +1051,34 @@ test_expect_success 'git cat-file -s returns correct size with --use-mailmap for
>  	test_cmp expect actual
>  '
>
> +test_expect_success 'git cat-file --batch-check returns correct size with --use-mailmap' '
> +	test_when_finished "rm .mailmap" &&
> +	cat >.mailmap <<-\EOF &&
> +	C O Mitter <committer@example.com> Orig <orig@example.com>
> +	EOF
> +	cat >expect <<-\EOF &&
> +	92d99959b011b1cd69fe7be5315d6732fe302575 commit 209
> +	92d99959b011b1cd69fe7be5315d6732fe302575 commit 220
> +	EOF
> +	echo "HEAD" >in &&
> +	git cat-file --batch-check <in >actual &&
> +	git cat-file --use-mailmap --batch-check <in >>actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git cat-file --batch-command returns correct size with --use-mailmap' '
> +	test_when_finished "rm .mailmap" &&
> +	cat >.mailmap <<-\EOF &&
> +	C O Mitter <committer@example.com> Orig <orig@example.com>
> +	EOF
> +	cat >expect <<-\EOF &&
> +	92d99959b011b1cd69fe7be5315d6732fe302575 commit 209
> +	92d99959b011b1cd69fe7be5315d6732fe302575 commit 220
> +	EOF
> +	echo "info HEAD" >in &&
> +	git cat-file --batch-command <in >actual &&
> +	git cat-file --use-mailmap --batch-command <in >>actual &&
> +	test_cmp expect actual
> +'
> +
>  test_done

These two tests (and the -s ones above it) are not going to work when
run under SHA-256 mode, i.e., with 'GIT_TEST_DEFAULT_HASH=sha256' in the
environment:

  not ok 66 - git cat-file -s returns correct size with --use-mailmap
  not ok 67 - git cat-file -s returns correct size with --use-mailmap for tag objects
  not ok 68 - git cat-file --batch-check returns correct size with --use-mailmap
  not ok 69 - git cat-file --batch-command returns correct size with --use-mailmap

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v4 0/3] Add mailmap mechanism in cat-file options
  2022-11-14 17:48   ` [PATCH v4 0/3] Add mailmap mechanism in cat-file options Christian Couder
  2022-11-14 22:30     ` Taylor Blau
@ 2022-11-20  7:42     ` Siddharth Asthana
  1 sibling, 0 replies; 44+ messages in thread
From: Siddharth Asthana @ 2022-11-20  7:42 UTC (permalink / raw)
  To: Christian Couder; +Cc: git, gitster, johncai86, Johannes.Schindelin, avarab, me



On 14/11/22 23:18, Christian Couder wrote:
> On Sun, Nov 13, 2022 at 10:28 PM Siddharth Asthana
> <siddharthasthana31@gmail.com> wrote:
> 
>> In this patch series we didn't want to change that '%(objectsize)'
>> always shows the size of the original object even when `--use-mailmap`
>> is set because first we have the long term plan to unify how the formats
>> for `git cat-file` and other commands works. And second existing formats
>> like the "pretty formats" used bt `git log` have different options for
> 
> s/used bt/used by/
> 
>> fields respecting mailmap or not respecting it (%an is for author name
>> while %aN for author name respecting mailmap).
> 
> [...]
> 
>> = Patch Organization
>>
>> - The first patch makes `-s` option to return updated size of the
>>    <commit/tag> object, when combined with `--use-mailmap` option, after
>>    replacing the idents using the mailmap mechanism.
>> - The second patch makes `--batch-check` option to return updated size of
>>    the <commit/tag> object, when combined with `--use-mailmap` option,
>>    after replacing the idents using the mailmap mechanism.
>> - The third patch improves the documentation of `-s`, `--batch`,
>>    `--batch-check` and `--batch-command` options by adding they can be
>>    combined with `--use-mailmap` options.
> 
> So the documentation patch is now part of this small series again.
> 
> Even if this documentation patch is a bug fix, it might be better at
> this point to squash this patch into the patches 1/3 and 2/3. At least
> I think that would better follow Junio's last comments about this. If
> you go this way, you might want to squash the documentation parts
> about -s from patch 3/3 into patch 1/3 and the rest of patch 3/3 into
> patch 2/3.
> 
>> = Changes in v4
>>
>> - Improve the documentation patch to clearly state that the `-s`,
>>    `--batch-check`, `--batch-command` and `--batch` options can be only
>>    be used with `--textconv`, `--filters` or `--use-mailmap`.
> 
> Here I think you should say that the documentation patch is part of
> this series again and explain a bit why.
> 
> Anyway I took a look at the actual patches in this series and they
> look good to me now. So I would be Ok to merge it either as is or with
> patch 3/3 squashed into patches 1/3 and 2/3 as discussed above.
> 
> Thanks!

Thanks a lot for the review Christian :) I have made the suggested 
changes in v5.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v5 0/2] Add mailmap mechanism in cat-file options
  2022-09-16 20:59 [PATCH 0/3] Add mailmap mechanism in --batch-check options Siddharth Asthana
                   ` (5 preceding siblings ...)
  2022-11-13 21:28 ` [PATCH v4 0/3] " Siddharth Asthana
@ 2022-11-20  7:48 ` Siddharth Asthana
  2022-11-20  7:48   ` [PATCH v5 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
  2022-11-20  7:48   ` [PATCH v5 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
  2022-12-01 15:55 ` [PATCH v6 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
  2022-12-20  6:01 ` [PATCH v7 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
  8 siblings, 2 replies; 44+ messages in thread
From: Siddharth Asthana @ 2022-11-20  7:48 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, johncai86, Johannes.Schindelin,
	avarab, me, Siddharth Asthana

Thanks a lot Christian, Taylor and Junio for review :) I have made the
suggested in this patch series.

= Description

At present, `git-cat-file` command with `--batch-check` and `-s` options
does not complain when `--use-mailmap` option is given. The latter
option is just ignored. Instead, for commit/tag objects, the command
should compute the size of the object after replacing the idents and
report it. So, this patch series makes `-s` and `--batch-check` options
of `git-cat-file` honor mailmap when used with `--use-mailmap` option.

In this patch series we didn't want to change that '%(objectsize)'
always shows the size of the original object even when `--use-mailmap`
is set because first we have the long term plan to unify how the formats
for `git cat-file` and other commands works. And second existing formats
like the "pretty formats" used by `git log` have different options for
fields respecting mailmap or not respecting it (%an is for author name
while %aN for author name respecting mailmap).

I would like to thank my mentors, Christian Couder and John Cai, for all
of their help!
Looking forward to the reviews!

= Patch Organization

- The first patch makes `-s` option to return updated size of the
  <commit/tag> object, when combined with `--use-mailmap` option, after
  replacing the idents using the mailmap mechanism.
- The second patch makes `--batch-check` option to return updated size of
  the <commit/tag> object, when combined with `--use-mailmap` option,
  after replacing the idents using the mailmap mechanism.

= Changes in v5

- The patch series which improves the documentation for `-s`, `--batch`,
  `--batch-check` and `--batch-command` is again part of this patch
  series with patch 3/3 squashed into patches 1/3 and 2/3 as suggested
  by Junio, Taylor and Christian. The doc patch series was perviously
  sent independently for improving the documentation of git cat-file
  options:
  https://lore.kernel.org/git/20221029092513.73982-1-siddharthasthana31@gmail.com/

- Improve the tests according to run under SHA-256 mode.

Siddharth Asthana (2):
  cat-file: add mailmap support to -s option
  cat-file: add mailmap support to --batch-check option

 Documentation/git-cat-file.txt | 53 ++++++++++++++++++++++---------
 builtin/cat-file.c             | 27 ++++++++++++++++
 t/t4203-mailmap.sh             | 57 ++++++++++++++++++++++++++++++++++
 3 files changed, 123 insertions(+), 14 deletions(-)

Range-diff against v4:
1:  4ae3af37d2 ! 1:  0db3583535 cat-file: add mailmap support to -s option
    @@ Commit message
         Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
     
    + ## Documentation/git-cat-file.txt ##
    +@@ Documentation/git-cat-file.txt: OPTIONS
    + 
    + -s::
    + 	Instead of the content, show the object size identified by
    +-	`<object>`.
    ++	`<object>`. If used with `--use-mailmap` option, will show
    ++	the size of updated object after replacing idents using the
    ++	mailmap mechanism.
    + 
    + -e::
    + 	Exit with zero status if `<object>` exists and is a valid
    +
      ## builtin/cat-file.c ##
     @@ builtin/cat-file.c: static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
      
    @@ t/t4203-mailmap.sh: test_expect_success '--mailmap enables mailmap in cat-file f
     +	cat >.mailmap <<-\EOF &&
     +	C O Mitter <committer@example.com> Orig <orig@example.com>
     +	EOF
    -+	cat >expect <<-\EOF &&
    -+	209
    -+	220
    -+	EOF
    ++	git cat-file commit HEAD | wc -c >expect &&
    ++	git cat-file --use-mailmap commit HEAD | wc -c >>expect &&
     +	git cat-file -s HEAD >actual &&
     +	git cat-file --use-mailmap -s HEAD >>actual &&
     +	test_cmp expect actual
    @@ t/t4203-mailmap.sh: test_expect_success '--mailmap enables mailmap in cat-file f
     +	Orig <orig@example.com> C O Mitter <committer@example.com>
     +	EOF
     +	git tag -a -m "annotated tag" v3 &&
    -+	cat >expect <<-\EOF &&
    -+	141
    -+	130
    -+	EOF
    ++	git cat-file tag v3 | wc -c >expect &&
    ++	git cat-file --use-mailmap tag v3 | wc -c >>expect &&
     +	git cat-file -s v3 >actual &&
     +	git cat-file --use-mailmap -s v3 >>actual &&
     +	test_cmp expect actual
2:  a692646228 < -:  ---------- cat-file: add mailmap support to --batch-check option
3:  41b4650b24 ! 2:  b8205ede7d doc/cat-file: allow --use-mailmap for --batch options
    @@ Metadata
     Author: Siddharth Asthana <siddharthasthana31@gmail.com>
     
      ## Commit message ##
    -    doc/cat-file: allow --use-mailmap for --batch options
    +    cat-file: add mailmap support to --batch-check option
    +
    +    Even though the cat-file command with `--batch-check` option does not
    +    complain when `--use-mailmap` option is given, the latter option is
    +    ignored. Compute the size of the object after replacing the idents and
    +    report it instead.
    +
    +    In order to make `--batch-check` option honour the mailmap mechanism we
    +    have to read the contents of the commit/tag object.
    +
    +    There were two ways to do it:
    +
    +    1. Make two calls to `oid_object_info_extended()`. If `--use-mailmap`
    +       option is given, the first call will get us the type of the object
    +       and second call will only be made if the object type is either a
    +       commit or tag to get the contents of the object.
    +
    +    2. Make one call to `oid_object_info_extended()` to get the type of the
    +       object. Then, if the object type is either of commit or tag, make a
    +       call to `read_object_file()` to read the contents of the object.
    +
    +    I benchmarked the following command with both the above approaches and
    +    compared against the current implementation where `--use-mailmap`
    +    option is ignored:
    +
    +    `git cat-file --use-mailmap --batch-all-objects --batch-check --buffer
    +    --unordered`
    +
    +    The results can be summarized as follows:
    +                           Time (mean ± σ)
    +    default               827.7 ms ± 104.8 ms
    +    first approach        6.197 s ± 0.093 s
    +    second approach       1.975 s ± 0.217 s
    +
    +    Since, the second approach is faster than the first one, I implemented
    +    it in this patch.
     
         The command git cat-file can now use the mailmap mechanism to replace
    -    idents with their canonical versions for commit and tag objects. There
    -    are several options like `-s`, `--batch`, `--batch-check` and
    -    `--batch-command` that can be combined with `--use-mailmap`. But the
    -    documentation for `-s`, `--batch`, `--batch-check` and `--batch-command`
    -    doesn't say so. This patch fixes that documentation.
    +    idents with canonical versions for commit and tag objects. There are
    +    several options like `--batch`, `--batch-check` and `--batch-command`
    +    that can be combined with `--use-mailmap`. But the documentation for
    +    `--batch`, `--batch-check` and `--batch-command` doesn't say so. This
    +    patch fixes that documentation.
     
         Mentored-by: Christian Couder <christian.couder@gmail.com>
         Mentored-by: John Cai <johncai86@gmail.com>
         Helped-by: Taylor Blau <me@ttaylorr.com>
    +    Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
     
      ## Documentation/git-cat-file.txt ##
     @@ Documentation/git-cat-file.txt: OPTIONS
    - 
    - -s::
    - 	Instead of the content, show the object size identified by
    --	`<object>`.
    -+	`<object>`. If used with `--use-mailmap` option, will show
    -+	the size of updated object after replacing idents using the
    -+	mailmap mechanism.
    - 
    - -e::
    - 	Exit with zero status if `<object>` exists and is a valid
    -@@ Documentation/git-cat-file.txt: OPTIONS
      --batch::
      --batch=<format>::
      	Print object information and contents for each object provided
    @@ Documentation/git-cat-file.txt: OPTIONS
      +
      `--batch-command` recognizes the following commands:
      +
    +
    + ## builtin/cat-file.c ##
    +@@ builtin/cat-file.c: static void batch_object_write(const char *obj_name,
    + 	if (!data->skip_object_info) {
    + 		int ret;
    + 
    ++		if (use_mailmap)
    ++			data->info.typep = &data->type;
    ++
    + 		if (pack)
    + 			ret = packed_object_info(the_repository, pack, offset,
    + 						 &data->info);
    +@@ builtin/cat-file.c: static void batch_object_write(const char *obj_name,
    + 			fflush(stdout);
    + 			return;
    + 		}
    ++
    ++		if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
    ++			size_t s = data->size;
    ++			char *buf = NULL;
    ++
    ++			buf = read_object_file(&data->oid, &data->type, &data->size);
    ++			buf = replace_idents_using_mailmap(buf, &s);
    ++			data->size = cast_size_t_to_ulong(s);
    ++
    ++			free(buf);
    ++		}
    + 	}
    + 
    + 	strbuf_reset(scratch);
    +
    + ## t/t4203-mailmap.sh ##
    +@@ t/t4203-mailmap.sh: test_expect_success 'git cat-file -s returns correct size with --use-mailmap for
    + 	test_cmp expect actual
    + '
    + 
    ++test_expect_success 'git cat-file --batch-check returns correct size with --use-mailmap' '
    ++	test_when_finished "rm .mailmap" &&
    ++	cat >.mailmap <<-\EOF &&
    ++	C O Mitter <committer@example.com> Orig <orig@example.com>
    ++	EOF
    ++	commit_size=`git cat-file commit HEAD | wc -c` &&
    ++	commit_sha=`git log --pretty=format:'%H' -n 1` &&
    ++	echo "$commit_sha commit $commit_size" >expect &&
    ++	commit_size=`git cat-file --use-mailmap commit HEAD | wc -c` &&
    ++	echo "$commit_sha commit $commit_size" >>expect &&
    ++	echo "HEAD" >in &&
    ++	git cat-file --batch-check <in >actual &&
    ++	git cat-file --use-mailmap --batch-check <in >>actual &&
    ++	test_cmp expect actual
    ++'
    ++
    ++test_expect_success 'git cat-file --batch-command returns correct size with --use-mailmap' '
    ++	test_when_finished "rm .mailmap" &&
    ++	cat >.mailmap <<-\EOF &&
    ++	C O Mitter <committer@example.com> Orig <orig@example.com>
    ++	EOF
    ++	commit_size=`git cat-file commit HEAD | wc -c` &&
    ++	commit_sha=`git log --pretty=format:'%H' -n 1` &&
    ++	echo "$commit_sha commit $commit_size" >expect &&
    ++	commit_size=`git cat-file --use-mailmap commit HEAD | wc -c` &&
    ++	echo "$commit_sha commit $commit_size" >>expect &&
    ++	echo "info HEAD" >in &&
    ++	git cat-file --batch-command <in >actual &&
    ++	git cat-file --use-mailmap --batch-command <in >>actual &&
    ++	test_cmp expect actual
    ++'
    ++
    + test_done
-- 
2.38.1.438.gd2340c8913


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v5 1/2] cat-file: add mailmap support to -s option
  2022-11-20  7:48 ` [PATCH v5 0/2] " Siddharth Asthana
@ 2022-11-20  7:48   ` Siddharth Asthana
  2022-11-21  7:27     ` Junio C Hamano
  2022-11-20  7:48   ` [PATCH v5 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
  1 sibling, 1 reply; 44+ messages in thread
From: Siddharth Asthana @ 2022-11-20  7:48 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, johncai86, Johannes.Schindelin,
	avarab, me, Siddharth Asthana

Even though the cat-file command with `-s` option does not complain when
`--use-mailmap` option is given, the latter option is ignored. Compute
the size of the object after replacing the idents and report it instead.

In order to make `-s` option honour the mailmap mechanism we have to
read the contents of the commit/tag object. Make use of the call to
`oid_object_info_extended()` to get the contents of the object and store
in `buf`. `buf` is later freed in the function.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
---
 Documentation/git-cat-file.txt |  4 +++-
 builtin/cat-file.c             | 13 +++++++++++++
 t/t4203-mailmap.sh             | 25 +++++++++++++++++++++++++
 3 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index ec30b5c574..f82d702d6b 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -45,7 +45,9 @@ OPTIONS
 
 -s::
 	Instead of the content, show the object size identified by
-	`<object>`.
+	`<object>`. If used with `--use-mailmap` option, will show
+	the size of updated object after replacing idents using the
+	mailmap mechanism.
 
 -e::
 	Exit with zero status if `<object>` exists and is a valid
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index fa7bd89169..8a6e2343ec 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -132,8 +132,21 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
 
 	case 's':
 		oi.sizep = &size;
+
+		if (use_mailmap) {
+			oi.typep = &type;
+			oi.contentp = (void**)&buf;
+		}
+
 		if (oid_object_info_extended(the_repository, &oid, &oi, flags) < 0)
 			die("git cat-file: could not get object info");
+
+		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG)) {
+			size_t s = size;
+			buf = replace_idents_using_mailmap(buf, &s);
+			size = cast_size_t_to_ulong(s);
+		}
+
 		printf("%"PRIuMAX"\n", (uintmax_t)size);
 		ret = 0;
 		goto cleanup;
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index cd1cab3e54..87b77fc5c9 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -1022,4 +1022,29 @@ test_expect_success '--mailmap enables mailmap in cat-file for annotated tag obj
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	git cat-file commit HEAD | wc -c >expect &&
+	git cat-file --use-mailmap commit HEAD | wc -c >>expect &&
+	git cat-file -s HEAD >actual &&
+	git cat-file --use-mailmap -s HEAD >>actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git cat-file -s returns correct size with --use-mailmap for tag objects' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	Orig <orig@example.com> C O Mitter <committer@example.com>
+	EOF
+	git tag -a -m "annotated tag" v3 &&
+	git cat-file tag v3 | wc -c >expect &&
+	git cat-file --use-mailmap tag v3 | wc -c >>expect &&
+	git cat-file -s v3 >actual &&
+	git cat-file --use-mailmap -s v3 >>actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.38.1.438.gd2340c8913


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 2/2] cat-file: add mailmap support to --batch-check option
  2022-11-20  7:48 ` [PATCH v5 0/2] " Siddharth Asthana
  2022-11-20  7:48   ` [PATCH v5 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
@ 2022-11-20  7:48   ` Siddharth Asthana
  2022-11-21  7:38     ` Junio C Hamano
  1 sibling, 1 reply; 44+ messages in thread
From: Siddharth Asthana @ 2022-11-20  7:48 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, johncai86, Johannes.Schindelin,
	avarab, me, Siddharth Asthana

Even though the cat-file command with `--batch-check` option does not
complain when `--use-mailmap` option is given, the latter option is
ignored. Compute the size of the object after replacing the idents and
report it instead.

In order to make `--batch-check` option honour the mailmap mechanism we
have to read the contents of the commit/tag object.

There were two ways to do it:

1. Make two calls to `oid_object_info_extended()`. If `--use-mailmap`
   option is given, the first call will get us the type of the object
   and second call will only be made if the object type is either a
   commit or tag to get the contents of the object.

2. Make one call to `oid_object_info_extended()` to get the type of the
   object. Then, if the object type is either of commit or tag, make a
   call to `read_object_file()` to read the contents of the object.

I benchmarked the following command with both the above approaches and
compared against the current implementation where `--use-mailmap`
option is ignored:

`git cat-file --use-mailmap --batch-all-objects --batch-check --buffer
--unordered`

The results can be summarized as follows:
                       Time (mean ± σ)
default               827.7 ms ± 104.8 ms
first approach        6.197 s ± 0.093 s
second approach       1.975 s ± 0.217 s

Since, the second approach is faster than the first one, I implemented
it in this patch.

The command git cat-file can now use the mailmap mechanism to replace
idents with canonical versions for commit and tag objects. There are
several options like `--batch`, `--batch-check` and `--batch-command`
that can be combined with `--use-mailmap`. But the documentation for
`--batch`, `--batch-check` and `--batch-command` doesn't say so. This
patch also fixes that documentation.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
---
 Documentation/git-cat-file.txt | 49 +++++++++++++++++++++++++---------
 builtin/cat-file.c             | 14 ++++++++++
 t/t4203-mailmap.sh             | 32 ++++++++++++++++++++++
 3 files changed, 82 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index f82d702d6b..81235c60a3 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -91,26 +91,49 @@ OPTIONS
 --batch::
 --batch=<format>::
 	Print object information and contents for each object provided
-	on stdin.  May not be combined with any other options or arguments
-	except `--textconv` or `--filters`, in which case the input lines
-	also need to specify the path, separated by whitespace.  See the
-	section `BATCH OUTPUT` below for details.
+	on stdin. May not be combined with any other options or arguments
+	except --textconv, --filters, or --use-mailmap.
+	+
+	* When used with `--textconv` or `--filters`, the input lines
+	  must specify the path, separated by whitespace. See the section
+	  `BATCH OUTPUT` below for details.
+	+
+	* When used with `--use-mailmap`, for commit and tag objects, the
+	  contents part of the output shows the identities replaced using the
+	  mailmap mechanism, while the information part of the output shows
+	  the size of the object as if it actually recorded the replacement
+	  identities.
 
 --batch-check::
 --batch-check=<format>::
-	Print object information for each object provided on stdin.  May
-	not be combined with any other options or arguments except
-	`--textconv` or `--filters`, in which case the input lines also
-	need to specify the path, separated by whitespace.  See the
-	section `BATCH OUTPUT` below for details.
+	Print object information for each object provided on stdin. May not be
+	combined with any other options or arguments except --textconv, --filters
+	or --use-mailmap.
+	+
+	* When used with `--textconv` or `--filters`, the input lines must
+	 specify the path, separated by whitespace. See the section
+	 `BATCH OUTPUT` below for details.
+	+
+	* When used with `--use-mailmap`, for commit and tag objects, the
+	  printed object information shows the size of the object as if the
+	  identities recorded in it were replaced by the mailmap mechanism.
 
 --batch-command::
 --batch-command=<format>::
 	Enter a command mode that reads commands and arguments from stdin. May
-	only be combined with `--buffer`, `--textconv` or `--filters`. In the
-	case of `--textconv` or `--filters`, the input lines also need to specify
-	the path, separated by whitespace. See the section `BATCH OUTPUT` below
-	for details.
+	only be combined with `--buffer`, `--textconv`, `--use-mailmap` or
+	`--filters`.
+	+
+	* When used with `--textconv` or `--filters`, the input lines must
+	  specify the path, separated by whitespace. See the section
+	  `BATCH OUTPUT` below for details.
+	+
+	* When used with `--use-mailmap`, for commit and tag objects, the
+	  `contents` command shows the identities replaced using the
+	  mailmap mechanism, while the `info` command shows the size
+	  of the object as if it actually recorded the replacement
+	  identities.
+
 +
 `--batch-command` recognizes the following commands:
 +
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 8a6e2343ec..39f2a2483f 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -444,6 +444,9 @@ static void batch_object_write(const char *obj_name,
 	if (!data->skip_object_info) {
 		int ret;
 
+		if (use_mailmap)
+			data->info.typep = &data->type;
+
 		if (pack)
 			ret = packed_object_info(the_repository, pack, offset,
 						 &data->info);
@@ -457,6 +460,17 @@ static void batch_object_write(const char *obj_name,
 			fflush(stdout);
 			return;
 		}
+
+		if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
+			size_t s = data->size;
+			char *buf = NULL;
+
+			buf = read_object_file(&data->oid, &data->type, &data->size);
+			buf = replace_idents_using_mailmap(buf, &s);
+			data->size = cast_size_t_to_ulong(s);
+
+			free(buf);
+		}
 	}
 
 	strbuf_reset(scratch);
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index 87b77fc5c9..21ba6bc278 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -1047,4 +1047,36 @@ test_expect_success 'git cat-file -s returns correct size with --use-mailmap for
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file --batch-check returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	commit_size=`git cat-file commit HEAD | wc -c` &&
+	commit_sha=`git log --pretty=format:'%H' -n 1` &&
+	echo "$commit_sha commit $commit_size" >expect &&
+	commit_size=`git cat-file --use-mailmap commit HEAD | wc -c` &&
+	echo "$commit_sha commit $commit_size" >>expect &&
+	echo "HEAD" >in &&
+	git cat-file --batch-check <in >actual &&
+	git cat-file --use-mailmap --batch-check <in >>actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git cat-file --batch-command returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	commit_size=`git cat-file commit HEAD | wc -c` &&
+	commit_sha=`git log --pretty=format:'%H' -n 1` &&
+	echo "$commit_sha commit $commit_size" >expect &&
+	commit_size=`git cat-file --use-mailmap commit HEAD | wc -c` &&
+	echo "$commit_sha commit $commit_size" >>expect &&
+	echo "info HEAD" >in &&
+	git cat-file --batch-command <in >actual &&
+	git cat-file --use-mailmap --batch-command <in >>actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.38.1.438.gd2340c8913


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 1/2] cat-file: add mailmap support to -s option
  2022-11-20  7:48   ` [PATCH v5 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
@ 2022-11-21  7:27     ` Junio C Hamano
  2022-11-21  9:40       ` Christian Couder
  0 siblings, 1 reply; 44+ messages in thread
From: Junio C Hamano @ 2022-11-21  7:27 UTC (permalink / raw)
  To: Siddharth Asthana
  Cc: git, christian.couder, johncai86, Johannes.Schindelin, avarab, me

Siddharth Asthana <siddharthasthana31@gmail.com> writes:

> +test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
> +	test_when_finished "rm .mailmap" &&
> +	cat >.mailmap <<-\EOF &&
> +	C O Mitter <committer@example.com> Orig <orig@example.com>
> +	EOF
> +	git cat-file commit HEAD | wc -c >expect &&
> +	git cat-file --use-mailmap commit HEAD | wc -c >>expect &&
> +	git cat-file -s HEAD >actual &&
> +	git cat-file --use-mailmap -s HEAD >>actual &&

Doesn't this break under macOS where wc output tends to be padded
with SP on the right?  We used to often see test breakage when a
carelessly written test like

	test "$(wc -l <outout)" = 2

which expects the output file to have exactly two files (the
solution in this sample case is to lose the double quotes around the
command substitution).

Besides, having "cat-file" on the upstream side of a pipe is a bad
practice.


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 2/2] cat-file: add mailmap support to --batch-check option
  2022-11-20  7:48   ` [PATCH v5 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
@ 2022-11-21  7:38     ` Junio C Hamano
  2022-11-30  9:19       ` Junio C Hamano
  0 siblings, 1 reply; 44+ messages in thread
From: Junio C Hamano @ 2022-11-21  7:38 UTC (permalink / raw)
  To: Siddharth Asthana
  Cc: git, christian.couder, johncai86, Johannes.Schindelin, avarab, me

Siddharth Asthana <siddharthasthana31@gmail.com> writes:

> diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
> index 87b77fc5c9..21ba6bc278 100755
> --- a/t/t4203-mailmap.sh
> +++ b/t/t4203-mailmap.sh
> @@ -1047,4 +1047,36 @@ test_expect_success 'git cat-file -s returns correct size with --use-mailmap for
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'git cat-file --batch-check returns correct size with --use-mailmap' '
> +	test_when_finished "rm .mailmap" &&
> +	cat >.mailmap <<-\EOF &&
> +	C O Mitter <committer@example.com> Orig <orig@example.com>
> +	EOF
> +	commit_size=`git cat-file commit HEAD | wc -c` &&

We prefer $(command substitution) over `command substitution`.  

When "cat-file" segfaults and dumps core, having it upstream of a
pipe would mean its crashing will be hidden.

Note that some implementations of "wc" pads its output with SP.  The
implication will be seen in a few paragraphs below.

> +	commit_sha=`git log --pretty=format:'%H' -n 1` &&

These single quotes are not doing what you think they are doing.
The body of the test is inside a pair of single quotes, and the one
after format: makes you step outside the single quote, take two
bytes %H literally, and the other single quote opens a new singly
quoted string segment.  Which is not wrong per-se, because there is
no special meaning attached to the sequence %H in the shell language,
but then you'd be better off writing format:%H without any quotes,
as that is more direct way to write what you are writing.

Also, --pretty=format:<something> is almost always a mistake.
Unless you have a good reason to use it, you'd most likely want to
use --format=<something> instead.

In any case, don't abuse "log" when you mean

    commit_object_name=$(git rev-parse HEAD) &&

> +	echo "$commit_sha commit $commit_size" >expect &&

As $commit_size here may have extra and unwanted SP before it, this
may break with the implementation of "wc" on certain platforms.  In
this particular instance, losing quoting, i.e.

	echo $commit_sha commit $commit_size >expect

may be a good workaround.

> +	commit_size=`git cat-file --use-mailmap commit HEAD | wc -c` &&

Exactly the same set of comments as above apply to this side, too.

> +	echo "$commit_sha commit $commit_size" >>expect &&
> +	echo "HEAD" >in &&
> +	git cat-file --batch-check <in >actual &&
> +	git cat-file --use-mailmap --batch-check <in >>actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'git cat-file --batch-command returns correct size with --use-mailmap' '
> +	test_when_finished "rm .mailmap" &&
> +	cat >.mailmap <<-\EOF &&
> +	C O Mitter <committer@example.com> Orig <orig@example.com>
> +	EOF
> +	commit_size=`git cat-file commit HEAD | wc -c` &&
> +	commit_sha=`git log --pretty=format:'%H' -n 1` &&
> +	echo "$commit_sha commit $commit_size" >expect &&
> +	commit_size=`git cat-file --use-mailmap commit HEAD | wc -c` &&
> +	echo "$commit_sha commit $commit_size" >>expect &&
> +	echo "info HEAD" >in &&
> +	git cat-file --batch-command <in >actual &&
> +	git cat-file --use-mailmap --batch-command <in >>actual &&
> +	test_cmp expect actual
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 1/2] cat-file: add mailmap support to -s option
  2022-11-21  7:27     ` Junio C Hamano
@ 2022-11-21  9:40       ` Christian Couder
  2022-11-21  9:45         ` Junio C Hamano
  2022-11-21 11:27         ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 44+ messages in thread
From: Christian Couder @ 2022-11-21  9:40 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Siddharth Asthana, git, johncai86, Johannes.Schindelin, avarab, me

On Mon, Nov 21, 2022 at 8:27 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Siddharth Asthana <siddharthasthana31@gmail.com> writes:
>
> > +test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
> > +     test_when_finished "rm .mailmap" &&
> > +     cat >.mailmap <<-\EOF &&
> > +     C O Mitter <committer@example.com> Orig <orig@example.com>
> > +     EOF
> > +     git cat-file commit HEAD | wc -c >expect &&
> > +     git cat-file --use-mailmap commit HEAD | wc -c >>expect &&
> > +     git cat-file -s HEAD >actual &&
> > +     git cat-file --use-mailmap -s HEAD >>actual &&
>
> Doesn't this break under macOS where wc output tends to be padded
> with SP on the right?  We used to often see test breakage when a
> carelessly written test like
>
>         test "$(wc -l <outout)" = 2
>
> which expects the output file to have exactly two files (the
> solution in this sample case is to lose the double quotes around the
> command substitution).

I guess that's the reason why `wc -c | sed -e 's/^ *//'` is used in
the strlen() function in t1006-cat-file.sh. There are a number of
places in the tests where wc -c or wc -l are used without piping the
result into sed -e 's/^ *//' though. So it's not easy to understand
why it's sometimes needed.

> Besides, having "cat-file" on the upstream side of a pipe is a bad
> practice.

Yeah, right. So I would suggest the following:

     git cat-file commit HEAD >commit.out &&
     wc -c <commit.out | sed -e 's/^ *//' >expect &&
     git cat-file --use-mailmap commit HEAD >commit.out &&
     wc -c <commit.out | sed -e 's/^ *//' >>expect &&

Thanks!

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 1/2] cat-file: add mailmap support to -s option
  2022-11-21  9:40       ` Christian Couder
@ 2022-11-21  9:45         ` Junio C Hamano
  2022-11-21 11:27         ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 44+ messages in thread
From: Junio C Hamano @ 2022-11-21  9:45 UTC (permalink / raw)
  To: Christian Couder
  Cc: Siddharth Asthana, git, johncai86, Johannes.Schindelin, avarab, me

Christian Couder <christian.couder@gmail.com> writes:

> Yeah, right. So I would suggest the following:
>
>      git cat-file commit HEAD >commit.out &&
>      wc -c <commit.out | sed -e 's/^ *//' >expect &&

You should not use sed for things like that.

	echo $(wc -c <file)

should suffice to strip away unwanted $IFS around the output
(Note. lack of any quotihg around $() is the key).

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 1/2] cat-file: add mailmap support to -s option
  2022-11-21  9:40       ` Christian Couder
  2022-11-21  9:45         ` Junio C Hamano
@ 2022-11-21 11:27         ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 44+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-21 11:27 UTC (permalink / raw)
  To: Christian Couder
  Cc: Junio C Hamano, Siddharth Asthana, git, johncai86,
	Johannes.Schindelin, me


On Mon, Nov 21 2022, Christian Couder wrote:

> On Mon, Nov 21, 2022 at 8:27 AM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Siddharth Asthana <siddharthasthana31@gmail.com> writes:
>>
>> > +test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
>> > +     test_when_finished "rm .mailmap" &&
>> > +     cat >.mailmap <<-\EOF &&
>> > +     C O Mitter <committer@example.com> Orig <orig@example.com>
>> > +     EOF
>> > +     git cat-file commit HEAD | wc -c >expect &&
>> > +     git cat-file --use-mailmap commit HEAD | wc -c >>expect &&
>> > +     git cat-file -s HEAD >actual &&
>> > +     git cat-file --use-mailmap -s HEAD >>actual &&
>>
>> Doesn't this break under macOS where wc output tends to be padded
>> with SP on the right?  We used to often see test breakage when a
>> carelessly written test like
>>
>>         test "$(wc -l <outout)" = 2
>>
>> which expects the output file to have exactly two files (the
>> solution in this sample case is to lose the double quotes around the
>> command substitution).
>
> I guess that's the reason why `wc -c | sed -e 's/^ *//'` is used in
> the strlen() function in t1006-cat-file.sh. There are a number of
> places in the tests where wc -c or wc -l are used without piping the
> result into sed -e 's/^ *//' though. So it's not easy to understand
> why it's sometimes needed.

It's because in "t1006-cat-file.sh" we're assigning the "wc -c" to a
variable, because it's used to "test_cmp" the number of bytes in some
free-form text.

It would be nicer to split "test_line_count" into some utility function
that knew how to parse out "wc -l", "wc -c" etc. for a given input file,
and return that as a string.

In that case the "sed" isn't needed, and we're just (ab)using it to do
things we can do with whitespace managent + shell built-ins. E.g. this
works too (the "echo; echo; echo" showing that we're stripping out
whitespace "wc -c" might emit:
	
	diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
	index 23b8942edba..9ae4b534421 100755
	--- a/t/t1006-cat-file.sh
	+++ b/t/t1006-cat-file.sh
	@@ -106,7 +106,7 @@ echo_without_newline_nul () {
	 }
	 
	 strlen () {
	-    echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
	+    printf "%s" $(printf "%s" "$1" | (echo ; echo ; echo ; wc -c))
	 }
	 
	 maybe_remove_timestamp () {

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 2/2] cat-file: add mailmap support to --batch-check option
  2022-11-21  7:38     ` Junio C Hamano
@ 2022-11-30  9:19       ` Junio C Hamano
  0 siblings, 0 replies; 44+ messages in thread
From: Junio C Hamano @ 2022-11-30  9:19 UTC (permalink / raw)
  To: Siddharth Asthana
  Cc: git, christian.couder, johncai86, Johannes.Schindelin, avarab, me

Junio C Hamano <gitster@pobox.com> writes:

> Note that some implementations of "wc" pads its output with SP.  The
> implication will be seen in a few paragraphs below.
> ...
> In any case, don't abuse "log" when you mean
>
>     commit_object_name=$(git rev-parse HEAD) &&
>
>> +	echo "$commit_sha commit $commit_size" >expect &&
>
> As $commit_size here may have extra and unwanted SP before it, this
> may break with the implementation of "wc" on certain platforms.  In
> this particular instance, losing quoting, i.e.
>
> 	echo $commit_sha commit $commit_size >expect
>
> may be a good workaround.

And this indeed does break GitHub CI osx jobs ...

  https://github.com/git/git/actions/runs/3581605069/jobs/6024866010#step:4:1860

... in the way exactly I predicted to break in the message I am
responding to.


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v6 0/2] Add mailmap mechanism in cat-file options
  2022-09-16 20:59 [PATCH 0/3] Add mailmap mechanism in --batch-check options Siddharth Asthana
                   ` (6 preceding siblings ...)
  2022-11-20  7:48 ` [PATCH v5 0/2] " Siddharth Asthana
@ 2022-12-01 15:55 ` Siddharth Asthana
  2022-12-01 15:55   ` [PATCH v6 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
  2022-12-01 15:55   ` [PATCH v6 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
  2022-12-20  6:01 ` [PATCH v7 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
  8 siblings, 2 replies; 44+ messages in thread
From: Siddharth Asthana @ 2022-12-01 15:55 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, johncai86, Johannes.Schindelin,
	avarab, me, Siddharth Asthana

Thanks a ton Christian, Ævar and Junio for review :) I have made the
suggested changes in this patch series.

= Description

At present, `git-cat-file` command with `--batch-check` and `-s` options
does not complain when `--use-mailmap` option is given. The latter
option is just ignored. Instead, for commit/tag objects, the command
should compute the size of the object after replacing the idents and
report it. So, this patch series makes `-s` and `--batch-check` options
of `git-cat-file` honor mailmap when used with `--use-mailmap` option.

In this patch series we didn't want to change that '%(objectsize)'
always shows the size of the original object even when `--use-mailmap`
is set because first we have the long term plan to unify how the formats
for `git cat-file` and other commands works. And second existing formats
like the "pretty formats" used by `git log` have different options for
fields respecting mailmap or not respecting it (%an is for author name
while %aN for author name respecting mailmap).

I would like to thank my mentors, Christian Couder and John Cai, for all
of their help!
Looking forward to the reviews!

= Patch Organization

- The first patch makes `-s` option to return updated size of the
  <commit/tag> object, when combined with `--use-mailmap` option, after
  replacing the idents using the mailmap mechanism.
- The second patch makes `--batch-check` option to return updated size of
  the <commit/tag> object, when combined with `--use-mailmap` option,
  after replacing the idents using the mailmap mechanism.

= Changes in v5

- The patch series which improves the documentation for `-s`, `--batch`,
  `--batch-check` and `--batch-command` is again part of this patch
  series with patch 3/3 squashed into patches 1/3 and 2/3 as suggested
  by Junio, Taylor and Christian. The doc patch series was perviously
  sent independently for improving the documentation of git cat-file
  options:
  https://lore.kernel.org/git/20221029092513.73982-1-siddharthasthana31@gmail.com/

- Improve the tests according to run under SHA-256 mode.

= Changes in v6

- Improve the tests so it doesn't break under macOS.

Siddharth Asthana (2):
  cat-file: add mailmap support to -s option
  cat-file: add mailmap support to --batch-check option

 Documentation/git-cat-file.txt | 53 +++++++++++++++++++--------
 builtin/cat-file.c             | 27 ++++++++++++++
 t/t4203-mailmap.sh             | 65 ++++++++++++++++++++++++++++++++++
 3 files changed, 131 insertions(+), 14 deletions(-)

Range-diff against v5:
1:  0db3583535 ! 1:  15366d872e cat-file: add mailmap support to -s option
    @@ t/t4203-mailmap.sh: test_expect_success '--mailmap enables mailmap in cat-file f
     +	cat >.mailmap <<-\EOF &&
     +	C O Mitter <committer@example.com> Orig <orig@example.com>
     +	EOF
    -+	git cat-file commit HEAD | wc -c >expect &&
    -+	git cat-file --use-mailmap commit HEAD | wc -c >>expect &&
    ++	git cat-file commit HEAD >commit.out &&
    ++	echo $(wc -c <commit.out) >expect &&
    ++	git cat-file --use-mailmap commit HEAD >commit.out &&
    ++	echo $(wc -c <commit.out) >>expect &&
     +	git cat-file -s HEAD >actual &&
     +	git cat-file --use-mailmap -s HEAD >>actual &&
     +	test_cmp expect actual
    @@ t/t4203-mailmap.sh: test_expect_success '--mailmap enables mailmap in cat-file f
     +	Orig <orig@example.com> C O Mitter <committer@example.com>
     +	EOF
     +	git tag -a -m "annotated tag" v3 &&
    -+	git cat-file tag v3 | wc -c >expect &&
    -+	git cat-file --use-mailmap tag v3 | wc -c >>expect &&
    ++	git cat-file tag v3 >tag.out &&
    ++	echo $(wc -c <tag.out) >expect &&
    ++	git cat-file --use-mailmap tag v3 >tag.out &&
    ++	echo $(wc -c <tag.out) >>expect &&
     +	git cat-file -s v3 >actual &&
     +	git cat-file --use-mailmap -s v3 >>actual &&
     +	test_cmp expect actual
2:  b8205ede7d ! 2:  86692db720 cat-file: add mailmap support to --batch-check option
    @@ t/t4203-mailmap.sh: test_expect_success 'git cat-file -s returns correct size wi
     +	cat >.mailmap <<-\EOF &&
     +	C O Mitter <committer@example.com> Orig <orig@example.com>
     +	EOF
    -+	commit_size=`git cat-file commit HEAD | wc -c` &&
    -+	commit_sha=`git log --pretty=format:'%H' -n 1` &&
    -+	echo "$commit_sha commit $commit_size" >expect &&
    -+	commit_size=`git cat-file --use-mailmap commit HEAD | wc -c` &&
    -+	echo "$commit_sha commit $commit_size" >>expect &&
    ++	git cat-file commit HEAD >commit.out &&
    ++	commit_size=$(wc -c <commit.out) &&
    ++	commit_sha=$(git rev-parse HEAD) &&
    ++	echo $commit_sha commit $commit_size >expect &&
    ++	git cat-file --use-mailmap commit HEAD >commit.out &&
    ++	commit_size=$(wc -c <commit.out) &&
    ++	echo $commit_sha commit $commit_size >>expect &&
     +	echo "HEAD" >in &&
     +	git cat-file --batch-check <in >actual &&
     +	git cat-file --use-mailmap --batch-check <in >>actual &&
    @@ t/t4203-mailmap.sh: test_expect_success 'git cat-file -s returns correct size wi
     +	cat >.mailmap <<-\EOF &&
     +	C O Mitter <committer@example.com> Orig <orig@example.com>
     +	EOF
    -+	commit_size=`git cat-file commit HEAD | wc -c` &&
    -+	commit_sha=`git log --pretty=format:'%H' -n 1` &&
    -+	echo "$commit_sha commit $commit_size" >expect &&
    -+	commit_size=`git cat-file --use-mailmap commit HEAD | wc -c` &&
    -+	echo "$commit_sha commit $commit_size" >>expect &&
    ++	git cat-file commit HEAD >commit.out &&
    ++	commit_size=$(wc -c <commit.out) &&
    ++	commit_sha=$(git rev-parse HEAD) &&
    ++	echo $commit_sha commit $commit_size >expect &&
    ++	git cat-file --use-mailmap commit HEAD >commit.out &&
    ++	commit_size=$(wc -c <commit.out) &&
    ++	echo $commit_sha commit $commit_size >>expect &&
     +	echo "info HEAD" >in &&
     +	git cat-file --batch-command <in >actual &&
     +	git cat-file --use-mailmap --batch-command <in >>actual &&
-- 
2.39.0.rc1.6.g86692db720


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v6 1/2] cat-file: add mailmap support to -s option
  2022-12-01 15:55 ` [PATCH v6 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
@ 2022-12-01 15:55   ` Siddharth Asthana
  2022-12-01 15:55   ` [PATCH v6 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
  1 sibling, 0 replies; 44+ messages in thread
From: Siddharth Asthana @ 2022-12-01 15:55 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, johncai86, Johannes.Schindelin,
	avarab, me, Siddharth Asthana

Even though the cat-file command with `-s` option does not complain when
`--use-mailmap` option is given, the latter option is ignored. Compute
the size of the object after replacing the idents and report it instead.

In order to make `-s` option honour the mailmap mechanism we have to
read the contents of the commit/tag object. Make use of the call to
`oid_object_info_extended()` to get the contents of the object and store
in `buf`. `buf` is later freed in the function.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
---
 Documentation/git-cat-file.txt |  4 +++-
 builtin/cat-file.c             | 13 +++++++++++++
 t/t4203-mailmap.sh             | 29 +++++++++++++++++++++++++++++
 3 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index ec30b5c574..f82d702d6b 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -45,7 +45,9 @@ OPTIONS
 
 -s::
 	Instead of the content, show the object size identified by
-	`<object>`.
+	`<object>`. If used with `--use-mailmap` option, will show
+	the size of updated object after replacing idents using the
+	mailmap mechanism.
 
 -e::
 	Exit with zero status if `<object>` exists and is a valid
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index b3be58b1fb..dde8dbeacd 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -132,8 +132,21 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
 
 	case 's':
 		oi.sizep = &size;
+
+		if (use_mailmap) {
+			oi.typep = &type;
+			oi.contentp = (void**)&buf;
+		}
+
 		if (oid_object_info_extended(the_repository, &oid, &oi, flags) < 0)
 			die("git cat-file: could not get object info");
+
+		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG)) {
+			size_t s = size;
+			buf = replace_idents_using_mailmap(buf, &s);
+			size = cast_size_t_to_ulong(s);
+		}
+
 		printf("%"PRIuMAX"\n", (uintmax_t)size);
 		ret = 0;
 		goto cleanup;
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index cd1cab3e54..b8ec5e0959 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -1022,4 +1022,33 @@ test_expect_success '--mailmap enables mailmap in cat-file for annotated tag obj
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	git cat-file commit HEAD >commit.out &&
+	echo $(wc -c <commit.out) >expect &&
+	git cat-file --use-mailmap commit HEAD >commit.out &&
+	echo $(wc -c <commit.out) >>expect &&
+	git cat-file -s HEAD >actual &&
+	git cat-file --use-mailmap -s HEAD >>actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git cat-file -s returns correct size with --use-mailmap for tag objects' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	Orig <orig@example.com> C O Mitter <committer@example.com>
+	EOF
+	git tag -a -m "annotated tag" v3 &&
+	git cat-file tag v3 >tag.out &&
+	echo $(wc -c <tag.out) >expect &&
+	git cat-file --use-mailmap tag v3 >tag.out &&
+	echo $(wc -c <tag.out) >>expect &&
+	git cat-file -s v3 >actual &&
+	git cat-file --use-mailmap -s v3 >>actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.39.0.rc1.6.g86692db720


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v6 2/2] cat-file: add mailmap support to --batch-check option
  2022-12-01 15:55 ` [PATCH v6 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
  2022-12-01 15:55   ` [PATCH v6 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
@ 2022-12-01 15:55   ` Siddharth Asthana
  2022-12-14 11:27     ` Ævar Arnfjörð Bjarmason
  2022-12-14 14:04     ` Christian Couder
  1 sibling, 2 replies; 44+ messages in thread
From: Siddharth Asthana @ 2022-12-01 15:55 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, johncai86, Johannes.Schindelin,
	avarab, me, Siddharth Asthana

Even though the cat-file command with `--batch-check` option does not
complain when `--use-mailmap` option is given, the latter option is
ignored. Compute the size of the object after replacing the idents and
report it instead.

In order to make `--batch-check` option honour the mailmap mechanism we
have to read the contents of the commit/tag object.

There were two ways to do it:

1. Make two calls to `oid_object_info_extended()`. If `--use-mailmap`
   option is given, the first call will get us the type of the object
   and second call will only be made if the object type is either a
   commit or tag to get the contents of the object.

2. Make one call to `oid_object_info_extended()` to get the type of the
   object. Then, if the object type is either of commit or tag, make a
   call to `read_object_file()` to read the contents of the object.

I benchmarked the following command with both the above approaches and
compared against the current implementation where `--use-mailmap`
option is ignored:

`git cat-file --use-mailmap --batch-all-objects --batch-check --buffer
--unordered`

The results can be summarized as follows:
                       Time (mean ± σ)
default               827.7 ms ± 104.8 ms
first approach        6.197 s ± 0.093 s
second approach       1.975 s ± 0.217 s

Since, the second approach is faster than the first one, I implemented
it in this patch.

The command git cat-file can now use the mailmap mechanism to replace
idents with canonical versions for commit and tag objects. There are
several options like `--batch`, `--batch-check` and `--batch-command`
that can be combined with `--use-mailmap`. But the documentation for
`--batch`, `--batch-check` and `--batch-command` doesn't say so. This
patch fixes that documentation.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
---
 Documentation/git-cat-file.txt | 49 +++++++++++++++++++++++++---------
 builtin/cat-file.c             | 14 ++++++++++
 t/t4203-mailmap.sh             | 36 +++++++++++++++++++++++++
 3 files changed, 86 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index f82d702d6b..81235c60a3 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -91,26 +91,49 @@ OPTIONS
 --batch::
 --batch=<format>::
 	Print object information and contents for each object provided
-	on stdin.  May not be combined with any other options or arguments
-	except `--textconv` or `--filters`, in which case the input lines
-	also need to specify the path, separated by whitespace.  See the
-	section `BATCH OUTPUT` below for details.
+	on stdin. May not be combined with any other options or arguments
+	except --textconv, --filters, or --use-mailmap.
+	+
+	* When used with `--textconv` or `--filters`, the input lines
+	  must specify the path, separated by whitespace. See the section
+	  `BATCH OUTPUT` below for details.
+	+
+	* When used with `--use-mailmap`, for commit and tag objects, the
+	  contents part of the output shows the identities replaced using the
+	  mailmap mechanism, while the information part of the output shows
+	  the size of the object as if it actually recorded the replacement
+	  identities.
 
 --batch-check::
 --batch-check=<format>::
-	Print object information for each object provided on stdin.  May
-	not be combined with any other options or arguments except
-	`--textconv` or `--filters`, in which case the input lines also
-	need to specify the path, separated by whitespace.  See the
-	section `BATCH OUTPUT` below for details.
+	Print object information for each object provided on stdin. May not be
+	combined with any other options or arguments except --textconv, --filters
+	or --use-mailmap.
+	+
+	* When used with `--textconv` or `--filters`, the input lines must
+	 specify the path, separated by whitespace. See the section
+	 `BATCH OUTPUT` below for details.
+	+
+	* When used with `--use-mailmap`, for commit and tag objects, the
+	  printed object information shows the size of the object as if the
+	  identities recorded in it were replaced by the mailmap mechanism.
 
 --batch-command::
 --batch-command=<format>::
 	Enter a command mode that reads commands and arguments from stdin. May
-	only be combined with `--buffer`, `--textconv` or `--filters`. In the
-	case of `--textconv` or `--filters`, the input lines also need to specify
-	the path, separated by whitespace. See the section `BATCH OUTPUT` below
-	for details.
+	only be combined with `--buffer`, `--textconv`, `--use-mailmap` or
+	`--filters`.
+	+
+	* When used with `--textconv` or `--filters`, the input lines must
+	  specify the path, separated by whitespace. See the section
+	  `BATCH OUTPUT` below for details.
+	+
+	* When used with `--use-mailmap`, for commit and tag objects, the
+	  `contents` command shows the identities replaced using the
+	  mailmap mechanism, while the `info` command shows the size
+	  of the object as if it actually recorded the replacement
+	  identities.
+
 +
 `--batch-command` recognizes the following commands:
 +
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index dde8dbeacd..7811016ae8 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -444,6 +444,9 @@ static void batch_object_write(const char *obj_name,
 	if (!data->skip_object_info) {
 		int ret;
 
+		if (use_mailmap)
+			data->info.typep = &data->type;
+
 		if (pack)
 			ret = packed_object_info(the_repository, pack, offset,
 						 &data->info);
@@ -457,6 +460,17 @@ static void batch_object_write(const char *obj_name,
 			fflush(stdout);
 			return;
 		}
+
+		if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
+			size_t s = data->size;
+			char *buf = NULL;
+
+			buf = read_object_file(&data->oid, &data->type, &data->size);
+			buf = replace_idents_using_mailmap(buf, &s);
+			data->size = cast_size_t_to_ulong(s);
+
+			free(buf);
+		}
 	}
 
 	strbuf_reset(scratch);
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index b8ec5e0959..fa7f987284 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -1051,4 +1051,40 @@ test_expect_success 'git cat-file -s returns correct size with --use-mailmap for
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file --batch-check returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	git cat-file commit HEAD >commit.out &&
+	commit_size=$(wc -c <commit.out) &&
+	commit_sha=$(git rev-parse HEAD) &&
+	echo $commit_sha commit $commit_size >expect &&
+	git cat-file --use-mailmap commit HEAD >commit.out &&
+	commit_size=$(wc -c <commit.out) &&
+	echo $commit_sha commit $commit_size >>expect &&
+	echo "HEAD" >in &&
+	git cat-file --batch-check <in >actual &&
+	git cat-file --use-mailmap --batch-check <in >>actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git cat-file --batch-command returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	git cat-file commit HEAD >commit.out &&
+	commit_size=$(wc -c <commit.out) &&
+	commit_sha=$(git rev-parse HEAD) &&
+	echo $commit_sha commit $commit_size >expect &&
+	git cat-file --use-mailmap commit HEAD >commit.out &&
+	commit_size=$(wc -c <commit.out) &&
+	echo $commit_sha commit $commit_size >>expect &&
+	echo "info HEAD" >in &&
+	git cat-file --batch-command <in >actual &&
+	git cat-file --use-mailmap --batch-command <in >>actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.39.0.rc1.6.g86692db720


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v6 2/2] cat-file: add mailmap support to --batch-check option
  2022-12-01 15:55   ` [PATCH v6 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
@ 2022-12-14 11:27     ` Ævar Arnfjörð Bjarmason
  2022-12-14 14:04     ` Christian Couder
  1 sibling, 0 replies; 44+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-14 11:27 UTC (permalink / raw)
  To: Siddharth Asthana
  Cc: git, christian.couder, gitster, johncai86, Johannes.Schindelin, me


On Thu, Dec 01 2022, Siddharth Asthana wrote:

> 2. Make one call to `oid_object_info_extended()` to get the type of the
>    object. Then, if the object type is either of commit or tag, make a
>    call to `read_object_file()` to read the contents of the object.
> [...]
> +			buf = read_object_file(&data->oid, &data->type, &data->size);

Could you please change this to:

	buf = repo_read_object_file(the_repository, &data->oid, &data->type,
				    &data->size);

I.e. just use the repo_*() variant. We have been trying for a long time
to migrate away from these, and new code should use the non-macro
variants.

I noticed this because I have a local topic to do that migration, which
has a semantic conflict with this.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v6 2/2] cat-file: add mailmap support to --batch-check option
  2022-12-01 15:55   ` [PATCH v6 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
  2022-12-14 11:27     ` Ævar Arnfjörð Bjarmason
@ 2022-12-14 14:04     ` Christian Couder
  1 sibling, 0 replies; 44+ messages in thread
From: Christian Couder @ 2022-12-14 14:04 UTC (permalink / raw)
  To: Siddharth Asthana
  Cc: git, gitster, johncai86, Johannes.Schindelin, avarab, me

On Thu, Dec 1, 2022 at 4:55 PM Siddharth Asthana
<siddharthasthana31@gmail.com> wrote:

> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index f82d702d6b..81235c60a3 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -91,26 +91,49 @@ OPTIONS
>  --batch::
>  --batch=<format>::
>         Print object information and contents for each object provided
> -       on stdin.  May not be combined with any other options or arguments
> -       except `--textconv` or `--filters`, in which case the input lines

Nit: here there were backticks around --textconv and --filters ...

> -       also need to specify the path, separated by whitespace.  See the
> -       section `BATCH OUTPUT` below for details.
> +       on stdin. May not be combined with any other options or arguments
> +       except --textconv, --filters, or --use-mailmap.

... but here there are no backticks anymore.

It would be better if backticks were used.

> +       +
> +       * When used with `--textconv` or `--filters`, the input lines

Here and below backticks are used which is good.

> +         must specify the path, separated by whitespace. See the section
> +         `BATCH OUTPUT` below for details.
> +       +
> +       * When used with `--use-mailmap`, for commit and tag objects, the
> +         contents part of the output shows the identities replaced using the
> +         mailmap mechanism, while the information part of the output shows
> +         the size of the object as if it actually recorded the replacement
> +         identities.
>
>  --batch-check::
>  --batch-check=<format>::
> -       Print object information for each object provided on stdin.  May
> -       not be combined with any other options or arguments except
> -       `--textconv` or `--filters`, in which case the input lines also
> -       need to specify the path, separated by whitespace.  See the
> -       section `BATCH OUTPUT` below for details.
> +       Print object information for each object provided on stdin. May not be
> +       combined with any other options or arguments except --textconv, --filters
> +       or --use-mailmap.

Here backticks are also missing.

> +       +
> +       * When used with `--textconv` or `--filters`, the input lines must
> +        specify the path, separated by whitespace. See the section
> +        `BATCH OUTPUT` below for details.
> +       +
> +       * When used with `--use-mailmap`, for commit and tag objects, the
> +         printed object information shows the size of the object as if the
> +         identities recorded in it were replaced by the mailmap mechanism.
>
>  --batch-command::
>  --batch-command=<format>::
>         Enter a command mode that reads commands and arguments from stdin. May
> -       only be combined with `--buffer`, `--textconv` or `--filters`. In the
> -       case of `--textconv` or `--filters`, the input lines also need to specify
> -       the path, separated by whitespace. See the section `BATCH OUTPUT` below
> -       for details.
> +       only be combined with `--buffer`, `--textconv`, `--use-mailmap` or
> +       `--filters`.

Here they are used which is good.

> +       +
> +       * When used with `--textconv` or `--filters`, the input lines must
> +         specify the path, separated by whitespace. See the section
> +         `BATCH OUTPUT` below for details.
> +       +
> +       * When used with `--use-mailmap`, for commit and tag objects, the
> +         `contents` command shows the identities replaced using the
> +         mailmap mechanism, while the `info` command shows the size
> +         of the object as if it actually recorded the replacement
> +         identities.

Thanks!

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v7 0/2] Add mailmap mechanism in cat-file options
  2022-09-16 20:59 [PATCH 0/3] Add mailmap mechanism in --batch-check options Siddharth Asthana
                   ` (7 preceding siblings ...)
  2022-12-01 15:55 ` [PATCH v6 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
@ 2022-12-20  6:01 ` Siddharth Asthana
  2022-12-20  6:01   ` [PATCH v7 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
                     ` (2 more replies)
  8 siblings, 3 replies; 44+ messages in thread
From: Siddharth Asthana @ 2022-12-20  6:01 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, johncai86, Johannes.Schindelin,
	avarab, me, Siddharth Asthana

Thanks a ton Christian and Ævar for the review :) I have made the
suggested changes in v7.

= Description

At present, `git-cat-file` command with `--batch-check` and `-s` options
does not complain when `--use-mailmap` option is given. The latter
option is just ignored. Instead, for commit/tag objects, the command
should compute the size of the object after replacing the idents and
report it. So, this patch series makes `-s` and `--batch-check` options
of `git-cat-file` honor mailmap when used with `--use-mailmap` option.

In this patch series we didn't want to change that '%(objectsize)'
always shows the size of the original object even when `--use-mailmap`
is set because first we have the long term plan to unify how the formats
for `git cat-file` and other commands works. And second existing formats
like the "pretty formats" used by `git log` have different options for
fields respecting mailmap or not respecting it (%an is for author name
while %aN for author name respecting mailmap).

I would like to thank my mentors, Christian Couder and John Cai, for all
of their help!
Looking forward to the reviews!

= Patch Organization

- The first patch makes `-s` option to return updated size of the
  <commit/tag> object, when combined with `--use-mailmap` option, after
  replacing the idents using the mailmap mechanism.
- The second patch makes `--batch-check` option to return updated size of
  the <commit/tag> object, when combined with `--use-mailmap` option,
  after replacing the idents using the mailmap mechanism.

= Changes in v5

- The patch series which improves the documentation for `-s`, `--batch`,
  `--batch-check` and `--batch-command` is again part of this patch
  series with patch 3/3 squashed into patches 1/3 and 2/3 as suggested
  by Junio, Taylor and Christian. The doc patch series was perviously
  sent independently for improving the documentation of git cat-file
  options:
  https://lore.kernel.org/git/20221029092513.73982-1-siddharthasthana31@gmail.com/

- Improve the tests according to run under SHA-256 mode.

= Changes in v6

- Improve the tests so it doesn't break under macOS.

= Changes in v7

- Used `repo_read_object_file` instead of `read_object_file` because we
  have been trying to migrate away from these, and new code should use
  the non-macro variants.
- Improve the documentation of patch series.

Siddharth Asthana (2):
  cat-file: add mailmap support to -s option
  cat-file: add mailmap support to --batch-check option

 Documentation/git-cat-file.txt | 53 +++++++++++++++++++--------
 builtin/cat-file.c             | 28 +++++++++++++++
 t/t4203-mailmap.sh             | 65 ++++++++++++++++++++++++++++++++++
 3 files changed, 132 insertions(+), 14 deletions(-)

Range-diff against v6:
1:  7f20f45183 = 1:  2097544b2d cat-file: add mailmap support to -s option
2:  971f38e064 ! 2:  8305148718 cat-file: add mailmap support to --batch-check option
    @@ Commit message
     
         2. Make one call to `oid_object_info_extended()` to get the type of the
            object. Then, if the object type is either of commit or tag, make a
    -       call to `read_object_file()` to read the contents of the object.
    +       call to `repo_read_object_file()` to read the contents of the object.
     
         I benchmarked the following command with both the above approaches and
         compared against the current implementation where `--use-mailmap`
    @@ Documentation/git-cat-file.txt: OPTIONS
     -	also need to specify the path, separated by whitespace.  See the
     -	section `BATCH OUTPUT` below for details.
     +	on stdin. May not be combined with any other options or arguments
    -+	except --textconv, --filters, or --use-mailmap.
    ++	except `--textconv`, `--filters`, or `--use-mailmap`.
     +	+
     +	* When used with `--textconv` or `--filters`, the input lines
     +	  must specify the path, separated by whitespace. See the section
    @@ Documentation/git-cat-file.txt: OPTIONS
     -	need to specify the path, separated by whitespace.  See the
     -	section `BATCH OUTPUT` below for details.
     +	Print object information for each object provided on stdin. May not be
    -+	combined with any other options or arguments except --textconv, --filters
    -+	or --use-mailmap.
    ++	combined with any other options or arguments except `--textconv`, `--filters`
    ++	or `--use-mailmap`.
     +	+
     +	* When used with `--textconv` or `--filters`, the input lines must
     +	 specify the path, separated by whitespace. See the section
    @@ builtin/cat-file.c: static void batch_object_write(const char *obj_name,
     +			size_t s = data->size;
     +			char *buf = NULL;
     +
    -+			buf = read_object_file(&data->oid, &data->type, &data->size);
    ++			buf = repo_read_object_file(the_repository, &data->oid, &data->type,
    ++						    &data->size);
     +			buf = replace_idents_using_mailmap(buf, &s);
     +			data->size = cast_size_t_to_ulong(s);
     +
-- 
2.39.0.97.g8305148718


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v7 1/2] cat-file: add mailmap support to -s option
  2022-12-20  6:01 ` [PATCH v7 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
@ 2022-12-20  6:01   ` Siddharth Asthana
  2022-12-20  6:01   ` [PATCH v7 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
  2022-12-20 13:02   ` [PATCH v7 0/2] Add mailmap mechanism in cat-file options Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 44+ messages in thread
From: Siddharth Asthana @ 2022-12-20  6:01 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, johncai86, Johannes.Schindelin,
	avarab, me, Siddharth Asthana

Even though the cat-file command with `-s` option does not complain when
`--use-mailmap` option is given, the latter option is ignored. Compute
the size of the object after replacing the idents and report it instead.

In order to make `-s` option honour the mailmap mechanism we have to
read the contents of the commit/tag object. Make use of the call to
`oid_object_info_extended()` to get the contents of the object and store
in `buf`. `buf` is later freed in the function.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
---
 Documentation/git-cat-file.txt |  4 +++-
 builtin/cat-file.c             | 13 +++++++++++++
 t/t4203-mailmap.sh             | 29 +++++++++++++++++++++++++++++
 3 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index ec30b5c574..f82d702d6b 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -45,7 +45,9 @@ OPTIONS
 
 -s::
 	Instead of the content, show the object size identified by
-	`<object>`.
+	`<object>`. If used with `--use-mailmap` option, will show
+	the size of updated object after replacing idents using the
+	mailmap mechanism.
 
 -e::
 	Exit with zero status if `<object>` exists and is a valid
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index b3be58b1fb..dde8dbeacd 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -132,8 +132,21 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
 
 	case 's':
 		oi.sizep = &size;
+
+		if (use_mailmap) {
+			oi.typep = &type;
+			oi.contentp = (void**)&buf;
+		}
+
 		if (oid_object_info_extended(the_repository, &oid, &oi, flags) < 0)
 			die("git cat-file: could not get object info");
+
+		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG)) {
+			size_t s = size;
+			buf = replace_idents_using_mailmap(buf, &s);
+			size = cast_size_t_to_ulong(s);
+		}
+
 		printf("%"PRIuMAX"\n", (uintmax_t)size);
 		ret = 0;
 		goto cleanup;
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index cd1cab3e54..b8ec5e0959 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -1022,4 +1022,33 @@ test_expect_success '--mailmap enables mailmap in cat-file for annotated tag obj
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file -s returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	git cat-file commit HEAD >commit.out &&
+	echo $(wc -c <commit.out) >expect &&
+	git cat-file --use-mailmap commit HEAD >commit.out &&
+	echo $(wc -c <commit.out) >>expect &&
+	git cat-file -s HEAD >actual &&
+	git cat-file --use-mailmap -s HEAD >>actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git cat-file -s returns correct size with --use-mailmap for tag objects' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	Orig <orig@example.com> C O Mitter <committer@example.com>
+	EOF
+	git tag -a -m "annotated tag" v3 &&
+	git cat-file tag v3 >tag.out &&
+	echo $(wc -c <tag.out) >expect &&
+	git cat-file --use-mailmap tag v3 >tag.out &&
+	echo $(wc -c <tag.out) >>expect &&
+	git cat-file -s v3 >actual &&
+	git cat-file --use-mailmap -s v3 >>actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.39.0.97.g8305148718


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v7 2/2] cat-file: add mailmap support to --batch-check option
  2022-12-20  6:01 ` [PATCH v7 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
  2022-12-20  6:01   ` [PATCH v7 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
@ 2022-12-20  6:01   ` Siddharth Asthana
  2022-12-20 13:02   ` [PATCH v7 0/2] Add mailmap mechanism in cat-file options Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 44+ messages in thread
From: Siddharth Asthana @ 2022-12-20  6:01 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, johncai86, Johannes.Schindelin,
	avarab, me, Siddharth Asthana

Even though the cat-file command with `--batch-check` option does not
complain when `--use-mailmap` option is given, the latter option is
ignored. Compute the size of the object after replacing the idents and
report it instead.

In order to make `--batch-check` option honour the mailmap mechanism we
have to read the contents of the commit/tag object.

There were two ways to do it:

1. Make two calls to `oid_object_info_extended()`. If `--use-mailmap`
   option is given, the first call will get us the type of the object
   and second call will only be made if the object type is either a
   commit or tag to get the contents of the object.

2. Make one call to `oid_object_info_extended()` to get the type of the
   object. Then, if the object type is either of commit or tag, make a
   call to `repo_read_object_file()` to read the contents of the object.

I benchmarked the following command with both the above approaches and
compared against the current implementation where `--use-mailmap`
option is ignored:

`git cat-file --use-mailmap --batch-all-objects --batch-check --buffer
--unordered`

The results can be summarized as follows:
                       Time (mean ± σ)
default               827.7 ms ± 104.8 ms
first approach        6.197 s ± 0.093 s
second approach       1.975 s ± 0.217 s

Since, the second approach is faster than the first one, I implemented
it in this patch.

The command git cat-file can now use the mailmap mechanism to replace
idents with canonical versions for commit and tag objects. There are
several options like `--batch`, `--batch-check` and `--batch-command`
that can be combined with `--use-mailmap`. But the documentation for
`--batch`, `--batch-check` and `--batch-command` doesn't say so. This
patch fixes that documentation.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
---
 Documentation/git-cat-file.txt | 49 +++++++++++++++++++++++++---------
 builtin/cat-file.c             | 15 +++++++++++
 t/t4203-mailmap.sh             | 36 +++++++++++++++++++++++++
 3 files changed, 87 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index f82d702d6b..830f0a2eff 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -91,26 +91,49 @@ OPTIONS
 --batch::
 --batch=<format>::
 	Print object information and contents for each object provided
-	on stdin.  May not be combined with any other options or arguments
-	except `--textconv` or `--filters`, in which case the input lines
-	also need to specify the path, separated by whitespace.  See the
-	section `BATCH OUTPUT` below for details.
+	on stdin. May not be combined with any other options or arguments
+	except `--textconv`, `--filters`, or `--use-mailmap`.
+	+
+	* When used with `--textconv` or `--filters`, the input lines
+	  must specify the path, separated by whitespace. See the section
+	  `BATCH OUTPUT` below for details.
+	+
+	* When used with `--use-mailmap`, for commit and tag objects, the
+	  contents part of the output shows the identities replaced using the
+	  mailmap mechanism, while the information part of the output shows
+	  the size of the object as if it actually recorded the replacement
+	  identities.
 
 --batch-check::
 --batch-check=<format>::
-	Print object information for each object provided on stdin.  May
-	not be combined with any other options or arguments except
-	`--textconv` or `--filters`, in which case the input lines also
-	need to specify the path, separated by whitespace.  See the
-	section `BATCH OUTPUT` below for details.
+	Print object information for each object provided on stdin. May not be
+	combined with any other options or arguments except `--textconv`, `--filters`
+	or `--use-mailmap`.
+	+
+	* When used with `--textconv` or `--filters`, the input lines must
+	 specify the path, separated by whitespace. See the section
+	 `BATCH OUTPUT` below for details.
+	+
+	* When used with `--use-mailmap`, for commit and tag objects, the
+	  printed object information shows the size of the object as if the
+	  identities recorded in it were replaced by the mailmap mechanism.
 
 --batch-command::
 --batch-command=<format>::
 	Enter a command mode that reads commands and arguments from stdin. May
-	only be combined with `--buffer`, `--textconv` or `--filters`. In the
-	case of `--textconv` or `--filters`, the input lines also need to specify
-	the path, separated by whitespace. See the section `BATCH OUTPUT` below
-	for details.
+	only be combined with `--buffer`, `--textconv`, `--use-mailmap` or
+	`--filters`.
+	+
+	* When used with `--textconv` or `--filters`, the input lines must
+	  specify the path, separated by whitespace. See the section
+	  `BATCH OUTPUT` below for details.
+	+
+	* When used with `--use-mailmap`, for commit and tag objects, the
+	  `contents` command shows the identities replaced using the
+	  mailmap mechanism, while the `info` command shows the size
+	  of the object as if it actually recorded the replacement
+	  identities.
+
 +
 `--batch-command` recognizes the following commands:
 +
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index dde8dbeacd..cc17635e76 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -444,6 +444,9 @@ static void batch_object_write(const char *obj_name,
 	if (!data->skip_object_info) {
 		int ret;
 
+		if (use_mailmap)
+			data->info.typep = &data->type;
+
 		if (pack)
 			ret = packed_object_info(the_repository, pack, offset,
 						 &data->info);
@@ -457,6 +460,18 @@ static void batch_object_write(const char *obj_name,
 			fflush(stdout);
 			return;
 		}
+
+		if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
+			size_t s = data->size;
+			char *buf = NULL;
+
+			buf = repo_read_object_file(the_repository, &data->oid, &data->type,
+						    &data->size);
+			buf = replace_idents_using_mailmap(buf, &s);
+			data->size = cast_size_t_to_ulong(s);
+
+			free(buf);
+		}
 	}
 
 	strbuf_reset(scratch);
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index b8ec5e0959..fa7f987284 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -1051,4 +1051,40 @@ test_expect_success 'git cat-file -s returns correct size with --use-mailmap for
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file --batch-check returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	git cat-file commit HEAD >commit.out &&
+	commit_size=$(wc -c <commit.out) &&
+	commit_sha=$(git rev-parse HEAD) &&
+	echo $commit_sha commit $commit_size >expect &&
+	git cat-file --use-mailmap commit HEAD >commit.out &&
+	commit_size=$(wc -c <commit.out) &&
+	echo $commit_sha commit $commit_size >>expect &&
+	echo "HEAD" >in &&
+	git cat-file --batch-check <in >actual &&
+	git cat-file --use-mailmap --batch-check <in >>actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git cat-file --batch-command returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	git cat-file commit HEAD >commit.out &&
+	commit_size=$(wc -c <commit.out) &&
+	commit_sha=$(git rev-parse HEAD) &&
+	echo $commit_sha commit $commit_size >expect &&
+	git cat-file --use-mailmap commit HEAD >commit.out &&
+	commit_size=$(wc -c <commit.out) &&
+	echo $commit_sha commit $commit_size >>expect &&
+	echo "info HEAD" >in &&
+	git cat-file --batch-command <in >actual &&
+	git cat-file --use-mailmap --batch-command <in >>actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.39.0.97.g8305148718


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v7 0/2] Add mailmap mechanism in cat-file options
  2022-12-20  6:01 ` [PATCH v7 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
  2022-12-20  6:01   ` [PATCH v7 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
  2022-12-20  6:01   ` [PATCH v7 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
@ 2022-12-20 13:02   ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 44+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-20 13:02 UTC (permalink / raw)
  To: Siddharth Asthana
  Cc: git, christian.couder, gitster, johncai86, Johannes.Schindelin, me


On Tue, Dec 20 2022, Siddharth Asthana wrote:

>          2. Make one call to `oid_object_info_extended()` to get the type of the
>             object. Then, if the object type is either of commit or tag, make a
>     -       call to `read_object_file()` to read the contents of the object.
>     +       call to `repo_read_object_file()` to read the contents of the object.
>      
>          I benchmarked the following command with both the above approaches and
>          compared against the current implementation where `--use-mailmap`
>     @@ Documentation/git-cat-file.txt: OPTIONS
>      -	also need to specify the path, separated by whitespace.  See the
>      -	section `BATCH OUTPUT` below for details.
>      +	on stdin. May not be combined with any other options or arguments
>     -+	except --textconv, --filters, or --use-mailmap.
>     ++	except `--textconv`, `--filters`, or `--use-mailmap`.
>      +	+
>      +	* When used with `--textconv` or `--filters`, the input lines
>      +	  must specify the path, separated by whitespace. See the section
>     @@ Documentation/git-cat-file.txt: OPTIONS
>      -	need to specify the path, separated by whitespace.  See the
>      -	section `BATCH OUTPUT` below for details.
>      +	Print object information for each object provided on stdin. May not be
>     -+	combined with any other options or arguments except --textconv, --filters
>     -+	or --use-mailmap.
>     ++	combined with any other options or arguments except `--textconv`, `--filters`
>     ++	or `--use-mailmap`.
>      +	+
>      +	* When used with `--textconv` or `--filters`, the input lines must
>      +	 specify the path, separated by whitespace. See the section
>     @@ builtin/cat-file.c: static void batch_object_write(const char *obj_name,
>      +			size_t s = data->size;
>      +			char *buf = NULL;
>      +
>     -+			buf = read_object_file(&data->oid, &data->type, &data->size);
>     ++			buf = repo_read_object_file(the_repository, &data->oid, &data->type,
>     ++						    &data->size);
>      +			buf = replace_idents_using_mailmap(buf, &s);
>      +			data->size = cast_size_t_to_ulong(s);
>      +


This series looks good to me at this point, thanks in particular for the
repo_*() change (will make something I plan to submit soon easier).

These[1][2] are nits I came up with while reviewing this, not worth a
re-roll, but tweaked some things I found a bit odd, namely:

 * Let's not cast to void **, instead we can just declare a variable for
   the 's' case
 * If we don't need the "buf" return value we can skip the assignment
   (although I guess technically this breaks the encapsulation, so maybe
   we shouldn't skip it)
 * We can skip the NULL assignment in 2/2, and instead just assign to
   the variable as we declare it, and also do the replace/free on one
   line (to make it clear that we immediately don't care about it, and
   only want the size).

I don't think any of it's worth a re-roll, just notes to show you I've
looked at it carefully.

The only unresolved question I had while reading this is if we're sure
that a repo_read_object_file() following the a successful
oid_object_info_extended() is guaranteed to succeed? If not the code in
2/2 has a segfault we might trigger (as buf will be NULL), but maybe
we're guaranteed to always get the already-retrieved object from the
object cache.

1:  fe3cc3715b2 ! 1:  31701b3e55d cat-file: add mailmap support to -s option
    @@ Documentation/git-cat-file.txt: OPTIONS
     
      ## builtin/cat-file.c ##
     @@ builtin/cat-file.c: static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
    + 		break;
      
      	case 's':
    ++	{
    ++		void *obuf = NULL;
    ++
      		oi.sizep = &size;
     +
     +		if (use_mailmap) {
     +			oi.typep = &type;
    -+			oi.contentp = (void**)&buf;
    ++			oi.contentp = &obuf;
     +		}
     +
      		if (oid_object_info_extended(the_repository, &oid, &oi, flags) < 0)
    @@ builtin/cat-file.c: static int cat_one_file(int opt, const char *exp_type, const
     +
     +		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG)) {
     +			size_t s = size;
    -+			buf = replace_idents_using_mailmap(buf, &s);
    ++
    ++			replace_idents_using_mailmap(obuf, &s);
     +			size = cast_size_t_to_ulong(s);
     +		}
    ++		free(obuf);
     +
      		printf("%"PRIuMAX"\n", (uintmax_t)size);
      		ret = 0;
      		goto cleanup;
    +-
    ++	}
    + 	case 'e':
    + 		return !has_object_file(&oid);
    + 
     
      ## t/t4203-mailmap.sh ##
     @@ t/t4203-mailmap.sh: test_expect_success '--mailmap enables mailmap in cat-file for annotated tag obj
2:  d6c621820d2 ! 2:  14d95db69e9 cat-file: add mailmap support to --batch-check option
    @@ builtin/cat-file.c: static void batch_object_write(const char *obj_name,
     +
     +		if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
     +			size_t s = data->size;
    -+			char *buf = NULL;
    ++			char *buf = repo_read_object_file(the_repository,
    ++							  &data->oid,
    ++							  &data->type,
    ++							  &data->size);
     +
    -+			buf = repo_read_object_file(the_repository, &data->oid, &data->type,
    -+						    &data->size);
    -+			buf = replace_idents_using_mailmap(buf, &s);
    ++			free(replace_idents_using_mailmap(buf, &s));
     +			data->size = cast_size_t_to_ulong(s);
    -+
    -+			free(buf);
     +		}
      	}
      

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2022-12-20 13:14 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-16 20:59 [PATCH 0/3] Add mailmap mechanism in --batch-check options Siddharth Asthana
2022-09-16 20:59 ` [PATCH 1/3] doc/cat-file: allow --use-mailmap for --batch options Siddharth Asthana
2022-09-16 22:02   ` Junio C Hamano
2022-09-16 20:59 ` [PATCH 2/3] cat-file: add mailmap support to -s option Siddharth Asthana
2022-09-16 22:22   ` Junio C Hamano
2022-09-16 20:59 ` [PATCH 3/3] cat-file: add mailmap support to --batch-check option Siddharth Asthana
2022-09-16 22:35   ` Junio C Hamano
2022-09-26 10:53 ` [PATCH v2 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
2022-09-26 10:53   ` [PATCH v2 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
2022-09-26 13:16     ` Ævar Arnfjörð Bjarmason
2022-09-26 13:25     ` Ævar Arnfjörð Bjarmason
2022-09-26 10:53   ` [PATCH v2 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
2022-10-29 10:24 ` [PATCH v3 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
2022-10-29 10:24   ` [PATCH v3 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
2022-10-31 11:49     ` Christian Couder
2022-10-29 10:24   ` [PATCH v3 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
2022-10-31 11:43     ` Christian Couder
2022-10-29 18:00   ` [PATCH v3 0/2] Add mailmap mechanism in cat-file options Taylor Blau
2022-11-13 21:28 ` [PATCH v4 0/3] " Siddharth Asthana
2022-11-13 21:28   ` [PATCH v4 1/3] cat-file: add mailmap support to -s option Siddharth Asthana
2022-11-13 21:28   ` [PATCH v4 2/3] cat-file: add mailmap support to --batch-check option Siddharth Asthana
2022-11-15 21:40     ` Taylor Blau
2022-11-13 21:28   ` [PATCH v4 3/3] doc/cat-file: allow --use-mailmap for --batch options Siddharth Asthana
2022-11-14 17:48   ` [PATCH v4 0/3] Add mailmap mechanism in cat-file options Christian Couder
2022-11-14 22:30     ` Taylor Blau
2022-11-20  7:42     ` Siddharth Asthana
2022-11-20  7:48 ` [PATCH v5 0/2] " Siddharth Asthana
2022-11-20  7:48   ` [PATCH v5 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
2022-11-21  7:27     ` Junio C Hamano
2022-11-21  9:40       ` Christian Couder
2022-11-21  9:45         ` Junio C Hamano
2022-11-21 11:27         ` Ævar Arnfjörð Bjarmason
2022-11-20  7:48   ` [PATCH v5 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
2022-11-21  7:38     ` Junio C Hamano
2022-11-30  9:19       ` Junio C Hamano
2022-12-01 15:55 ` [PATCH v6 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
2022-12-01 15:55   ` [PATCH v6 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
2022-12-01 15:55   ` [PATCH v6 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
2022-12-14 11:27     ` Ævar Arnfjörð Bjarmason
2022-12-14 14:04     ` Christian Couder
2022-12-20  6:01 ` [PATCH v7 0/2] Add mailmap mechanism in cat-file options Siddharth Asthana
2022-12-20  6:01   ` [PATCH v7 1/2] cat-file: add mailmap support to -s option Siddharth Asthana
2022-12-20  6:01   ` [PATCH v7 2/2] cat-file: add mailmap support to --batch-check option Siddharth Asthana
2022-12-20 13:02   ` [PATCH v7 0/2] Add mailmap mechanism in cat-file options Ævar Arnfjörð Bjarmason

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).