All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matheus Tavares <matheus.bernardino@usp.br>
To: gitgitgadget@gmail.com
Cc: git@vger.kernel.org, philipoakley@iee.email, sobomax@gmail.com,
	sobomax@sippysoft.com
Subject: Re: [PATCH v2] Make ident dynamic, not just a hardcoded value of "$Id".
Date: Thu, 26 Aug 2021 17:37:13 -0300	[thread overview]
Message-ID: <20210826203713.45139-1-matheus.bernardino@usp.br> (raw)
In-Reply-To: <pull.1074.v2.git.git.1629952119446.gitgitgadget@gmail.com>

Hi, Maksym

I haven't read the entire patch (and I don't normally use the ident feature),
but I left a few comments below: 

On Thu, Aug 26, 2021 at 1:28 AM Maksym Sobolyev via GitGitGadget <gitgitgadget@gmail.com> wrote:
>
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index 83fd4e19a4..9e486f3e8d 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -382,6 +382,14 @@ sign `$` upon checkout.  Any byte sequence that begins with
>  `$Id:` and ends with `$` in the worktree file is replaced
>  with `$Id$` upon check-in.
>
> +The `ident` attribute can also provide an optional value,
> +which if supplied is going to be used for expansion instead of
> +the string `Id`.
> +
> +------------------------
> +*.[ch]         ident=FreeBSD
> +------------------------

What happens if there is a ':' or '$' in the custom id name?

$ echo 'f	ident=$weird:$' >.gitattributes
$ echo '$$weird:$$' >f
$ git add .
$ git commit -m files
$ rm f
$ git checkout f
$ cat f

$$weird:$: 907f73b343505bf289a4a25e41ea91d90250fedb $

Seem right. Now let's see the cleaning direction:

$ echo foo >> f
$ git add f
$ git cat-file -p :f

$$weird:$: 907f73b343505bf289a4a25e41ea91d90250fedb $
foo

Hmm, this one is not right. I think I know the cause:

> diff --git a/convert.c b/convert.c
> index 0d6fb3410ae..1e8940bf9d7 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -1122,17 +1127,18 @@ static int ident_to_git(const char *src, size_t len,
>                 len -= dollar + 1 - src;
>                 src  = dollar + 1;
>
> -               if (len > 3 && !memcmp(src, "Id:", 3)) {
> -                       dollar = memchr(src + 3, '$', len - 3);
> +               if (len > idact->id_len + 1 && !memcmp(src, idact->id, idact->id_len) && src[idact->id_len + 1] == ':') {

Shouldn't this be `... && src[idact->id_len] == ':')`? I.e. without the "+ 1".

> diff --git a/convert.h b/convert.h
> index 5ee1c322058..a02632e8104 100644
> --- a/convert.h
> +++ b/convert.h
> @@ -76,11 +76,16 @@ enum convert_crlf_action {
>
>  struct convert_driver;
>
> +struct ident_action {
> +       const char *id;
> +       int id_len;

I known the string size is limited to GIT_MAX_IDENT_LEN, but perhaps it
would be better to future-proof this and use `size_t` (or at least
`unsigned int`).

> diff --git a/parallel-checkout.c b/parallel-checkout.c
> index ddc0ff3c064..f9a3f2ff25b 100644
> --- a/parallel-checkout.c
> +++ b/parallel-checkout.c
> @@ -403,13 +403,15 @@ static void send_one_item(int fd, struct parallel_checkout_item *pc_item)
>         size_t name_len = pc_item->ce->ce_namelen;
>         size_t working_tree_encoding_len = working_tree_encoding ?
>                                            strlen(working_tree_encoding) : 0;
> +       const char *ident_action_id = pc_item->ca.ident_action.id;
> +       size_t ident_action_len = pc_item->ca.ident_action.id_len;
>
>         /*
>          * Any changes in the calculation of the message size must also be made
>          * in is_eligible_for_parallel_checkout().
>          */
>         len_data = sizeof(struct pc_item_fixed_portion) + name_len +
> -                  working_tree_encoding_len;
> +                  working_tree_encoding_len + ident_action_len;

This update in the packet size calculation should also be reflected in
is_eligible_for_parallel_checkout():

	packed_item_size = sizeof(struct pc_item_fixed_portion) + ce->ce_namelen +
		(ca->working_tree_encoding ? strlen(ca->working_tree_encoding) : 0);

> diff --git a/t/t2082-parallel-checkout-attributes.sh b/t/t2082-parallel-checkout-attributes.sh
> index 25254579618..822957a8dc8 100755
> --- a/t/t2082-parallel-checkout-attributes.sh
> +++ b/t/t2082-parallel-checkout-attributes.sh
> @@ -20,16 +20,19 @@ test_expect_success 'parallel-checkout with ident' '
>         (
>                 cd ident &&
>                 echo "A ident" >.gitattributes &&
> +               echo "C ident=MyCusomVeryLongAndWordyId" >>.gitattributes &&
>                 echo "\$Id\$" >A &&
>                 echo "\$Id\$" >B &&
> +               echo "\$MyCusomVeryLongAndWordyId\$" >C &&
>                 git add -A &&
>                 git commit -m id &&
>
> -               rm A B &&
> +               rm A B C &&
>                 test_checkout_workers 2 git reset --hard &&
>                 hexsz=$(test_oid hexsz) &&
>                 grep -E "\\\$Id: [0-9a-f]{$hexsz} \\\$" A &&
> -               grep "\\\$Id\\\$" B
> +               grep "\\\$Id\\\$" B &&
> +               grep -E "\\\$MyCusomVeryLongAndWordyId: [0-9a-f]{$hexsz} \\\$" C
>         )
>  '

The change to this test looks good. But there are other things related
to the ident filter that are not covered here (as the test is mostly
interested in how ident interacts with parallel checkout). For example,
the test doesn't check whether the "cleaning" direction is performed
correctly, it just checks the "smudging" part. (That's why it didn't catch the
error I showed earlier.)

You might want to take a look at the other ident tests in
t0021-conversion.sh and perhaps addapt/copy some of them to ensure that
the expected behavior persists when using a custom id name. For example,
quickly changing the "$Id:" references to "$customId:" and replacing
"ident" with "ident=customId", I seem to get at least one test failure:

--- expected-output     2021-08-26 19:40:07.181596662 +0000
+++ expanded-keywords   2021-08-26 19:40:07.188263463 +0000
@@ -6,5 +6,5 @@
 $customId: bebd07c752ffffd6779e1056db5de66c3bb733ed $
 $customId: bebd07c752ffffd6779e1056db5de66c3bb733ed $
 $customId: NoTerminatingSymbol
-$customId: Foreign Commit With Spaces $
+$customId: bebd07c752ffffd6779e1056db5de66c3bb733ed $
 $customId: NoTerminatingSymbolAtEOF
\ No newline at end of file
error: last command exited with $?=1
not ok 3 - expanded_in_repo

(But I haven't dig further, and this was just a quick test, so it could be my
fault.)

  reply	other threads:[~2021-08-26 20:37 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-23 16:41 [PATCH] Make ident dynamic, not just a hardcoded value of "$Id" Maksym Sobolyev via GitGitGadget
2021-08-23 18:10 ` Junio C Hamano
2021-08-23 18:41 ` Philip Oakley
2021-08-26  4:28 ` [PATCH v2] " Maksym Sobolyev via GitGitGadget
2021-08-26 20:37   ` Matheus Tavares [this message]
2021-09-02  0:58     ` Junio C Hamano
2021-09-02 19:04       ` Junio C Hamano
2021-08-27  2:59   ` Junio C Hamano
     [not found]     ` <CABFYoQC_FzbU_E4hU0kCz-WFJNOLspwL2Gjc01sMXDZosxJWjw@mail.gmail.com>
2021-09-01  5:35       ` Junio C Hamano
2021-09-01  2:13   ` [PATCH v3] " Maksym Sobolyev via GitGitGadget
2021-09-02  3:40     ` Đoàn Trần Công Danh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210826203713.45139-1-matheus.bernardino@usp.br \
    --to=matheus.bernardino@usp.br \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=philipoakley@iee.email \
    --cc=sobomax@gmail.com \
    --cc=sobomax@sippysoft.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.