All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] add usage-strings ci check and amend remaining usage strings
@ 2022-02-16 17:02 Abhradeep Chakraborty via GitGitGadget
  2022-02-21 14:51 ` Abhradeep Chakraborty
                   ` (2 more replies)
  0 siblings, 3 replies; 59+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-02-16 17:02 UTC (permalink / raw)
  To: git; +Cc: Abhradeep Chakraborty, Abhra303

From: Abhra303 <chakrabortyabhradeep79@gmail.com>

Usage strings for git (sub)command flags has a style guide that
suggests - first letter should not capitalized (unless requied)
and it should skip full-stop at the end of line. But there are
some files where usage-strings do not follow the above mentioned
guide. Moreover, there are no checks to verify if all usage strings
are following the guide/convention or not.

Amend the usage strings that don't follow the convention/guide and
add a `CI` check for checking the usage strings (whether the first
letter is capital or it ends with full-stop). If the `check` find
such strings then print those strings and return a non-zero status.

Also provide a script that takes an optional argument (a valid <tree>
string), to check the usage strings in the given <tree> (`HEAD` is
the default argument).

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
    add usage-strings ci check and amend remaining usage strings
    
    This patch series completely fixes #636.
    
    The issue is about amending the usage-strings (for command flags such as
    -h, -v etc.) which do not follow the style convention/guide. There was a
    PR [https://github.com/gitgitgadget/git/pull/920] addressing this issue
    but as Johannes [https://github.com/dscho] said in his comment
    [https://github.com/gitgitgadget/git/issues/636#issuecomment-1018660439],
    there are some files that still have those kind of usage strings.
    Johannes also suggested to add a CI check under ci/test-documentation.sh
    to check the usage strings.
    
    So, in this patch, all remaining usage strings are corrected. I also
    added a check-usage-strings target in Makefile which can be used to
    check usage strings. It uses check-usage-strings.sh.
    
     1. If check-usage-strings.sh is run on a valid git repo - it will check
        the validity of usage-strings in the tree specified by an argument
        or in the HEAD if no argument provided.
     2. If the current repo is not a git repo (i.e. if it doesn't find any
        .git folder), it will check the usage string in the current root
        directory.
    
    For the first case, output of make check-usage-strings or
    ./check-usage-strings.sh would be similar to -
    
    HEAD:builtin/bisect--helper.c:1212                        N_("use <cmd>... to automatically bisect."), BISECT_RUN),
    HEAD:builtin/submodule--helper.c:1877            OPT__QUIET(&quiet, "Suppress output for cloning a submodule"),
    
    
    If an argument provided - ./check-usage-strings.sh 'v2.34.0' , it will
    search for usage-strings in v2.34.0 and v2.34.0 will be prefixed before
    filenames instead of HEAD.
    
    In the second case, output would be similar to -
    
    diff.c:5596                            N_("select files by diff type."),
    diff.c:5599               N_("Output to a specific file"),
    builtin/branch.c:666            OPT_BIT('C', NULL, &copy, N_("copy a branch, even if target exists."), 2),
    make: *** [check-usage-strings] Error 1
    
    
    Note in the last case - arguments provided to it will be useless.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1147%2FAbhra303%2Fusage_command_amend-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1147/Abhra303/usage_command_amend-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1147

 Makefile                    |  5 +++++
 builtin/bisect--helper.c    |  2 +-
 builtin/reflog.c            |  6 +++---
 builtin/submodule--helper.c |  2 +-
 check-usage-strings.sh      | 33 +++++++++++++++++++++++++++++++++
 ci/test-documentation.sh    |  1 +
 diff.c                      |  2 +-
 t/helper/test-run-command.c |  6 +++---
 8 files changed, 48 insertions(+), 9 deletions(-)
 create mode 100755 check-usage-strings.sh

diff --git a/Makefile b/Makefile
index 186f9ab6190..93faed51da0 100644
--- a/Makefile
+++ b/Makefile
@@ -3416,6 +3416,11 @@ check-docs::
 check-builtins::
 	./check-builtins.sh
 
+### Make sure all the usage strings follow usage string style guide
+#
+check-usage-strings::
+	./check-usage-strings.sh
+
 ### Test suite coverage testing
 #
 .PHONY: coverage coverage-clean coverage-compile coverage-test coverage-report
diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
index 28a2e6a5750..614d95b022c 100644
--- a/builtin/bisect--helper.c
+++ b/builtin/bisect--helper.c
@@ -1209,7 +1209,7 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
 		OPT_CMDMODE(0, "bisect-visualize", &cmdmode,
 			 N_("visualize the bisection"), BISECT_VISUALIZE),
 		OPT_CMDMODE(0, "bisect-run", &cmdmode,
-			 N_("use <cmd>... to automatically bisect."), BISECT_RUN),
+			 N_("use <cmd>... to automatically bisect"), BISECT_RUN),
 		OPT_BOOL(0, "no-log", &nolog,
 			 N_("no log for BISECT_WRITE")),
 		OPT_END()
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 85b838720c3..28372c5e2b5 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -600,7 +600,7 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
 		OPT_BIT(0, "updateref", &flags,
 			N_("update the reference to the value of the top reflog entry"),
 			EXPIRE_REFLOGS_UPDATE_REF),
-		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen.")),
+		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen")),
 		OPT_CALLBACK_F(0, "expire", &cmd, N_("timestamp"),
 			       N_("prune entries older than the specified time"),
 			       PARSE_OPT_NONEG,
@@ -613,7 +613,7 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
 			 N_("prune any reflog entries that point to broken commits")),
 		OPT_BOOL(0, "all", &do_all, N_("process the reflogs of all references")),
 		OPT_BOOL(1, "single-worktree", &all_worktrees,
-			 N_("limits processing to reflogs from the current worktree only.")),
+			 N_("limits processing to reflogs from the current worktree only")),
 		OPT_END()
 	};
 
@@ -736,7 +736,7 @@ static int cmd_reflog_delete(int argc, const char **argv, const char *prefix)
 		OPT_BIT(0, "updateref", &flags,
 			N_("update the reference to the value of the top reflog entry"),
 			EXPIRE_REFLOGS_UPDATE_REF),
-		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen.")),
+		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen")),
 		OPT_END()
 	};
 
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index c5d3fc3817f..9864ec1427d 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1874,7 +1874,7 @@ static int module_clone(int argc, const char **argv, const char *prefix)
 		OPT_STRING(0, "depth", &clone_data.depth,
 			   N_("string"),
 			   N_("depth for shallow clones")),
-		OPT__QUIET(&quiet, "Suppress output for cloning a submodule"),
+		OPT__QUIET(&quiet, "suppress output for cloning a submodule"),
 		OPT_BOOL(0, "progress", &progress,
 			   N_("force cloning progress")),
 		OPT_BOOL(0, "require-init", &require_init,
diff --git a/check-usage-strings.sh b/check-usage-strings.sh
new file mode 100755
index 00000000000..a4028e0d00d
--- /dev/null
+++ b/check-usage-strings.sh
@@ -0,0 +1,33 @@
+{
+  if test -d ".git"
+  then
+    rev=${1:-"HEAD"}
+    for entry in $(git grep -l 'struct option .* = {$' "$rev" -- \*.c);
+    do
+      git show "$entry" |
+      sed -n '/struct option .* = {/,/OPT_END/{=;p;}' |
+      sed "N;s/^\\([0-9]*\\)\\n/$(echo "$entry" | sed 's/\//\\&/g'):\\1/";
+    done
+  else
+    for entry in $(grep -rl --include="*.c" 'struct option .* = {$' . );
+    do
+      cat "$entry" |
+      sed -n '/struct option .* = {/,/OPT_END/{=;p;}' |
+      sed "N;s/^\\([0-9]*\\)\\n/$(echo "$entry" | sed -e 's/\//\\&/g' -e 's/^\.\\\///'):\\1/";
+    done
+  fi
+} |
+grep -Pe '((?<!OPT_GROUP\(N_\(|OPT_GROUP\()"(?!GPG|DEPRECATED|SHA1|HEAD)[A-Z]|(?<!"|\.\.)\.")' |
+{
+  status=0
+  while read content;
+  do
+    if test -n "$content"
+    then
+      echo "$content";
+      status=1;
+    fi
+  done
+
+  exit $status
+}
diff --git a/ci/test-documentation.sh b/ci/test-documentation.sh
index de41888430a..f66848dfc66 100755
--- a/ci/test-documentation.sh
+++ b/ci/test-documentation.sh
@@ -15,6 +15,7 @@ filter_log () {
 }
 
 make check-builtins
+make check-usage-strings
 make check-docs
 
 # Build docs with AsciiDoc
diff --git a/diff.c b/diff.c
index c862771a589..000be3bf232 100644
--- a/diff.c
+++ b/diff.c
@@ -5596,7 +5596,7 @@ static void prep_parse_options(struct diff_options *options)
 			       N_("select files by diff type"),
 			       PARSE_OPT_NONEG, diff_opt_diff_filter),
 		{ OPTION_CALLBACK, 0, "output", options, N_("<file>"),
-		  N_("Output to a specific file"),
+		  N_("output to a specific file"),
 		  PARSE_OPT_NONEG, NULL, 0, diff_opt_output },
 
 		OPT_END()
diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
index 913775a14b7..8f370cd89f1 100644
--- a/t/helper/test-run-command.c
+++ b/t/helper/test-run-command.c
@@ -221,9 +221,9 @@ static int quote_stress_test(int argc, const char **argv)
 	struct strbuf out = STRBUF_INIT;
 	struct strvec args = STRVEC_INIT;
 	struct option options[] = {
-		OPT_INTEGER('n', "trials", &trials, "Number of trials"),
-		OPT_INTEGER('s', "skip", &skip, "Skip <n> trials"),
-		OPT_BOOL('m', "msys2", &msys2, "Test quoting for MSYS2's sh"),
+		OPT_INTEGER('n', "trials", &trials, "number of trials"),
+		OPT_INTEGER('s', "skip", &skip, "skip <n> trials"),
+		OPT_BOOL('m', "msys2", &msys2, "test quoting for MSYS2's sh"),
 		OPT_END()
 	};
 	const char * const usage[] = {

base-commit: b80121027d1247a0754b3cc46897fee75c050b44
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-16 17:02 [PATCH] add usage-strings ci check and amend remaining usage strings Abhradeep Chakraborty via GitGitGadget
@ 2022-02-21 14:51 ` Abhradeep Chakraborty
  2022-02-21 15:39 ` Ævar Arnfjörð Bjarmason
  2022-02-22 15:58 ` [PATCH v2] add usage-strings " Abhradeep Chakraborty via GitGitGadget
  2 siblings, 0 replies; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-02-21 14:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, git

Hello reviewers and community members,

This patch request is not reviewed yet (i.e. no response/suggestions came about this patch request; most probably because it was lost in the PR ocean).

So, could you please review my patch request?

thanks :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-16 17:02 [PATCH] add usage-strings ci check and amend remaining usage strings Abhradeep Chakraborty via GitGitGadget
  2022-02-21 14:51 ` Abhradeep Chakraborty
@ 2022-02-21 15:39 ` Ævar Arnfjörð Bjarmason
  2022-02-21 17:15   ` Junio C Hamano
                     ` (2 more replies)
  2022-02-22 15:58 ` [PATCH v2] add usage-strings " Abhradeep Chakraborty via GitGitGadget
  2 siblings, 3 replies; 59+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-21 15:39 UTC (permalink / raw)
  To: Abhradeep Chakraborty via GitGitGadget; +Cc: git, Abhra303


On Wed, Feb 16 2022, Abhradeep Chakraborty via GitGitGadget wrote:

> From: Abhra303 <chakrabortyabhradeep79@gmail.com>
>
> Usage strings for git (sub)command flags has a style guide that
> suggests - first letter should not capitalized (unless requied)
> and it should skip full-stop at the end of line. But there are
> some files where usage-strings do not follow the above mentioned
> guide. Moreover, there are no checks to verify if all usage strings
> are following the guide/convention or not.
>
> Amend the usage strings that don't follow the convention/guide and
> add a `CI` check for checking the usage strings (whether the first
> letter is capital or it ends with full-stop). If the `check` find
> such strings then print those strings and return a non-zero status.
>
> Also provide a script that takes an optional argument (a valid <tree>
> string), to check the usage strings in the given <tree> (`HEAD` is
> the default argument).
>
> Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
> ---
>     add usage-strings ci check and amend remaining usage strings
>     
>     This patch series completely fixes #636.
>     
>     The issue is about amending the usage-strings (for command flags such as
>     -h, -v etc.) which do not follow the style convention/guide. There was a
>     PR [https://github.com/gitgitgadget/git/pull/920] addressing this issue
>     but as Johannes [https://github.com/dscho] said in his comment
>     [https://github.com/gitgitgadget/git/issues/636#issuecomment-1018660439],
>     there are some files that still have those kind of usage strings.
>     Johannes also suggested to add a CI check under ci/test-documentation.sh
>     to check the usage strings.
>     
>     So, in this patch, all remaining usage strings are corrected. I also
>     added a check-usage-strings target in Makefile which can be used to
>     check usage strings. It uses check-usage-strings.sh.
>     
>      1. If check-usage-strings.sh is run on a valid git repo - it will check
>         the validity of usage-strings in the tree specified by an argument
>         or in the HEAD if no argument provided.
>      2. If the current repo is not a git repo (i.e. if it doesn't find any
>         .git folder), it will check the usage string in the current root
>         directory.
>     
>     For the first case, output of make check-usage-strings or
>     ./check-usage-strings.sh would be similar to -
>     
>     HEAD:builtin/bisect--helper.c:1212                        N_("use <cmd>... to automatically bisect."), BISECT_RUN),
>     HEAD:builtin/submodule--helper.c:1877            OPT__QUIET(&quiet, "Suppress output for cloning a submodule"),
>     
>     
>     If an argument provided - ./check-usage-strings.sh 'v2.34.0' , it will
>     search for usage-strings in v2.34.0 and v2.34.0 will be prefixed before
>     filenames instead of HEAD.
>     
>     In the second case, output would be similar to -
>     
>     diff.c:5596                            N_("select files by diff type."),
>     diff.c:5599               N_("Output to a specific file"),
>     builtin/branch.c:666            OPT_BIT('C', NULL, &copy, N_("copy a branch, even if target exists."), 2),
>     make: *** [check-usage-strings] Error 1
>     
>     
>     Note in the last case - arguments provided to it will be useless.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1147%2FAbhra303%2Fusage_command_amend-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1147/Abhra303/usage_command_amend-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1147

Sorry about leaving this patch submission hanging. I read this at the
time, but forgot to find time to loop back to it.

>  Makefile                    |  5 +++++
>  builtin/bisect--helper.c    |  2 +-
>  builtin/reflog.c            |  6 +++---
>  builtin/submodule--helper.c |  2 +-
>  check-usage-strings.sh      | 33 +++++++++++++++++++++++++++++++++
>  ci/test-documentation.sh    |  1 +
>  diff.c                      |  2 +-
>  t/helper/test-run-command.c |  6 +++---
>  8 files changed, 48 insertions(+), 9 deletions(-)
>  create mode 100755 check-usage-strings.sh
>
> diff --git a/Makefile b/Makefile
> index 186f9ab6190..93faed51da0 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -3416,6 +3416,11 @@ check-docs::
>  check-builtins::
>  	./check-builtins.sh
>  
> +### Make sure all the usage strings follow usage string style guide
> +#
> +check-usage-strings::
> +	./check-usage-strings.sh
> +
>  ### Test suite coverage testing
>  #
>  .PHONY: coverage coverage-clean coverage-compile coverage-test coverage-report
> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
> index 28a2e6a5750..614d95b022c 100644
> --- a/builtin/bisect--helper.c
> +++ b/builtin/bisect--helper.c
> @@ -1209,7 +1209,7 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
>  		OPT_CMDMODE(0, "bisect-visualize", &cmdmode,
>  			 N_("visualize the bisection"), BISECT_VISUALIZE),
>  		OPT_CMDMODE(0, "bisect-run", &cmdmode,
> -			 N_("use <cmd>... to automatically bisect."), BISECT_RUN),
> +			 N_("use <cmd>... to automatically bisect"), BISECT_RUN),
>  		OPT_BOOL(0, "no-log", &nolog,
>  			 N_("no log for BISECT_WRITE")),
>  		OPT_END()
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index 85b838720c3..28372c5e2b5 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -600,7 +600,7 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
>  		OPT_BIT(0, "updateref", &flags,
>  			N_("update the reference to the value of the top reflog entry"),
>  			EXPIRE_REFLOGS_UPDATE_REF),
> -		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen.")),
> +		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen")),
>  		OPT_CALLBACK_F(0, "expire", &cmd, N_("timestamp"),
>  			       N_("prune entries older than the specified time"),
>  			       PARSE_OPT_NONEG,
> @@ -613,7 +613,7 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
>  			 N_("prune any reflog entries that point to broken commits")),
>  		OPT_BOOL(0, "all", &do_all, N_("process the reflogs of all references")),
>  		OPT_BOOL(1, "single-worktree", &all_worktrees,
> -			 N_("limits processing to reflogs from the current worktree only.")),
> +			 N_("limits processing to reflogs from the current worktree only")),
>  		OPT_END()
>  	};
>  
> @@ -736,7 +736,7 @@ static int cmd_reflog_delete(int argc, const char **argv, const char *prefix)
>  		OPT_BIT(0, "updateref", &flags,
>  			N_("update the reference to the value of the top reflog entry"),
>  			EXPIRE_REFLOGS_UPDATE_REF),
> -		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen.")),
> +		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen")),
>  		OPT_END()
>  	};
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index c5d3fc3817f..9864ec1427d 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -1874,7 +1874,7 @@ static int module_clone(int argc, const char **argv, const char *prefix)
>  		OPT_STRING(0, "depth", &clone_data.depth,
>  			   N_("string"),
>  			   N_("depth for shallow clones")),
> -		OPT__QUIET(&quiet, "Suppress output for cloning a submodule"),
> +		OPT__QUIET(&quiet, "suppress output for cloning a submodule"),
>  		OPT_BOOL(0, "progress", &progress,
>  			   N_("force cloning progress")),
>  		OPT_BOOL(0, "require-init", &require_init,

It's really good to have these fixed! Ditto for the remaining ones I
elided.

> diff --git a/check-usage-strings.sh b/check-usage-strings.sh
> new file mode 100755
> index 00000000000..a4028e0d00d
> --- /dev/null
> +++ b/check-usage-strings.sh
> @@ -0,0 +1,33 @@
> +{
> +  if test -d ".git"
> +  then
> +    rev=${1:-"HEAD"}
> +    for entry in $(git grep -l 'struct option .* = {$' "$rev" -- \*.c);
> +    do
> +      git show "$entry" |
> +      sed -n '/struct option .* = {/,/OPT_END/{=;p;}' |
> +      sed "N;s/^\\([0-9]*\\)\\n/$(echo "$entry" | sed 's/\//\\&/g'):\\1/";
> +    done
> +  else
> +    for entry in $(grep -rl --include="*.c" 'struct option .* = {$' . );
> +    do
> +      cat "$entry" |
> +      sed -n '/struct option .* = {/,/OPT_END/{=;p;}' |
> +      sed "N;s/^\\([0-9]*\\)\\n/$(echo "$entry" | sed -e 's/\//\\&/g' -e 's/^\.\\\///'):\\1/";
> +    done
> +  fi
> +} |
> +grep -Pe '((?<!OPT_GROUP\(N_\(|OPT_GROUP\()"(?!GPG|DEPRECATED|SHA1|HEAD)[A-Z]|(?<!"|\.\.)\.")' |
> +{
> +  status=0
> +  while read content;
> +  do
> +    if test -n "$content"
> +    then
> +      echo "$content";
> +      status=1;
> +    fi
> +  done
> +
> +  exit $status
> +}
> diff --git a/ci/test-documentation.sh b/ci/test-documentation.sh
> index de41888430a..f66848dfc66 100755
> --- a/ci/test-documentation.sh
> +++ b/ci/test-documentation.sh
> @@ -15,6 +15,7 @@ filter_log () {
>  }

As much as I like the idea, I really don't want us to have this method
of doing it though, i.e. to start parsing our C code with a
hard-to-maintain shellscript.

But the good news is that there's much easier way to add this!

Aside: if we did want to do the "parse C" method the right way to do it
would be to have a coccinelle script do it. We don't currently, but we
use coccicheck, and if you look at the linux kernel's use of it there's
multiple such checks there. I.e. you can have it parse the C and run
your checks with an arbitrary script.

But in this case there's really a much easier way to do this, to just
extend something like this:

diff --git a/parse-options.c b/parse-options.c
index 2437ad3bcdd..90d8da6ad4c 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -492,6 +492,8 @@ static void parse_options_check(const struct option *opts)
 		default:
 			; /* ok. (usually accepts an argument) */
 		}
+		if (opts->help && ends_with(opts->help, "."))
+			err |= optbug(opts, xstrfmt("argh should not end with a dot: %s", opts->help));
 		if (opts->argh &&
 		    strcspn(opts->argh, " _") != strlen(opts->argh))
 			err |= optbug(opts, "multi-word argh should use dash to separate words");

Then the t0012-help.sh test will catch these, and that's where these
sorts of checks belong in our tree.

See b6c2a0d45d4 (parse-options: make sure argh string does not have SP
or _, 2014-03-23) for the existing code shown in the context where we
already check "argh" like that, i.e. we're just missing a test for
"help".

Obviously such a function would need to hardcode some of the logic you
added in your shellscript. E.g. this fires on a string ending in "...",
but yours doesn't.

That should be fairly easy to do though, and if not we could always just
dump these to stderr or something if a
git_env_bool("GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP", 0) was true, and
do the testing itself in t0012-help.sh.

>  make check-builtins
> +make check-usage-strings
>  make check-docs

Good to have this as a "make" target, not something e.g. peculiar to CI.

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-21 15:39 ` Ævar Arnfjörð Bjarmason
@ 2022-02-21 17:15   ` Junio C Hamano
  2022-02-21 17:33   ` Abhradeep Chakraborty
  2022-02-22 10:25   ` Abhradeep Chakraborty
  2 siblings, 0 replies; 59+ messages in thread
From: Junio C Hamano @ 2022-02-21 17:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Abhradeep Chakraborty via GitGitGadget, git, Abhra303

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> diff --git a/ci/test-documentation.sh b/ci/test-documentation.sh
>> index de41888430a..f66848dfc66 100755
>> --- a/ci/test-documentation.sh
>> +++ b/ci/test-documentation.sh
>> @@ -15,6 +15,7 @@ filter_log () {
>>  }
>
> As much as I like the idea, I really don't want us to have this method
> of doing it though, i.e. to start parsing our C code with a
> hard-to-maintain shellscript.
>
> But the good news is that there's much easier way to add this!

;-)  Good suggestions.  I have nothing to add.


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-21 15:39 ` Ævar Arnfjörð Bjarmason
  2022-02-21 17:15   ` Junio C Hamano
@ 2022-02-21 17:33   ` Abhradeep Chakraborty
  2022-02-21 18:52     ` Ævar Arnfjörð Bjarmason
  2022-02-22 10:57     ` Johannes Schindelin
  2022-02-22 10:25   ` Abhradeep Chakraborty
  2 siblings, 2 replies; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-02-21 17:33 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Abhradeep Chakraborty, git

Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:

> Sorry about leaving this patch submission hanging. I read this at the
> time, but forgot to find time to loop back to it.

No worries. Thanks for reviewing :)

> But in this case there's really a much easier way to do this, to just
> extend something like this:
> ...
> See b6c2a0d45d4 (parse-options: make sure argh string does not have SP
> or _, 2014-03-23) for the existing code shown in the context where we
> already check "argh" like that, i.e. we're just missing a test for
> "help".
>
> Obviously such a function would need to hardcode some of the logic you
> added in your shellscript. E.g. this fires on a string ending in "...",
> but yours doesn't.

Thank you so much for the suggestion. Didn't aware of it before. I will
try to implement the logic in parse-options.c` (as you suggested).

> That should be fairly easy to do though, and if not we could always just
> dump these to stderr or something if a
> git_env_bool("GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP", 0) was true, and
> do the testing itself in t0012-help.sh.

Okay but if the logic can't be implented in the `parse-options.c` file
(most probably I will be able to implement the logic), would you allow me
to try the `coccinelle script` method you mentioned?

Thanks :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-21 17:33   ` Abhradeep Chakraborty
@ 2022-02-21 18:52     ` Ævar Arnfjörð Bjarmason
  2022-02-22 10:57     ` Johannes Schindelin
  1 sibling, 0 replies; 59+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-21 18:52 UTC (permalink / raw)
  To: Abhradeep Chakraborty; +Cc: git


On Mon, Feb 21 2022, Abhradeep Chakraborty wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
>> Sorry about leaving this patch submission hanging. I read this at the
>> time, but forgot to find time to loop back to it.
>
> No worries. Thanks for reviewing :)
>
>> But in this case there's really a much easier way to do this, to just
>> extend something like this:
>> ...
>> See b6c2a0d45d4 (parse-options: make sure argh string does not have SP
>> or _, 2014-03-23) for the existing code shown in the context where we
>> already check "argh" like that, i.e. we're just missing a test for
>> "help".
>>
>> Obviously such a function would need to hardcode some of the logic you
>> added in your shellscript. E.g. this fires on a string ending in "...",
>> but yours doesn't.
>
> Thank you so much for the suggestion. Didn't aware of it before. I will
> try to implement the logic in parse-options.c` (as you suggested).
>
>> That should be fairly easy to do though, and if not we could always just
>> dump these to stderr or something if a
>> git_env_bool("GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP", 0) was true, and
>> do the testing itself in t0012-help.sh.
>
> Okay but if the logic can't be implented in the `parse-options.c` file
> (most probably I will be able to implement the logic), would you allow me
> to try the `coccinelle script` method you mentioned?

In this case I think there's definitely no reason for why it can't/won't
work in parse-options.c.

If you're doing something like that with coccicheck I'm afraid I can't
help much, I've only seen that the kernel is doing it (it's referenced
in some of the coccinelle docs), but I haven't personally used it for
anything close to that.


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-21 15:39 ` Ævar Arnfjörð Bjarmason
  2022-02-21 17:15   ` Junio C Hamano
  2022-02-21 17:33   ` Abhradeep Chakraborty
@ 2022-02-22 10:25   ` Abhradeep Chakraborty
  2 siblings, 0 replies; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-02-22 10:25 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Abhradeep Chakraborty, git

Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:

> But in this case there's really a much easier way to do this, to just
> extend something like this:
> ...
> See b6c2a0d45d4 (parse-options: make sure argh string does not have SP
> or _, 2014-03-23) for the existing code shown in the context where we
> already check "argh" like that, i.e. we're just missing a test for
> "help".
>
> Obviously such a function would need to hardcode some of the logic you
> added in your shellscript. E.g. this fires on a string ending in "...",
> but yours doesn't.

Hello Ævar, I have some query related to this method. I have implemented
the logic locally and tests are also passing. However, I think the test
you mentioned is only running against the builtin files and files that
are used in builtin commands (e.g. `diff.c`, `builtin/add.c` etc.). But
some files from `t/helper` (e.g. t/helper/test-run-command.c) also uses
parse option API and it seems that there are no test files (pardon me if I
am wrong) for checking `parse option usage strings check` for `t/helper`
test-tool commands.

E.g. `grep -r --include="*.c" 'struct option .*\[] = {$' .` command gives
the following output - 

./helper/test-parse-options.c:  struct option options[] = {
./helper/test-lazy-init-name-hash.c:    struct option options[] = {
./helper/test-serve-v2.c:       struct option options[] = {
./helper/test-simple-ipc.c:     struct option options[] = {
./helper/test-parse-pathspec-file.c:    struct option options[] = {
./helper/test-getcwd.c: struct option options[] = {
./helper/test-run-command.c:    struct option options[] = {
./helper/test-run-command.c:    struct option options[] = {
./helper/test-proc-receive.c:   struct option options[] = {
./helper/test-progress.c:       struct option options[] = {
./helper/test-tool.c:   struct option options[] = {

So, these files are using parse-options and there is a chance that in
future, usage strings from these files may violate the style guide. In
this case, all tests will be passing even if there are some style
violations. What do you think?

Thanks :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-21 17:33   ` Abhradeep Chakraborty
  2022-02-21 18:52     ` Ævar Arnfjörð Bjarmason
@ 2022-02-22 10:57     ` Johannes Schindelin
  2022-02-22 12:37         ` [cocci] " Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 59+ messages in thread
From: Johannes Schindelin @ 2022-02-22 10:57 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Abhradeep Chakraborty, Ævar Arnfjörð Bjarmason, git

[-- Attachment #1: Type: text/plain, Size: 1311 bytes --]

Hi Julia,

I would like to loop you in here because you have helped us with
Coccinelle questions in the past.

On Mon, 21 Feb 2022, Abhradeep Chakraborty wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
> > That should be fairly easy to do though, and if not we could always
> > just dump these to stderr or something if a
> > git_env_bool("GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP", 0) was true,
> > and do the testing itself in t0012-help.sh.
>
> Okay but if the logic can't be implented in the `parse-options.c` file
> (most probably I will be able to implement the logic), would you allow
> me to try the `coccinelle script` method you mentioned?

The task at hand is to identify calls to the macro `OPT_CMDMODE()` (and
other, similar macros) that get a fourth argument of the form

	N_("<some string>")

The problem is to identify `<some string>` that ends in a `.` (which we
want to avoid) or that starts with some prefix and a colon but follows
with an upper-case character.

In other words, we want to suggest replacing

	N_("log: Use something")

or

	N_("log: use something.")

by

	N_("log: use something")

Ævar suggested that Coccinelle can do that. Could you give us a hand how
this would be possible using `spatch`?

Thank you,
Johannes

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-22 10:57     ` Johannes Schindelin
@ 2022-02-22 12:37         ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 59+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-22 12:37 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Julia Lawall, Abhradeep Chakraborty, git, cocci


On Tue, Feb 22 2022, Johannes Schindelin wrote:

> Hi Julia,
>
> I would like to loop you in here because you have helped us with
> Coccinelle questions in the past.

Thanks. Probably better to CC the relevant ML, adding it.

> On Mon, 21 Feb 2022, Abhradeep Chakraborty wrote:
>
>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>>
>> > That should be fairly easy to do though, and if not we could always
>> > just dump these to stderr or something if a
>> > git_env_bool("GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP", 0) was true,
>> > and do the testing itself in t0012-help.sh.
>>
>> Okay but if the logic can't be implented in the `parse-options.c` file
>> (most probably I will be able to implement the logic), would you allow
>> me to try the `coccinelle script` method you mentioned?
>
> The task at hand is to identify calls to the macro `OPT_CMDMODE()` (and
> other, similar macros) that get a fourth argument of the form
>
> 	N_("<some string>")
>
> The problem is to identify `<some string>` that ends in a `.` (which we
> want to avoid) or that starts with some prefix and a colon but follows
> with an upper-case character.
>
> In other words, we want to suggest replacing
>
> 	N_("log: Use something")
>
> or
>
> 	N_("log: use something.")
>
> by
>
> 	N_("log: use something")
>
> Ævar suggested that Coccinelle can do that. Could you give us a hand how
> this would be possible using `spatch`?

I probably shouldn't have mentioned that at all, and I think this is
academic in this context, because as noted we can just add this to
parse_options_check() (linking it here again for off-git-ml context):

    https://lore.kernel.org/git/220221.86tucsb4oy.gmgdl@evledraar.gmail.com/

We now pay that trivial runtime overhead every time, we could run it
only during the tests if we ever got worried about it.

And it's a lot less fragile and easy to understand than running
coccicheck, i.e. as nice as it is it's still takes a while to run, is
its own mini-language, needs to be kept in sync with code changes etc.

So by doing it at runtime we can adjust messages, code & tests in an
atomic patch more easily (i.e. not assume that you ran some cocci target
to validate it).

It also makes it really easy to do things that are really hard (or
impossible?) with coccinelle. I.e. some of these checks are run as a
function of what flag gets passed into some function later on, which in
the general case would require coccinelle to have some runtime emulator
for C code just to see *what* checks it wants to run.

That being said (and with the caveat that I've only looked at this code,
not done this myself) if you clone linux.git and browse through:

    git grep -C100 -F coccilib.report '*.cocci'

You can see a lot of examples of using cocci for these sorts of checks.

And the same goes if you clone coccinelle.git and do:

    git grep -C100 @script: -- tests

For linux.git it's documented here:
https://github.com/torvalds/linux/blob/master/Documentation/dev-tools/coccinelle.rst

I.e. it's basically writing the sort of cocci rules we have in-tree with
a callback script that complaints about the required change.

For our use it would probably better (in lieu of parse_options_check(),
which is the right thing here) to just have a normal *.cocci file and
complain if it applies changes. We already error in the CI if those need
to apply any changes.

But I don't off-hand know how to do that. E.g. I was trying the other
day to come up with some coccinelle rule that converted:

    die("BUG: blah blah");

To:

    BUG("blah blah");

And while I'm sure there's some way to do this, couldn't find a way to
write a rule to "reach in" to a constant string, apply some check or
search/replacement on it, and to do a subsequent transformation on it.

In this case the OPT_BLAH(1, 2, 3, "string here") has OPT_BLAH() as a
macro. I can't remember if there's extra caveats around using coccinelle
for macros v.s. symbols.

Disclaimer: By "couldn't" I mean I grepped the above examples for all of
a few minutes quickly skimmed the coccinelle docs, didn't find a
template I could copy, then ended up writing some nasty grep/xargs/perl
for-loop instead :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [cocci] [PATCH] add usage-strings ci check and amend remaining usage strings
@ 2022-02-22 12:37         ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 59+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-22 12:37 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Julia Lawall, Abhradeep Chakraborty, git, cocci


On Tue, Feb 22 2022, Johannes Schindelin wrote:

> Hi Julia,
>
> I would like to loop you in here because you have helped us with
> Coccinelle questions in the past.

Thanks. Probably better to CC the relevant ML, adding it.

> On Mon, 21 Feb 2022, Abhradeep Chakraborty wrote:
>
>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>>
>> > That should be fairly easy to do though, and if not we could always
>> > just dump these to stderr or something if a
>> > git_env_bool("GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP", 0) was true,
>> > and do the testing itself in t0012-help.sh.
>>
>> Okay but if the logic can't be implented in the `parse-options.c` file
>> (most probably I will be able to implement the logic), would you allow
>> me to try the `coccinelle script` method you mentioned?
>
> The task at hand is to identify calls to the macro `OPT_CMDMODE()` (and
> other, similar macros) that get a fourth argument of the form
>
> 	N_("<some string>")
>
> The problem is to identify `<some string>` that ends in a `.` (which we
> want to avoid) or that starts with some prefix and a colon but follows
> with an upper-case character.
>
> In other words, we want to suggest replacing
>
> 	N_("log: Use something")
>
> or
>
> 	N_("log: use something.")
>
> by
>
> 	N_("log: use something")
>
> Ævar suggested that Coccinelle can do that. Could you give us a hand how
> this would be possible using `spatch`?

I probably shouldn't have mentioned that at all, and I think this is
academic in this context, because as noted we can just add this to
parse_options_check() (linking it here again for off-git-ml context):

    https://lore.kernel.org/git/220221.86tucsb4oy.gmgdl@evledraar.gmail.com/

We now pay that trivial runtime overhead every time, we could run it
only during the tests if we ever got worried about it.

And it's a lot less fragile and easy to understand than running
coccicheck, i.e. as nice as it is it's still takes a while to run, is
its own mini-language, needs to be kept in sync with code changes etc.

So by doing it at runtime we can adjust messages, code & tests in an
atomic patch more easily (i.e. not assume that you ran some cocci target
to validate it).

It also makes it really easy to do things that are really hard (or
impossible?) with coccinelle. I.e. some of these checks are run as a
function of what flag gets passed into some function later on, which in
the general case would require coccinelle to have some runtime emulator
for C code just to see *what* checks it wants to run.

That being said (and with the caveat that I've only looked at this code,
not done this myself) if you clone linux.git and browse through:

    git grep -C100 -F coccilib.report '*.cocci'

You can see a lot of examples of using cocci for these sorts of checks.

And the same goes if you clone coccinelle.git and do:

    git grep -C100 @script: -- tests

For linux.git it's documented here:
https://github.com/torvalds/linux/blob/master/Documentation/dev-tools/coccinelle.rst

I.e. it's basically writing the sort of cocci rules we have in-tree with
a callback script that complaints about the required change.

For our use it would probably better (in lieu of parse_options_check(),
which is the right thing here) to just have a normal *.cocci file and
complain if it applies changes. We already error in the CI if those need
to apply any changes.

But I don't off-hand know how to do that. E.g. I was trying the other
day to come up with some coccinelle rule that converted:

    die("BUG: blah blah");

To:

    BUG("blah blah");

And while I'm sure there's some way to do this, couldn't find a way to
write a rule to "reach in" to a constant string, apply some check or
search/replacement on it, and to do a subsequent transformation on it.

In this case the OPT_BLAH(1, 2, 3, "string here") has OPT_BLAH() as a
macro. I can't remember if there's extra caveats around using coccinelle
for macros v.s. symbols.

Disclaimer: By "couldn't" I mean I grepped the above examples for all of
a few minutes quickly skimmed the coccinelle docs, didn't find a
template I could copy, then ended up writing some nasty grep/xargs/perl
for-loop instead :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [cocci] [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-22 12:37         ` [cocci] " Ævar Arnfjörð Bjarmason
  (?)
@ 2022-02-22 13:42         ` Julia Lawall
  2022-02-22 14:03           ` Abhradeep Chakraborty
                             ` (2 more replies)
  -1 siblings, 3 replies; 59+ messages in thread
From: Julia Lawall @ 2022-02-22 13:42 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, Abhradeep Chakraborty, git, cocci

[-- Attachment #1: Type: text/plain, Size: 5650 bytes --]



On Tue, 22 Feb 2022, Ævar Arnfjörð Bjarmason wrote:

>
> On Tue, Feb 22 2022, Johannes Schindelin wrote:
>
> > Hi Julia,
> >
> > I would like to loop you in here because you have helped us with
> > Coccinelle questions in the past.
>
> Thanks. Probably better to CC the relevant ML, adding it.
>
> > On Mon, 21 Feb 2022, Abhradeep Chakraborty wrote:
> >
> >> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> >>
> >> > That should be fairly easy to do though, and if not we could always
> >> > just dump these to stderr or something if a
> >> > git_env_bool("GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP", 0) was true,
> >> > and do the testing itself in t0012-help.sh.
> >>
> >> Okay but if the logic can't be implented in the `parse-options.c` file
> >> (most probably I will be able to implement the logic), would you allow
> >> me to try the `coccinelle script` method you mentioned?
> >
> > The task at hand is to identify calls to the macro `OPT_CMDMODE()` (and
> > other, similar macros) that get a fourth argument of the form
> >
> > 	N_("<some string>")
> >
> > The problem is to identify `<some string>` that ends in a `.` (which we
> > want to avoid) or that starts with some prefix and a colon but follows
> > with an upper-case character.
> >
> > In other words, we want to suggest replacing
> >
> > 	N_("log: Use something")
> >
> > or
> >
> > 	N_("log: use something.")
> >
> > by
> >
> > 	N_("log: use something")
> >
> > Ævar suggested that Coccinelle can do that. Could you give us a hand how
> > this would be possible using `spatch`?

Hello,

I'm not sure to follow all of the following.

Of there are some cases that are useful to do statically, with only local
information, then using Coccinelle could be useful to get the problem out
of the way once and for all.  Coccinelle doesn't support much processing
of strings directly, but you can always write some python code to test the
contents of a string and to create a new one.

Let me know if you want to try this.  You can also check, eg the demo
demos/pythontococci.cocci to see how to create code in a python script and
then use it in a normal SmPL rule.

If some context has to be taken into account and the context in the same
function, then that can also be done with Coccinelle, eg

A
...
B

matches the case where after an A there is a B on all execution paths
(except perhaps those that end in an error exit) and

A
... when exists
B

matches the case where there is a B sometime after executing A, even if
that does not always occur.

If the context that you are interested in is in a called function or is in
the calling context, then Coccinelle might not be the ideal choice.
Coccinelle works on one function at a time, so to do anything
interprocedural, you have to do some hacks.

julia
>
> I probably shouldn't have mentioned that at all, and I think this is
> academic in this context, because as noted we can just add this to
> parse_options_check() (linking it here again for off-git-ml context):
>
>     https://lore.kernel.org/git/220221.86tucsb4oy.gmgdl@evledraar.gmail.com/
>
> We now pay that trivial runtime overhead every time, we could run it
> only during the tests if we ever got worried about it.
>
> And it's a lot less fragile and easy to understand than running
> coccicheck, i.e. as nice as it is it's still takes a while to run, is
> its own mini-language, needs to be kept in sync with code changes etc.
>
> So by doing it at runtime we can adjust messages, code & tests in an
> atomic patch more easily (i.e. not assume that you ran some cocci target
> to validate it).
>
> It also makes it really easy to do things that are really hard (or
> impossible?) with coccinelle. I.e. some of these checks are run as a
> function of what flag gets passed into some function later on, which in
> the general case would require coccinelle to have some runtime emulator
> for C code just to see *what* checks it wants to run.
>
> That being said (and with the caveat that I've only looked at this code,
> not done this myself) if you clone linux.git and browse through:
>
>     git grep -C100 -F coccilib.report '*.cocci'
>
> You can see a lot of examples of using cocci for these sorts of checks.
>
> And the same goes if you clone coccinelle.git and do:
>
>     git grep -C100 @script: -- tests
>
> For linux.git it's documented here:
> https://github.com/torvalds/linux/blob/master/Documentation/dev-tools/coccinelle.rst
>
> I.e. it's basically writing the sort of cocci rules we have in-tree with
> a callback script that complaints about the required change.
>
> For our use it would probably better (in lieu of parse_options_check(),
> which is the right thing here) to just have a normal *.cocci file and
> complain if it applies changes. We already error in the CI if those need
> to apply any changes.
>
> But I don't off-hand know how to do that. E.g. I was trying the other
> day to come up with some coccinelle rule that converted:
>
>     die("BUG: blah blah");
>
> To:
>
>     BUG("blah blah");
>
> And while I'm sure there's some way to do this, couldn't find a way to
> write a rule to "reach in" to a constant string, apply some check or
> search/replacement on it, and to do a subsequent transformation on it.
>
> In this case the OPT_BLAH(1, 2, 3, "string here") has OPT_BLAH() as a
> macro. I can't remember if there's extra caveats around using coccinelle
> for macros v.s. symbols.
>
> Disclaimer: By "couldn't" I mean I grepped the above examples for all of
> a few minutes quickly skimmed the coccinelle docs, didn't find a
> template I could copy, then ended up writing some nasty grep/xargs/perl
> for-loop instead :)
>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-22 13:42         ` Julia Lawall
@ 2022-02-22 14:03           ` Abhradeep Chakraborty
  2022-02-22 15:47           ` Abhradeep Chakraborty
  2022-02-25 15:03           ` [cocci] " Johannes Schindelin
  2 siblings, 0 replies; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-02-22 14:03 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Abhradeep Chakraborty, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, git

Julia Lawall wrote:

> Of there are some cases that are useful to do statically, with only local
> information, then using Coccinelle could be useful to get the problem out
> of the way once and for all.  Coccinelle doesn't support much processing
> of strings directly, but you can always write some python code to test the
> contents of a string and to create a new one.
>
> Let me know if you want to try this.  You can also check, eg the demo
> demos/pythontococci.cocci to see how to create code in a python script and
> then use it in a normal SmPL rule.
> ...
> If the context that you are interested in is in a called function or is in
> the calling context, then Coccinelle might not be the ideal choice.
> Coccinelle works on one function at a time, so to do anything
> interprocedural, you have to do some hacks.

Thank you Julia for this helpful info. Looking at your description, I think
the `add check to parse-options.c` (that Ævar suggested as the most ideal
method for it) method is more simpler than this. Moreover,as this is only
about checking usage-strings, so adding complexity to it will not be a
good idea.

Thanks :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-22 13:42         ` Julia Lawall
  2022-02-22 14:03           ` Abhradeep Chakraborty
@ 2022-02-22 15:47           ` Abhradeep Chakraborty
  2022-02-25 15:30             ` Johannes Schindelin
  2022-02-25 15:03           ` [cocci] " Johannes Schindelin
  2 siblings, 1 reply; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-02-22 15:47 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Abhradeep Chakraborty, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, git

Julia Lawall wrote:

> Of there are some cases that are useful to do statically, with only local
> information, then using Coccinelle could be useful to get the problem out
> of the way once and for all.  Coccinelle doesn't support much processing
> of strings directly, but you can always write some python code to test the
> contents of a string and to create a new one.
>
> Let me know if you want to try this.  You can also check, eg the demo
> demos/pythontococci.cocci to see how to create code in a python script and
> then use it in a normal SmPL rule.
> ...
> If the context that you are interested in is in a called function or is in
> the calling context, then Coccinelle might not be the ideal choice.
> Coccinelle works on one function at a time, so to do anything
> interprocedural, you have to do some hacks.

Though in this case, `parse-options.c check` method is better, but in other
cases, this might be a good fit. In those cases, I would also like to help
you (i.e. you, Johannes, Ævar and other devs) to fix those cases.

Thanks :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v2] add usage-strings check  and amend remaining usage strings
  2022-02-16 17:02 [PATCH] add usage-strings ci check and amend remaining usage strings Abhradeep Chakraborty via GitGitGadget
  2022-02-21 14:51 ` Abhradeep Chakraborty
  2022-02-21 15:39 ` Ævar Arnfjörð Bjarmason
@ 2022-02-22 15:58 ` Abhradeep Chakraborty via GitGitGadget
  2022-02-22 17:16   ` Eric Sunshine
  2022-02-23 14:27   ` [PATCH v3 0/2] add usage-strings ci " Abhradeep Chakraborty via GitGitGadget
  2 siblings, 2 replies; 59+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-02-22 15:58 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Johannes Schindelin,
	Julia Lawall, Abhradeep Chakraborty, Abhra303

From: Abhra303 <chakrabortyabhradeep79@gmail.com>

Usage strings for git (sub)command flags has a style guide that
suggests - first letter should not capitalized (unless requied)
and it should skip full-stop at the end of line. But there are
some files where usage-strings do not follow the above mentioned
guide. Moreover, there are no checks to verify if all usage strings
are following the guide/convention or not.

Amend the usage strings that don't follow the convention/guide and
add a check in the `parse_options_check()` function in `parse-options.c`
to check the usage strings against the style guide.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
    add usage-strings ci check and amend remaining usage strings
    
    This patch series completely fixes #636.
    
    The issue is about amending the usage-strings (for command flags such as
    -h, -v etc.) which do not follow the style convention/guide. There was a
    PR [https://github.com/gitgitgadget/git/pull/920] addressing this issue
    but as Johannes [https://github.com/dscho] said in his comment
    [https://github.com/gitgitgadget/git/issues/636#issuecomment-1018660439],
    there are some files that still have those kind of usage strings.
    Johannes also suggested to add a CI check under ci/test-documentation.sh
    to check the usage strings.
    
    So, in this patch, all remaining usage strings are corrected. I also
    added checks to parse_options_check() in parse-options.c (as suggested
    by Ævar).
    
    Changes since v1:
    
     1. remove check-usage-strings.sh
     2. remove CI check
     3. add checks to parse-options.c
     4. modify t/t1502-rev-parse-parseopt.sh to pass the test

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1147%2FAbhra303%2Fusage_command_amend-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1147/Abhra303/usage_command_amend-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1147

Range-diff vs v1:

 1:  ea0ba45d77a ! 1:  902937e768d add usage-strings ci check and amend remaining usage strings
     @@ Metadata
      Author: Abhra303 <chakrabortyabhradeep79@gmail.com>
      
       ## Commit message ##
     -    add usage-strings ci check and amend remaining usage strings
     +    add usage-strings check  and amend remaining usage strings
      
          Usage strings for git (sub)command flags has a style guide that
          suggests - first letter should not capitalized (unless requied)
     @@ Commit message
          are following the guide/convention or not.
      
          Amend the usage strings that don't follow the convention/guide and
     -    add a `CI` check for checking the usage strings (whether the first
     -    letter is capital or it ends with full-stop). If the `check` find
     -    such strings then print those strings and return a non-zero status.
     -
     -    Also provide a script that takes an optional argument (a valid <tree>
     -    string), to check the usage strings in the given <tree> (`HEAD` is
     -    the default argument).
     +    add a check in the `parse_options_check()` function in `parse-options.c`
     +    to check the usage strings against the style guide.
      
          Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
      
     - ## Makefile ##
     -@@ Makefile: check-docs::
     - check-builtins::
     - 	./check-builtins.sh
     - 
     -+### Make sure all the usage strings follow usage string style guide
     -+#
     -+check-usage-strings::
     -+	./check-usage-strings.sh
     -+
     - ### Test suite coverage testing
     - #
     - .PHONY: coverage coverage-clean coverage-compile coverage-test coverage-report
     -
       ## builtin/bisect--helper.c ##
      @@ builtin/bisect--helper.c: int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
       		OPT_CMDMODE(0, "bisect-visualize", &cmdmode,
     @@ builtin/submodule--helper.c: static int module_clone(int argc, const char **argv
       			   N_("force cloning progress")),
       		OPT_BOOL(0, "require-init", &require_init,
      
     - ## check-usage-strings.sh (new) ##
     -@@
     -+{
     -+  if test -d ".git"
     -+  then
     -+    rev=${1:-"HEAD"}
     -+    for entry in $(git grep -l 'struct option .* = {$' "$rev" -- \*.c);
     -+    do
     -+      git show "$entry" |
     -+      sed -n '/struct option .* = {/,/OPT_END/{=;p;}' |
     -+      sed "N;s/^\\([0-9]*\\)\\n/$(echo "$entry" | sed 's/\//\\&/g'):\\1/";
     -+    done
     -+  else
     -+    for entry in $(grep -rl --include="*.c" 'struct option .* = {$' . );
     -+    do
     -+      cat "$entry" |
     -+      sed -n '/struct option .* = {/,/OPT_END/{=;p;}' |
     -+      sed "N;s/^\\([0-9]*\\)\\n/$(echo "$entry" | sed -e 's/\//\\&/g' -e 's/^\.\\\///'):\\1/";
     -+    done
     -+  fi
     -+} |
     -+grep -Pe '((?<!OPT_GROUP\(N_\(|OPT_GROUP\()"(?!GPG|DEPRECATED|SHA1|HEAD)[A-Z]|(?<!"|\.\.)\.")' |
     -+{
     -+  status=0
     -+  while read content;
     -+  do
     -+    if test -n "$content"
     -+    then
     -+      echo "$content";
     -+      status=1;
     -+    fi
     -+  done
     -+
     -+  exit $status
     -+}
     -
     - ## ci/test-documentation.sh ##
     -@@ ci/test-documentation.sh: filter_log () {
     - }
     - 
     - make check-builtins
     -+make check-usage-strings
     - make check-docs
     - 
     - # Build docs with AsciiDoc
     -
       ## diff.c ##
      @@ diff.c: static void prep_parse_options(struct diff_options *options)
       			       N_("select files by diff type"),
     @@ diff.c: static void prep_parse_options(struct diff_options *options)
       
       		OPT_END()
      
     + ## parse-options.c ##
     +@@ parse-options.c: static void parse_options_check(const struct option *opts)
     + 		default:
     + 			; /* ok. (usually accepts an argument) */
     + 		}
     ++		if (opts->type != OPTION_GROUP && opts->help &&
     ++			!(starts_with(opts->help, "HEAD") ||
     ++			  starts_with(opts->help, "GPG") ||
     ++			  starts_with(opts->help, "DEPRECATED") ||
     ++			  starts_with(opts->help, "SHA1")) &&
     ++			  (opts->help[0] >= 65 && opts->help[0] <= 90))
     ++			err |= optbug(opts, xstrfmt("help should not start with capital letter unless needed: %s", opts->help));
     ++		if (opts->help && !ends_with(opts->help, "...") && ends_with(opts->help, "."))
     ++			err |= optbug(opts, xstrfmt("help should not end with a dot: %s", opts->help));
     + 		if (opts->argh &&
     + 		    strcspn(opts->argh, " _") != strlen(opts->argh))
     + 			err |= optbug(opts, "multi-word argh should use dash to separate words");
     +
       ## t/helper/test-run-command.c ##
      @@ t/helper/test-run-command.c: static int quote_stress_test(int argc, const char **argv)
       	struct strbuf out = STRBUF_INIT;
     @@ t/helper/test-run-command.c: static int quote_stress_test(int argc, const char *
       		OPT_END()
       	};
       	const char * const usage[] = {
     +
     + ## t/t1502-rev-parse-parseopt.sh ##
     +@@ t/t1502-rev-parse-parseopt.sh: test_expect_success 'setup optionspec-only-hidden-switches' '
     + |
     + |some-command does foo and bar!
     + |--
     +-|hidden1* A hidden switch
     ++|hidden1* a hidden switch
     + EOF
     + '
     + 
     +@@ t/t1502-rev-parse-parseopt.sh: test_expect_success 'test --parseopt help-all output hidden switches' '
     + |
     + |    some-command does foo and bar!
     + |
     +-|    --hidden1             A hidden switch
     ++|    --hidden1             a hidden switch
     + |
     + |EOF
     + END_EXPECT


 builtin/bisect--helper.c      | 2 +-
 builtin/reflog.c              | 6 +++---
 builtin/submodule--helper.c   | 2 +-
 diff.c                        | 2 +-
 parse-options.c               | 9 +++++++++
 t/helper/test-run-command.c   | 6 +++---
 t/t1502-rev-parse-parseopt.sh | 4 ++--
 7 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
index 28a2e6a5750..614d95b022c 100644
--- a/builtin/bisect--helper.c
+++ b/builtin/bisect--helper.c
@@ -1209,7 +1209,7 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
 		OPT_CMDMODE(0, "bisect-visualize", &cmdmode,
 			 N_("visualize the bisection"), BISECT_VISUALIZE),
 		OPT_CMDMODE(0, "bisect-run", &cmdmode,
-			 N_("use <cmd>... to automatically bisect."), BISECT_RUN),
+			 N_("use <cmd>... to automatically bisect"), BISECT_RUN),
 		OPT_BOOL(0, "no-log", &nolog,
 			 N_("no log for BISECT_WRITE")),
 		OPT_END()
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 85b838720c3..28372c5e2b5 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -600,7 +600,7 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
 		OPT_BIT(0, "updateref", &flags,
 			N_("update the reference to the value of the top reflog entry"),
 			EXPIRE_REFLOGS_UPDATE_REF),
-		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen.")),
+		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen")),
 		OPT_CALLBACK_F(0, "expire", &cmd, N_("timestamp"),
 			       N_("prune entries older than the specified time"),
 			       PARSE_OPT_NONEG,
@@ -613,7 +613,7 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
 			 N_("prune any reflog entries that point to broken commits")),
 		OPT_BOOL(0, "all", &do_all, N_("process the reflogs of all references")),
 		OPT_BOOL(1, "single-worktree", &all_worktrees,
-			 N_("limits processing to reflogs from the current worktree only.")),
+			 N_("limits processing to reflogs from the current worktree only")),
 		OPT_END()
 	};
 
@@ -736,7 +736,7 @@ static int cmd_reflog_delete(int argc, const char **argv, const char *prefix)
 		OPT_BIT(0, "updateref", &flags,
 			N_("update the reference to the value of the top reflog entry"),
 			EXPIRE_REFLOGS_UPDATE_REF),
-		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen.")),
+		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen")),
 		OPT_END()
 	};
 
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 33c82c3ab91..6332d305983 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1875,7 +1875,7 @@ static int module_clone(int argc, const char **argv, const char *prefix)
 		OPT_STRING(0, "depth", &clone_data.depth,
 			   N_("string"),
 			   N_("depth for shallow clones")),
-		OPT__QUIET(&quiet, "Suppress output for cloning a submodule"),
+		OPT__QUIET(&quiet, "suppress output for cloning a submodule"),
 		OPT_BOOL(0, "progress", &progress,
 			   N_("force cloning progress")),
 		OPT_BOOL(0, "require-init", &require_init,
diff --git a/diff.c b/diff.c
index 7d5cfd325ea..387435a4a45 100644
--- a/diff.c
+++ b/diff.c
@@ -5630,7 +5630,7 @@ static void prep_parse_options(struct diff_options *options)
 			       N_("select files by diff type"),
 			       PARSE_OPT_NONEG, diff_opt_diff_filter),
 		{ OPTION_CALLBACK, 0, "output", options, N_("<file>"),
-		  N_("Output to a specific file"),
+		  N_("output to a specific file"),
 		  PARSE_OPT_NONEG, NULL, 0, diff_opt_output },
 
 		OPT_END()
diff --git a/parse-options.c b/parse-options.c
index 2437ad3bcdd..91cbfb0d7f7 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -492,6 +492,15 @@ static void parse_options_check(const struct option *opts)
 		default:
 			; /* ok. (usually accepts an argument) */
 		}
+		if (opts->type != OPTION_GROUP && opts->help &&
+			!(starts_with(opts->help, "HEAD") ||
+			  starts_with(opts->help, "GPG") ||
+			  starts_with(opts->help, "DEPRECATED") ||
+			  starts_with(opts->help, "SHA1")) &&
+			  (opts->help[0] >= 65 && opts->help[0] <= 90))
+			err |= optbug(opts, xstrfmt("help should not start with capital letter unless needed: %s", opts->help));
+		if (opts->help && !ends_with(opts->help, "...") && ends_with(opts->help, "."))
+			err |= optbug(opts, xstrfmt("help should not end with a dot: %s", opts->help));
 		if (opts->argh &&
 		    strcspn(opts->argh, " _") != strlen(opts->argh))
 			err |= optbug(opts, "multi-word argh should use dash to separate words");
diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
index 913775a14b7..8f370cd89f1 100644
--- a/t/helper/test-run-command.c
+++ b/t/helper/test-run-command.c
@@ -221,9 +221,9 @@ static int quote_stress_test(int argc, const char **argv)
 	struct strbuf out = STRBUF_INIT;
 	struct strvec args = STRVEC_INIT;
 	struct option options[] = {
-		OPT_INTEGER('n', "trials", &trials, "Number of trials"),
-		OPT_INTEGER('s', "skip", &skip, "Skip <n> trials"),
-		OPT_BOOL('m', "msys2", &msys2, "Test quoting for MSYS2's sh"),
+		OPT_INTEGER('n', "trials", &trials, "number of trials"),
+		OPT_INTEGER('s', "skip", &skip, "skip <n> trials"),
+		OPT_BOOL('m', "msys2", &msys2, "test quoting for MSYS2's sh"),
 		OPT_END()
 	};
 	const char * const usage[] = {
diff --git a/t/t1502-rev-parse-parseopt.sh b/t/t1502-rev-parse-parseopt.sh
index 284fe18e726..2a07e130b96 100755
--- a/t/t1502-rev-parse-parseopt.sh
+++ b/t/t1502-rev-parse-parseopt.sh
@@ -53,7 +53,7 @@ test_expect_success 'setup optionspec-only-hidden-switches' '
 |
 |some-command does foo and bar!
 |--
-|hidden1* A hidden switch
+|hidden1* a hidden switch
 EOF
 '
 
@@ -131,7 +131,7 @@ test_expect_success 'test --parseopt help-all output hidden switches' '
 |
 |    some-command does foo and bar!
 |
-|    --hidden1             A hidden switch
+|    --hidden1             a hidden switch
 |
 |EOF
 END_EXPECT

base-commit: e6ebfd0e8cbbd10878070c8a356b5ad1b3ca464e
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH v2] add usage-strings check and amend remaining usage strings
  2022-02-22 15:58 ` [PATCH v2] add usage-strings " Abhradeep Chakraborty via GitGitGadget
@ 2022-02-22 17:16   ` Eric Sunshine
  2022-02-23 11:59     ` Abhradeep Chakraborty
  2022-02-23 21:17     ` Junio C Hamano
  2022-02-23 14:27   ` [PATCH v3 0/2] add usage-strings ci " Abhradeep Chakraborty via GitGitGadget
  1 sibling, 2 replies; 59+ messages in thread
From: Eric Sunshine @ 2022-02-22 17:16 UTC (permalink / raw)
  To: Abhradeep Chakraborty via GitGitGadget
  Cc: Git List, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Julia Lawall, Abhradeep Chakraborty

On Tue, Feb 22, 2022 at 11:27 AM Abhradeep Chakraborty via
GitGitGadget <gitgitgadget@gmail.com> wrote:
> Usage strings for git (sub)command flags has a style guide that
> suggests - first letter should not capitalized (unless requied)

s/requied/required/

> and it should skip full-stop at the end of line. But there are
> some files where usage-strings do not follow the above mentioned
> guide. Moreover, there are no checks to verify if all usage strings
> are following the guide/convention or not.
>
> Amend the usage strings that don't follow the convention/guide and
> add a check in the `parse_options_check()` function in `parse-options.c`
> to check the usage strings against the style guide.

This is a relatively minor observation, but it might make sense to
split this into two patches, the first of which fixes the offending
usage strings, and the second which adds the check to parse-options.c
to prevent more offending strings from entering the project in the
future. Anyhow, not necessarily worth a reroll.

> Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
> ---
> diff --git a/parse-options.c b/parse-options.c
> @@ -492,6 +492,15 @@ static void parse_options_check(const struct option *opts)
> +               if (opts->type != OPTION_GROUP && opts->help &&
> +                       !(starts_with(opts->help, "HEAD") ||
> +                         starts_with(opts->help, "GPG") ||
> +                         starts_with(opts->help, "DEPRECATED") ||
> +                         starts_with(opts->help, "SHA1")) &&
> +                         (opts->help[0] >= 65 && opts->help[0] <= 90))
> +                       err |= optbug(opts, xstrfmt("help should not start with capital letter unless needed: %s", opts->help));

This list of hardcoded exceptions may become a maintenance burden. I
can figure out why OPTION_GROUP is treated specially here, but why use
magic numbers 65 and 90 rather than a more obvious function like
isupper()?

Perhaps instead of hardcoding an exception list and magic numbers, we
can use a simple heuristic instead. For instance, if the first two
characters of the help string are uppercase, then assume it is an
acronym (i.e. "GPG") or special name (i.e. "HEAD"), thus allowed.
Maybe something like this:

    if (opts->type != OPTION_GROUP && opts->help &&
        opts->help[0] && isupper(opts->help[0]) &&
        !(opts->help[1] && isupper(opts->help[1])))

> +               if (opts->help && !ends_with(opts->help, "...") && ends_with(opts->help, "."))
> +                       err |= optbug(opts, xstrfmt("help should not end with a dot: %s", opts->help));

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v2] add usage-strings check and amend remaining usage strings
  2022-02-22 17:16   ` Eric Sunshine
@ 2022-02-23 11:59     ` Abhradeep Chakraborty
  2022-02-23 21:17     ` Junio C Hamano
  1 sibling, 0 replies; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-02-23 11:59 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Abhradeep Chakraborty, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Junio C Hamano, git

Eric Sunshine <sunshine@sunshineco.com> wrote:

> s/requied/required/

Thanks for pointing out. Will fix it.

> This is a relatively minor observation, but it might make sense to
> split this into two patches, the first of which fixes the offending
> usage strings, and the second which adds the check to parse-options.c
> to prevent more offending strings from entering the project in the
> future. Anyhow, not necessarily worth a reroll.

I made a single commit because both changes are small. But I think You're
right - I should split it into two.

> This list of hardcoded exceptions may become a maintenance burden. I
> can figure out why OPTION_GROUP is treated specially here, but why use
> magic numbers 65 and 90 rather than a more obvious function like
> isupper()?

Hmm, this was a quick fix came into my mind. So, I didn't look at other
(and better) options for this. Will fix it :)

> Perhaps instead of hardcoding an exception list and magic numbers, we
> can use a simple heuristic instead. For instance, if the first two
> characters of the help string are uppercase, then assume it is an
> acronym (i.e. "GPG") or special name (i.e. "HEAD"), thus allowed.
> Maybe something like this:
>
>     if (opts->type != OPTION_GROUP && opts->help &&
>         opts->help[0] && isupper(opts->help[0]) &&
>         !(opts->help[1] && isupper(opts->help[1])))

Okay, got it!

Thanks :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v3 0/2] add usage-strings ci check and amend remaining usage strings
  2022-02-22 15:58 ` [PATCH v2] add usage-strings " Abhradeep Chakraborty via GitGitGadget
  2022-02-22 17:16   ` Eric Sunshine
@ 2022-02-23 14:27   ` Abhradeep Chakraborty via GitGitGadget
  2022-02-23 14:27     ` [PATCH v3 1/2] amend remaining usage strings according to style guide Abhra303 via GitGitGadget
                       ` (2 more replies)
  1 sibling, 3 replies; 59+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-02-23 14:27 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Johannes Schindelin,
	Julia Lawall, Eric Sunshine, Abhradeep Chakraborty

This patch series completely fixes #636.

The issue is about amending the usage-strings (for command flags such as -h,
-v etc.) which do not follow the style convention/guide. There was a PR
[https://github.com/gitgitgadget/git/pull/920] addressing this issue but as
Johannes [https://github.com/dscho] said in his comment
[https://github.com/gitgitgadget/git/issues/636#issuecomment-1018660439],
there are some files that still have those kind of usage strings. Johannes
also suggested to add a CI check under ci/test-documentation.sh to check the
usage strings.

In this version, the previously single commit is split into two commits (
one addressing amending of usage strings and another is for adding the style
checks to parse_options_check()) and the checks are simplified.

Changes since v1:

 1. remove check-usage-strings.sh
 2. remove CI check
 3. add checks to parse-options.c
 4. modify t/t1502-rev-parse-parseopt.sh to pass the test

Until v1:

A shell script check-usage-strings.sh was introduced to check the
usage-strings. CI check for the same was also introduced.

Abhra303 (1):
  amend remaining usage strings according to style guide

Abhradeep Chakraborty (1):
  parse-options.c: add style checks for usage-strings

 builtin/bisect--helper.c      | 2 +-
 builtin/reflog.c              | 6 +++---
 builtin/submodule--helper.c   | 2 +-
 diff.c                        | 2 +-
 parse-options.c               | 6 ++++++
 t/helper/test-run-command.c   | 6 +++---
 t/t1502-rev-parse-parseopt.sh | 4 ++--
 7 files changed, 17 insertions(+), 11 deletions(-)


base-commit: e6ebfd0e8cbbd10878070c8a356b5ad1b3ca464e
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1147%2FAbhra303%2Fusage_command_amend-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1147/Abhra303/usage_command_amend-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1147

Range-diff vs v2:

 1:  902937e768d ! 1:  f425e36b7ea add usage-strings check  and amend remaining usage strings
     @@ Metadata
      Author: Abhra303 <chakrabortyabhradeep79@gmail.com>
      
       ## Commit message ##
     -    add usage-strings check  and amend remaining usage strings
     +    amend remaining usage strings according to style guide
      
          Usage strings for git (sub)command flags has a style guide that
     -    suggests - first letter should not capitalized (unless requied)
     +    suggests - first letter should not capitalized (unless required)
          and it should skip full-stop at the end of line. But there are
          some files where usage-strings do not follow the above mentioned
     -    guide. Moreover, there are no checks to verify if all usage strings
     -    are following the guide/convention or not.
     +    guide.
      
     -    Amend the usage strings that don't follow the convention/guide and
     -    add a check in the `parse_options_check()` function in `parse-options.c`
     -    to check the usage strings against the style guide.
     +    Amend the usage strings that don't follow the style convention/guide.
      
          Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
      
     @@ diff.c: static void prep_parse_options(struct diff_options *options)
       
       		OPT_END()
      
     - ## parse-options.c ##
     -@@ parse-options.c: static void parse_options_check(const struct option *opts)
     - 		default:
     - 			; /* ok. (usually accepts an argument) */
     - 		}
     -+		if (opts->type != OPTION_GROUP && opts->help &&
     -+			!(starts_with(opts->help, "HEAD") ||
     -+			  starts_with(opts->help, "GPG") ||
     -+			  starts_with(opts->help, "DEPRECATED") ||
     -+			  starts_with(opts->help, "SHA1")) &&
     -+			  (opts->help[0] >= 65 && opts->help[0] <= 90))
     -+			err |= optbug(opts, xstrfmt("help should not start with capital letter unless needed: %s", opts->help));
     -+		if (opts->help && !ends_with(opts->help, "...") && ends_with(opts->help, "."))
     -+			err |= optbug(opts, xstrfmt("help should not end with a dot: %s", opts->help));
     - 		if (opts->argh &&
     - 		    strcspn(opts->argh, " _") != strlen(opts->argh))
     - 			err |= optbug(opts, "multi-word argh should use dash to separate words");
     -
       ## t/helper/test-run-command.c ##
      @@ t/helper/test-run-command.c: static int quote_stress_test(int argc, const char **argv)
       	struct strbuf out = STRBUF_INIT;
     @@ t/helper/test-run-command.c: static int quote_stress_test(int argc, const char *
       		OPT_END()
       	};
       	const char * const usage[] = {
     -
     - ## t/t1502-rev-parse-parseopt.sh ##
     -@@ t/t1502-rev-parse-parseopt.sh: test_expect_success 'setup optionspec-only-hidden-switches' '
     - |
     - |some-command does foo and bar!
     - |--
     --|hidden1* A hidden switch
     -+|hidden1* a hidden switch
     - EOF
     - '
     - 
     -@@ t/t1502-rev-parse-parseopt.sh: test_expect_success 'test --parseopt help-all output hidden switches' '
     - |
     - |    some-command does foo and bar!
     - |
     --|    --hidden1             A hidden switch
     -+|    --hidden1             a hidden switch
     - |
     - |EOF
     - END_EXPECT
 -:  ----------- > 2:  9d42bdbff6c parse-options.c: add style checks for usage-strings

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v3 1/2] amend remaining usage strings according to style guide
  2022-02-23 14:27   ` [PATCH v3 0/2] add usage-strings ci " Abhradeep Chakraborty via GitGitGadget
@ 2022-02-23 14:27     ` Abhra303 via GitGitGadget
  2022-02-23 14:27     ` [PATCH v3 2/2] parse-options.c: add style checks for usage-strings Abhradeep Chakraborty via GitGitGadget
  2022-02-25  5:23     ` [PATCH v4 0/2] add usage-strings ci check and amend remaining usage strings Abhradeep Chakraborty via GitGitGadget
  2 siblings, 0 replies; 59+ messages in thread
From: Abhra303 via GitGitGadget @ 2022-02-23 14:27 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Johannes Schindelin,
	Julia Lawall, Eric Sunshine, Abhradeep Chakraborty, Abhra303

From: Abhra303 <chakrabortyabhradeep79@gmail.com>

Usage strings for git (sub)command flags has a style guide that
suggests - first letter should not capitalized (unless required)
and it should skip full-stop at the end of line. But there are
some files where usage-strings do not follow the above mentioned
guide.

Amend the usage strings that don't follow the style convention/guide.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
 builtin/bisect--helper.c    | 2 +-
 builtin/reflog.c            | 6 +++---
 builtin/submodule--helper.c | 2 +-
 diff.c                      | 2 +-
 t/helper/test-run-command.c | 6 +++---
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
index 28a2e6a5750..614d95b022c 100644
--- a/builtin/bisect--helper.c
+++ b/builtin/bisect--helper.c
@@ -1209,7 +1209,7 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
 		OPT_CMDMODE(0, "bisect-visualize", &cmdmode,
 			 N_("visualize the bisection"), BISECT_VISUALIZE),
 		OPT_CMDMODE(0, "bisect-run", &cmdmode,
-			 N_("use <cmd>... to automatically bisect."), BISECT_RUN),
+			 N_("use <cmd>... to automatically bisect"), BISECT_RUN),
 		OPT_BOOL(0, "no-log", &nolog,
 			 N_("no log for BISECT_WRITE")),
 		OPT_END()
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 85b838720c3..28372c5e2b5 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -600,7 +600,7 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
 		OPT_BIT(0, "updateref", &flags,
 			N_("update the reference to the value of the top reflog entry"),
 			EXPIRE_REFLOGS_UPDATE_REF),
-		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen.")),
+		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen")),
 		OPT_CALLBACK_F(0, "expire", &cmd, N_("timestamp"),
 			       N_("prune entries older than the specified time"),
 			       PARSE_OPT_NONEG,
@@ -613,7 +613,7 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
 			 N_("prune any reflog entries that point to broken commits")),
 		OPT_BOOL(0, "all", &do_all, N_("process the reflogs of all references")),
 		OPT_BOOL(1, "single-worktree", &all_worktrees,
-			 N_("limits processing to reflogs from the current worktree only.")),
+			 N_("limits processing to reflogs from the current worktree only")),
 		OPT_END()
 	};
 
@@ -736,7 +736,7 @@ static int cmd_reflog_delete(int argc, const char **argv, const char *prefix)
 		OPT_BIT(0, "updateref", &flags,
 			N_("update the reference to the value of the top reflog entry"),
 			EXPIRE_REFLOGS_UPDATE_REF),
-		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen.")),
+		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen")),
 		OPT_END()
 	};
 
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 33c82c3ab91..6332d305983 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1875,7 +1875,7 @@ static int module_clone(int argc, const char **argv, const char *prefix)
 		OPT_STRING(0, "depth", &clone_data.depth,
 			   N_("string"),
 			   N_("depth for shallow clones")),
-		OPT__QUIET(&quiet, "Suppress output for cloning a submodule"),
+		OPT__QUIET(&quiet, "suppress output for cloning a submodule"),
 		OPT_BOOL(0, "progress", &progress,
 			   N_("force cloning progress")),
 		OPT_BOOL(0, "require-init", &require_init,
diff --git a/diff.c b/diff.c
index 7d5cfd325ea..387435a4a45 100644
--- a/diff.c
+++ b/diff.c
@@ -5630,7 +5630,7 @@ static void prep_parse_options(struct diff_options *options)
 			       N_("select files by diff type"),
 			       PARSE_OPT_NONEG, diff_opt_diff_filter),
 		{ OPTION_CALLBACK, 0, "output", options, N_("<file>"),
-		  N_("Output to a specific file"),
+		  N_("output to a specific file"),
 		  PARSE_OPT_NONEG, NULL, 0, diff_opt_output },
 
 		OPT_END()
diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
index 913775a14b7..8f370cd89f1 100644
--- a/t/helper/test-run-command.c
+++ b/t/helper/test-run-command.c
@@ -221,9 +221,9 @@ static int quote_stress_test(int argc, const char **argv)
 	struct strbuf out = STRBUF_INIT;
 	struct strvec args = STRVEC_INIT;
 	struct option options[] = {
-		OPT_INTEGER('n', "trials", &trials, "Number of trials"),
-		OPT_INTEGER('s', "skip", &skip, "Skip <n> trials"),
-		OPT_BOOL('m', "msys2", &msys2, "Test quoting for MSYS2's sh"),
+		OPT_INTEGER('n', "trials", &trials, "number of trials"),
+		OPT_INTEGER('s', "skip", &skip, "skip <n> trials"),
+		OPT_BOOL('m', "msys2", &msys2, "test quoting for MSYS2's sh"),
 		OPT_END()
 	};
 	const char * const usage[] = {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 2/2] parse-options.c: add style checks for usage-strings
  2022-02-23 14:27   ` [PATCH v3 0/2] add usage-strings ci " Abhradeep Chakraborty via GitGitGadget
  2022-02-23 14:27     ` [PATCH v3 1/2] amend remaining usage strings according to style guide Abhra303 via GitGitGadget
@ 2022-02-23 14:27     ` Abhradeep Chakraborty via GitGitGadget
  2022-02-25  5:23     ` [PATCH v4 0/2] add usage-strings ci check and amend remaining usage strings Abhradeep Chakraborty via GitGitGadget
  2 siblings, 0 replies; 59+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-02-23 14:27 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Johannes Schindelin,
	Julia Lawall, Eric Sunshine, Abhradeep Chakraborty,
	Abhradeep Chakraborty

From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

`parse-options.c` doesn't check if the usage strings for option flags
are following the style guide or not. Style convention says, usage
strings should not start with capital letter (unless needed) and
it should not end with `.`.

Add checks to the `parse_options_check()` function to check usage
strings against the style convention.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
 parse-options.c               | 6 ++++++
 t/t1502-rev-parse-parseopt.sh | 4 ++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/parse-options.c b/parse-options.c
index 2437ad3bcdd..eb92290a63a 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -492,6 +492,12 @@ static void parse_options_check(const struct option *opts)
 		default:
 			; /* ok. (usually accepts an argument) */
 		}
+		if (opts->type != OPTION_GROUP && opts->help &&
+			opts->help[0] && isupper(opts->help[0]) &&
+			!(opts->help[1] && isupper(opts->help[1])))
+			err |= optbug(opts, xstrfmt("help should not start with capital letter unless needed: %s", opts->help));
+		if (opts->help && !ends_with(opts->help, "...") && ends_with(opts->help, "."))
+			err |= optbug(opts, xstrfmt("help should not end with a dot: %s", opts->help));
 		if (opts->argh &&
 		    strcspn(opts->argh, " _") != strlen(opts->argh))
 			err |= optbug(opts, "multi-word argh should use dash to separate words");
diff --git a/t/t1502-rev-parse-parseopt.sh b/t/t1502-rev-parse-parseopt.sh
index 284fe18e726..2a07e130b96 100755
--- a/t/t1502-rev-parse-parseopt.sh
+++ b/t/t1502-rev-parse-parseopt.sh
@@ -53,7 +53,7 @@ test_expect_success 'setup optionspec-only-hidden-switches' '
 |
 |some-command does foo and bar!
 |--
-|hidden1* A hidden switch
+|hidden1* a hidden switch
 EOF
 '
 
@@ -131,7 +131,7 @@ test_expect_success 'test --parseopt help-all output hidden switches' '
 |
 |    some-command does foo and bar!
 |
-|    --hidden1             A hidden switch
+|    --hidden1             a hidden switch
 |
 |EOF
 END_EXPECT
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH v2] add usage-strings check and amend remaining usage strings
  2022-02-22 17:16   ` Eric Sunshine
  2022-02-23 11:59     ` Abhradeep Chakraborty
@ 2022-02-23 21:17     ` Junio C Hamano
  2022-02-23 21:20       ` Eric Sunshine
  2022-02-24  6:26       ` Abhradeep Chakraborty
  1 sibling, 2 replies; 59+ messages in thread
From: Junio C Hamano @ 2022-02-23 21:17 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Abhradeep Chakraborty via GitGitGadget, Git List,
	Ævar Arnfjörð Bjarmason, Johannes Schindelin,
	Julia Lawall, Abhradeep Chakraborty

Eric Sunshine <sunshine@sunshineco.com> writes:

> This is a relatively minor observation, but it might make sense to
> split this into two patches, the first of which fixes the offending
> usage strings, and the second which adds the check to parse-options.c
> to prevent more offending strings from entering the project in the
> future.

Yeah, that sounds like a quite sensible split.

I notice that the real-looking name

>> Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

does not match with the in-body "From:" that has less real-looking.
Please fix the in-body "From:" if this is rerolled so that both
mention the same "Human Readable Name <email@add.re.ss>".

>> ---
>> diff --git a/parse-options.c b/parse-options.c
>> @@ -492,6 +492,15 @@ static void parse_options_check(const struct option *opts)
>> +               if (opts->type != OPTION_GROUP && opts->help &&
>> +                       !(starts_with(opts->help, "HEAD") ||
>> +                         starts_with(opts->help, "GPG") ||
>> +                         starts_with(opts->help, "DEPRECATED") ||
>> +                         starts_with(opts->help, "SHA1")) &&
>> +                         (opts->help[0] >= 65 && opts->help[0] <= 90))
>> +                       err |= optbug(opts, xstrfmt("help should not start with capital letter unless needed: %s", opts->help));
>
> This list of hardcoded exceptions may become a maintenance burden. I
> can figure out why OPTION_GROUP is treated specially here, but why use
> magic numbers 65 and 90 rather than a more obvious function like
> isupper()?
>
> Perhaps instead of hardcoding an exception list and magic numbers, we
> can use a simple heuristic instead. For instance, if the first two
> characters of the help string are uppercase, then assume it is an
> acronym (i.e. "GPG") or special name (i.e. "HEAD"), thus allowed.
> Maybe something like this:
>
>     if (opts->type != OPTION_GROUP && opts->help &&
>         opts->help[0] && isupper(opts->help[0]) &&
>         !(opts->help[1] && isupper(opts->help[1])))
>

Much better than what was posted, but such a heuristic deserves some
in-code comment to check why we see the first two.

>> +               if (opts->help && !ends_with(opts->help, "...") && ends_with(opts->help, "."))
>> +                       err |= optbug(opts, xstrfmt("help should not end with a dot: %s", opts->help));

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v2] add usage-strings check and amend remaining usage strings
  2022-02-23 21:17     ` Junio C Hamano
@ 2022-02-23 21:20       ` Eric Sunshine
  2022-02-24  6:26       ` Abhradeep Chakraborty
  1 sibling, 0 replies; 59+ messages in thread
From: Eric Sunshine @ 2022-02-23 21:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty via GitGitGadget, Git List,
	Ævar Arnfjörð Bjarmason, Johannes Schindelin,
	Julia Lawall, Abhradeep Chakraborty

On Wed, Feb 23, 2022 at 4:17 PM Junio C Hamano <gitster@pobox.com> wrote:
> Eric Sunshine <sunshine@sunshineco.com> writes:
> >> +               if (opts->type != OPTION_GROUP && opts->help &&
> >> +                       !(starts_with(opts->help, "HEAD") ||
> >> +                         starts_with(opts->help, "GPG") ||
> >> +                         starts_with(opts->help, "DEPRECATED") ||
> >> +                         starts_with(opts->help, "SHA1")) &&
> >> +                         (opts->help[0] >= 65 && opts->help[0] <= 90))
> >
> > This list of hardcoded exceptions may become a maintenance burden. I
> > can figure out why OPTION_GROUP is treated specially here, but why use
> > magic numbers 65 and 90 rather than a more obvious function like
> > isupper()?
> >
> > Perhaps instead of hardcoding an exception list and magic numbers, we
> > can use a simple heuristic instead. For instance, if the first two
> > characters of the help string are uppercase, then assume it is an
> > acronym (i.e. "GPG") or special name (i.e. "HEAD"), thus allowed.
> > Maybe something like this:
> >
> >     if (opts->type != OPTION_GROUP && opts->help &&
> >         opts->help[0] && isupper(opts->help[0]) &&
> >         !(opts->help[1] && isupper(opts->help[1])))
>
> Much better than what was posted, but such a heuristic deserves some
> in-code comment to check why we see the first two.

Yes, I had the same thought as soon as I walked away from the computer
and was going to post a follow-up email to say as much but got
distracted by other things and never got around to it. Thanks for
filling in the gap.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v2] add usage-strings check and amend remaining usage strings
  2022-02-23 21:17     ` Junio C Hamano
  2022-02-23 21:20       ` Eric Sunshine
@ 2022-02-24  6:26       ` Abhradeep Chakraborty
  1 sibling, 0 replies; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-02-24  6:26 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty, Eric Sunshine,
	Ævar Arnfjörð Bjarmason, Johannes Schindelin, git

Junio C Hamano <gitster@pobox.com> wrote:

> I notice that the real-looking name
>
>>> Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
>
> does not match with the in-body "From:" that has less real-looking.
> Please fix the in-body "From:" if this is rerolled so that both
> mention the same "Human Readable Name <email@add.re.ss>".

Okay, fixing it. Unfortunately your review came after sending the third
version[1], so this is not fixed in the newest version. But could you
review the newest version of this patch series so that I can make all the
suggested changes in one go?

> Much better than what was posted, but such a heuristic deserves some
> in-code comment to check why we see the first two.

Okay, will add comments to it.

Thanks :)

[1] https://lore.kernel.org/git/pull.1147.v3.git.1645626455.gitgitgadget@gmail.com/

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v4 0/2] add usage-strings ci check and amend remaining usage strings
  2022-02-23 14:27   ` [PATCH v3 0/2] add usage-strings ci " Abhradeep Chakraborty via GitGitGadget
  2022-02-23 14:27     ` [PATCH v3 1/2] amend remaining usage strings according to style guide Abhra303 via GitGitGadget
  2022-02-23 14:27     ` [PATCH v3 2/2] parse-options.c: add style checks for usage-strings Abhradeep Chakraborty via GitGitGadget
@ 2022-02-25  5:23     ` Abhradeep Chakraborty via GitGitGadget
  2022-02-25  5:23       ` [PATCH v4 1/2] amend remaining usage strings according to style guide Abhradeep Chakraborty via GitGitGadget
  2022-02-25  5:23       ` [PATCH v4 2/2] parse-options.c: add style checks for usage-strings Abhradeep Chakraborty via GitGitGadget
  2 siblings, 2 replies; 59+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-02-25  5:23 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Johannes Schindelin,
	Julia Lawall, Eric Sunshine, Junio C Hamano,
	Abhradeep Chakraborty

This patch series completely fixes #636.

The issue is about amending the usage-strings (for command flags such as -h,
-v etc.) which do not follow the style convention/guide. There was a PR
[https://github.com/gitgitgadget/git/pull/920] addressing this issue but as
Johannes [https://github.com/dscho] said in his comment
[https://github.com/gitgitgadget/git/issues/636#issuecomment-1018660439],
there are some files that still have those kind of usage strings. Johannes
also suggested to add a CI check under ci/test-documentation.sh to check the
usage strings.

In this version, comments added and the From field of the first commit
message is updated (i.e. "Abhradeep Chakraborty" instead of "Abhra303")

Changes since v2:

 1. split the single commit into two logically separated commits ( one
    addressing amending of usage strings and another is for adding the style
    checks to parse_options_check())
 2. the checks are simplified.

Changes since v1:

 1. remove check-usage-strings.sh
 2. remove CI check
 3. add checks to parse-options.c
 4. modify t/t1502-rev-parse-parseopt.sh to pass the test

Until v1:

A shell script check-usage-strings.sh was introduced to check the
usage-strings. CI check for the same was also introduced.

Abhradeep Chakraborty (2):
  amend remaining usage strings according to style guide
  parse-options.c: add style checks for usage-strings

 builtin/bisect--helper.c      |  2 +-
 builtin/reflog.c              |  6 +++---
 builtin/submodule--helper.c   |  2 +-
 diff.c                        |  2 +-
 parse-options.c               | 11 +++++++++++
 t/helper/test-run-command.c   |  6 +++---
 t/t1502-rev-parse-parseopt.sh |  4 ++--
 7 files changed, 22 insertions(+), 11 deletions(-)


base-commit: e6ebfd0e8cbbd10878070c8a356b5ad1b3ca464e
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1147%2FAbhra303%2Fusage_command_amend-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1147/Abhra303/usage_command_amend-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1147

Range-diff vs v3:

 1:  f425e36b7ea ! 1:  dc200d098ae amend remaining usage strings according to style guide
     @@
       ## Metadata ##
     -Author: Abhra303 <chakrabortyabhradeep79@gmail.com>
     +Author: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
      
       ## Commit message ##
          amend remaining usage strings according to style guide
 2:  9d42bdbff6c ! 2:  e1c5a325826 parse-options.c: add style checks for usage-strings
     @@ parse-options.c: static void parse_options_check(const struct option *opts)
       		default:
       			; /* ok. (usually accepts an argument) */
       		}
     ++
     ++		// OPTION_GROUP should be ignored
     ++		// if the first two characters of the help string are uppercase, then assume it is an
     ++		// acronym (i.e. "GPG") or special name (i.e. "HEAD"), thus allowed.
     ++		// else assume the usage string is violating the style convention and throw error.
      +		if (opts->type != OPTION_GROUP && opts->help &&
      +			opts->help[0] && isupper(opts->help[0]) &&
      +			!(opts->help[1] && isupper(opts->help[1])))

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v4 1/2] amend remaining usage strings according to style guide
  2022-02-25  5:23     ` [PATCH v4 0/2] add usage-strings ci check and amend remaining usage strings Abhradeep Chakraborty via GitGitGadget
@ 2022-02-25  5:23       ` Abhradeep Chakraborty via GitGitGadget
  2022-02-25  5:23       ` [PATCH v4 2/2] parse-options.c: add style checks for usage-strings Abhradeep Chakraborty via GitGitGadget
  1 sibling, 0 replies; 59+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-02-25  5:23 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Johannes Schindelin,
	Julia Lawall, Eric Sunshine, Junio C Hamano,
	Abhradeep Chakraborty, Abhradeep Chakraborty

From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

Usage strings for git (sub)command flags has a style guide that
suggests - first letter should not capitalized (unless required)
and it should skip full-stop at the end of line. But there are
some files where usage-strings do not follow the above mentioned
guide.

Amend the usage strings that don't follow the style convention/guide.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
 builtin/bisect--helper.c    | 2 +-
 builtin/reflog.c            | 6 +++---
 builtin/submodule--helper.c | 2 +-
 diff.c                      | 2 +-
 t/helper/test-run-command.c | 6 +++---
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
index 28a2e6a5750..614d95b022c 100644
--- a/builtin/bisect--helper.c
+++ b/builtin/bisect--helper.c
@@ -1209,7 +1209,7 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
 		OPT_CMDMODE(0, "bisect-visualize", &cmdmode,
 			 N_("visualize the bisection"), BISECT_VISUALIZE),
 		OPT_CMDMODE(0, "bisect-run", &cmdmode,
-			 N_("use <cmd>... to automatically bisect."), BISECT_RUN),
+			 N_("use <cmd>... to automatically bisect"), BISECT_RUN),
 		OPT_BOOL(0, "no-log", &nolog,
 			 N_("no log for BISECT_WRITE")),
 		OPT_END()
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 85b838720c3..28372c5e2b5 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -600,7 +600,7 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
 		OPT_BIT(0, "updateref", &flags,
 			N_("update the reference to the value of the top reflog entry"),
 			EXPIRE_REFLOGS_UPDATE_REF),
-		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen.")),
+		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen")),
 		OPT_CALLBACK_F(0, "expire", &cmd, N_("timestamp"),
 			       N_("prune entries older than the specified time"),
 			       PARSE_OPT_NONEG,
@@ -613,7 +613,7 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
 			 N_("prune any reflog entries that point to broken commits")),
 		OPT_BOOL(0, "all", &do_all, N_("process the reflogs of all references")),
 		OPT_BOOL(1, "single-worktree", &all_worktrees,
-			 N_("limits processing to reflogs from the current worktree only.")),
+			 N_("limits processing to reflogs from the current worktree only")),
 		OPT_END()
 	};
 
@@ -736,7 +736,7 @@ static int cmd_reflog_delete(int argc, const char **argv, const char *prefix)
 		OPT_BIT(0, "updateref", &flags,
 			N_("update the reference to the value of the top reflog entry"),
 			EXPIRE_REFLOGS_UPDATE_REF),
-		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen.")),
+		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen")),
 		OPT_END()
 	};
 
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 33c82c3ab91..6332d305983 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1875,7 +1875,7 @@ static int module_clone(int argc, const char **argv, const char *prefix)
 		OPT_STRING(0, "depth", &clone_data.depth,
 			   N_("string"),
 			   N_("depth for shallow clones")),
-		OPT__QUIET(&quiet, "Suppress output for cloning a submodule"),
+		OPT__QUIET(&quiet, "suppress output for cloning a submodule"),
 		OPT_BOOL(0, "progress", &progress,
 			   N_("force cloning progress")),
 		OPT_BOOL(0, "require-init", &require_init,
diff --git a/diff.c b/diff.c
index 7d5cfd325ea..387435a4a45 100644
--- a/diff.c
+++ b/diff.c
@@ -5630,7 +5630,7 @@ static void prep_parse_options(struct diff_options *options)
 			       N_("select files by diff type"),
 			       PARSE_OPT_NONEG, diff_opt_diff_filter),
 		{ OPTION_CALLBACK, 0, "output", options, N_("<file>"),
-		  N_("Output to a specific file"),
+		  N_("output to a specific file"),
 		  PARSE_OPT_NONEG, NULL, 0, diff_opt_output },
 
 		OPT_END()
diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
index 913775a14b7..8f370cd89f1 100644
--- a/t/helper/test-run-command.c
+++ b/t/helper/test-run-command.c
@@ -221,9 +221,9 @@ static int quote_stress_test(int argc, const char **argv)
 	struct strbuf out = STRBUF_INIT;
 	struct strvec args = STRVEC_INIT;
 	struct option options[] = {
-		OPT_INTEGER('n', "trials", &trials, "Number of trials"),
-		OPT_INTEGER('s', "skip", &skip, "Skip <n> trials"),
-		OPT_BOOL('m', "msys2", &msys2, "Test quoting for MSYS2's sh"),
+		OPT_INTEGER('n', "trials", &trials, "number of trials"),
+		OPT_INTEGER('s', "skip", &skip, "skip <n> trials"),
+		OPT_BOOL('m', "msys2", &msys2, "test quoting for MSYS2's sh"),
 		OPT_END()
 	};
 	const char * const usage[] = {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-02-25  5:23     ` [PATCH v4 0/2] add usage-strings ci check and amend remaining usage strings Abhradeep Chakraborty via GitGitGadget
  2022-02-25  5:23       ` [PATCH v4 1/2] amend remaining usage strings according to style guide Abhradeep Chakraborty via GitGitGadget
@ 2022-02-25  5:23       ` Abhradeep Chakraborty via GitGitGadget
  2022-02-25  6:13         ` Junio C Hamano
  2022-02-25 15:36         ` Johannes Schindelin
  1 sibling, 2 replies; 59+ messages in thread
From: Abhradeep Chakraborty via GitGitGadget @ 2022-02-25  5:23 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Johannes Schindelin,
	Julia Lawall, Eric Sunshine, Junio C Hamano,
	Abhradeep Chakraborty, Abhradeep Chakraborty

From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>

`parse-options.c` doesn't check if the usage strings for option flags
are following the style guide or not. Style convention says, usage
strings should not start with capital letter (unless needed) and
it should not end with `.`.

Add checks to the `parse_options_check()` function to check usage
strings against the style convention.

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
 parse-options.c               | 11 +++++++++++
 t/t1502-rev-parse-parseopt.sh |  4 ++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/parse-options.c b/parse-options.c
index 2437ad3bcdd..acd9ddbb372 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -492,6 +492,17 @@ static void parse_options_check(const struct option *opts)
 		default:
 			; /* ok. (usually accepts an argument) */
 		}
+
+		// OPTION_GROUP should be ignored
+		// if the first two characters of the help string are uppercase, then assume it is an
+		// acronym (i.e. "GPG") or special name (i.e. "HEAD"), thus allowed.
+		// else assume the usage string is violating the style convention and throw error.
+		if (opts->type != OPTION_GROUP && opts->help &&
+			opts->help[0] && isupper(opts->help[0]) &&
+			!(opts->help[1] && isupper(opts->help[1])))
+			err |= optbug(opts, xstrfmt("help should not start with capital letter unless needed: %s", opts->help));
+		if (opts->help && !ends_with(opts->help, "...") && ends_with(opts->help, "."))
+			err |= optbug(opts, xstrfmt("help should not end with a dot: %s", opts->help));
 		if (opts->argh &&
 		    strcspn(opts->argh, " _") != strlen(opts->argh))
 			err |= optbug(opts, "multi-word argh should use dash to separate words");
diff --git a/t/t1502-rev-parse-parseopt.sh b/t/t1502-rev-parse-parseopt.sh
index 284fe18e726..2a07e130b96 100755
--- a/t/t1502-rev-parse-parseopt.sh
+++ b/t/t1502-rev-parse-parseopt.sh
@@ -53,7 +53,7 @@ test_expect_success 'setup optionspec-only-hidden-switches' '
 |
 |some-command does foo and bar!
 |--
-|hidden1* A hidden switch
+|hidden1* a hidden switch
 EOF
 '
 
@@ -131,7 +131,7 @@ test_expect_success 'test --parseopt help-all output hidden switches' '
 |
 |    some-command does foo and bar!
 |
-|    --hidden1             A hidden switch
+|    --hidden1             a hidden switch
 |
 |EOF
 END_EXPECT
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-02-25  5:23       ` [PATCH v4 2/2] parse-options.c: add style checks for usage-strings Abhradeep Chakraborty via GitGitGadget
@ 2022-02-25  6:13         ` Junio C Hamano
  2022-02-25  8:08           ` Abhradeep Chakraborty
  2022-02-25 15:36         ` Johannes Schindelin
  1 sibling, 1 reply; 59+ messages in thread
From: Junio C Hamano @ 2022-02-25  6:13 UTC (permalink / raw)
  To: Abhradeep Chakraborty via GitGitGadget
  Cc: git, Ævar Arnfjörð Bjarmason, Johannes Schindelin,
	Julia Lawall, Eric Sunshine, Abhradeep Chakraborty

"Abhradeep Chakraborty via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> +
> +		// OPTION_GROUP should be ignored
> +		// if the first two characters of the help string are uppercase, then assume it is an
> +		// acronym (i.e. "GPG") or special name (i.e. "HEAD"), thus allowed.
> +		// else assume the usage string is violating the style convention and throw error.

Style.

	/*
	 * This is how our multi-line comments
         * look like; with slash-asterisk that opens
         * and asterisk-slash that closes one on their
         * own lines.
	 */

Also avoid overly long lines.

> +		if (opts->type != OPTION_GROUP && opts->help &&
> +			opts->help[0] && isupper(opts->help[0]) &&
> +			!(opts->help[1] && isupper(opts->help[1])))
> +			err |= optbug(opts, xstrfmt("help should not start with capital letter unless needed: %s", opts->help));
> +		if (opts->help && !ends_with(opts->help, "...") && ends_with(opts->help, "."))
> +			err |= optbug(opts, xstrfmt("help should not end with a dot: %s", opts->help));

These two calls to optbug() use xstrfmt() to grab allocated pieces
of memory and pass it as a parameter to the function, which means
the string is leaked without any chance to be freed.

Do we care?

>  		if (opts->argh &&
>  		    strcspn(opts->argh, " _") != strlen(opts->argh))
>  			err |= optbug(opts, "multi-word argh should use dash to separate words");

The existing use of optbug() we see here does not share such a
problem.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-02-25  6:13         ` Junio C Hamano
@ 2022-02-25  8:08           ` Abhradeep Chakraborty
  2022-02-25 17:06             ` Junio C Hamano
  0 siblings, 1 reply; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-02-25  8:08 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty, Eric Sunshine,
	Ævar Arnfjörð Bjarmason, Johannes Schindelin, git

Junio C Hamano wrote:

> Style.
>
>	/*
>        * This is how our multi-line comments
>        * look like; with slash-asterisk that opens
>        * and asterisk-slash that closes one on their
>        * own lines.
>	 */
>
> Also avoid overly long lines.

Oh, sorry for that. I was in kind of a hurry ( today was
my semester exam), so I didn't look at the style guide.
Will fix it.

> These two calls to optbug() use xstrfmt() to grab allocated pieces
> of memory and pass it as a parameter to the function, which means
> the string is leaked without any chance to be freed.
>
> Do we care?
>
> >  		if (opts->argh &&
> >  		    strcspn(opts->argh, " _") != strlen(opts->argh))
> >  			err |= optbug(opts, "multi-word argh should use dash to separate words");
>
> The existing use of optbug() we see here does not share such a
> problem.

hmm, I wanted a formatting function to format (i.e. pass the
`opt->help` dynamically) the output string. The existing use of
`optbug()` that you specified has no `%s` formatter; it is a plain
string. That's why I used `xstrfmt()`. Moreover, it was in Ævar's
suggestion[1] -

> +		if (opts->help && ends_with(opts->help, "."))
> +			err |= optbug(opts, xstrfmt("argh should not end with a dot: %s", opts->help));

But I think, you're right. There is some memory leakage here.
Should I go with plain strings then? (i.e. "help should not end
with a dot" instead of `xstrfmt("help should not end with a dot:
%s", opts->help)`)

Thanks :)

[1] https://lore.kernel.org/git/220221.86tucsb4oy.gmgdl@evledraar.gmail.com/

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [cocci] [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-22 13:42         ` Julia Lawall
  2022-02-22 14:03           ` Abhradeep Chakraborty
  2022-02-22 15:47           ` Abhradeep Chakraborty
@ 2022-02-25 15:03           ` Johannes Schindelin
  2022-02-25 15:36             ` Julia Lawall
  2022-02-25 16:28             ` Ævar Arnfjörð Bjarmason
  2 siblings, 2 replies; 59+ messages in thread
From: Johannes Schindelin @ 2022-02-25 15:03 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Ævar Arnfjörð Bjarmason, Abhradeep Chakraborty,
	git, cocci

[-- Attachment #1: Type: text/plain, Size: 1679 bytes --]

Hi Julia,

On Tue, 22 Feb 2022, Julia Lawall wrote:

> [I]f there are some cases that are useful to do statically, with only
> local information, then using Coccinelle could be useful to get the
> problem out of the way once and for all.  Coccinelle doesn't support
> much processing of strings directly, but you can always write some
> python code to test the contents of a string and to create a new one.
>
> Let me know if you want to try this.  You can also check, eg the demo
> demos/pythontococci.cocci to see how to create code in a python script and
> then use it in a normal SmPL rule.
>
> If some context has to be taken into account and the context in the same
> function, then that can also be done with Coccinelle, eg
>
> A
> ...
> B
>
> matches the case where after an A there is a B on all execution paths
> (except perhaps those that end in an error exit) and
>
> A
> ... when exists
> B
>
> matches the case where there is a B sometime after executing A, even if
> that does not always occur.
>
> If the context that you are interested in is in a called function or is in
> the calling context, then Coccinelle might not be the ideal choice.
> Coccinelle works on one function at a time, so to do anything
> interprocedural, you have to do some hacks.

Right. The code in question is not actually calling a function, but a
macro, and passes a literal string to the macro that we would want to
check statically.

I did have my doubts that it would be easy with Coccinelle, but since Ævar
seemed so confident, I tried it, struggled, and decided to follow up with
you.

Thank you for confirming my suspicion!
Johannes

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-22 15:47           ` Abhradeep Chakraborty
@ 2022-02-25 15:30             ` Johannes Schindelin
  2022-02-25 16:16               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 59+ messages in thread
From: Johannes Schindelin @ 2022-02-25 15:30 UTC (permalink / raw)
  To: Abhradeep Chakraborty
  Cc: Julia Lawall, Ævar Arnfjörð Bjarmason, git

Hi,

On Tue, 22 Feb 2022, Abhradeep Chakraborty wrote:

> Julia Lawall wrote:
>
> > Of there are some cases that are useful to do statically, with only local
> > information, then using Coccinelle could be useful to get the problem out
> > of the way once and for all.  Coccinelle doesn't support much processing
> > of strings directly, but you can always write some python code to test the
> > contents of a string and to create a new one.
> >
> > Let me know if you want to try this.  You can also check, eg the demo
> > demos/pythontococci.cocci to see how to create code in a python script and
> > then use it in a normal SmPL rule.
> > ...
> > If the context that you are interested in is in a called function or is in
> > the calling context, then Coccinelle might not be the ideal choice.
> > Coccinelle works on one function at a time, so to do anything
> > interprocedural, you have to do some hacks.
>
> Though in this case, `parse-options.c check` method is better [...]

I fear that this is incorrect.

In general, it is my experience that it is a mistake any time a static
check is replaced by a runtime check.

I was ready to let it slide in this instance, but in this case I now have
proof that the `parse-options.c` check is worse than the originally
suggested `sed` chain.

That concrete proof is in the output of
https://github.com/git/git/actions/runs/1890665968, where the combination
of `ac/usage-string-fixups` and `jh/builtin-fsmonitor-part2` causes many,
many failures, but all of those failures have the same root cause: the
runtime check.

With the original `check-usage-strings.sh`, the user inspecting any
failure would see precisely what the issue is, in the `static-analysis`
job's logs. It would display something like this:

	HEAD:builtin/fsmonitor--daemon.c:1507:          N_("Max seconds to wait for background daemon startup")),

With v4 of the patch series, it does not spell out anything in
`static-analysis`. Instead, it causes 8 separate jobs to fail,
it causes failures not only in `t0012-help.sh` but also in
`t7519-status-fsmonitor.sh` and in `t7527-builtin-fsmonitor.sh`.

The purpose of t7519 and t7527 is _not_ to verify those usage strings,
though.

The worst part? Look at the relevant output of t0012 (see
https://github.com/git/git/runs/5312844492?check_suite_focus=true#step:5:4902):

	[...]
	++ git -C sub fsmonitor--daemon -h
	++ exit_code=128
	++ test 128 = 129
	++ echo 'test_expect_code: command exited with 128, we wanted 129 git -C sub fsmonitor--daemon -h'
	test_expect_code: command exited with 128, we wanted 129 git -C sub fsmonitor--daemon -h
	++ return 1
	error: last command exited with $?=1
	not ok 81 - fsmonitor--daemon can handle -h
	[...]

Do you see what usage string caused the failure? You can't. And that's
even by design:

	(
		GIT_CEILING_DIRECTORIES=$(pwd) &&
		export GIT_CEILING_DIRECTORIES &&
		test_expect_code 129 git -C sub $builtin -h >output 2>&1
	) &&
	test_i18ngrep usage output

The output is redirected, and since the runtime check added to
`parse-options.c` causes the exit code to be different from the expected
one, the output is never shown.

Arguably the most important job of a regression test is to help software
engineers to diagnose and fix the regression. As quickly and as
conveniently as possible. That means that it is not enough to point out
that there is a regression, the output should be as helpful and concise as
possible to facilitate fixing the problem.

In the above-mentioned case, it was neither as helpful nor as concise as
possible because in the test case that was supposed to identify the
problem, the actual error message was swallowed, and instead of causing
one test failure, it caused a whopping 42 test cases to fail (some of
which even show the error message, but that's not even the purpose of
those test cases).

Since the entire point of this here patch series is to help enforce Git's
rules regarding usage strings, it should expect things like the issue with
`fsmonitor--daemon` _and_ make it as painless to address such an issue.

I am afraid that I have to NAK the `parse-options.c` approach because v1
of this patch series did so much better a job.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [cocci] [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-25 15:03           ` [cocci] " Johannes Schindelin
@ 2022-02-25 15:36             ` Julia Lawall
  2022-02-25 16:28             ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 59+ messages in thread
From: Julia Lawall @ 2022-02-25 15:36 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Ævar Arnfjörð Bjarmason, Abhradeep Chakraborty,
	git, cocci

[-- Attachment #1: Type: text/plain, Size: 2399 bytes --]



On Fri, 25 Feb 2022, Johannes Schindelin wrote:

> Hi Julia,
>
> On Tue, 22 Feb 2022, Julia Lawall wrote:
>
> > [I]f there are some cases that are useful to do statically, with only
> > local information, then using Coccinelle could be useful to get the
> > problem out of the way once and for all.  Coccinelle doesn't support
> > much processing of strings directly, but you can always write some
> > python code to test the contents of a string and to create a new one.
> >
> > Let me know if you want to try this.  You can also check, eg the demo
> > demos/pythontococci.cocci to see how to create code in a python script and
> > then use it in a normal SmPL rule.
> >
> > If some context has to be taken into account and the context in the same
> > function, then that can also be done with Coccinelle, eg
> >
> > A
> > ...
> > B
> >
> > matches the case where after an A there is a B on all execution paths
> > (except perhaps those that end in an error exit) and
> >
> > A
> > ... when exists
> > B
> >
> > matches the case where there is a B sometime after executing A, even if
> > that does not always occur.
> >
> > If the context that you are interested in is in a called function or is in
> > the calling context, then Coccinelle might not be the ideal choice.
> > Coccinelle works on one function at a time, so to do anything
> > interprocedural, you have to do some hacks.
>
> Right. The code in question is not actually calling a function, but a
> macro, and passes a literal string to the macro that we would want to
> check statically.

Coccinelle doesn't care about whether a function is called or whether a
macro is called.  It considers everything to be a function.

>
> I did have my doubts that it would be easy with Coccinelle, but since Ævar
> seemed so confident, I tried it, struggled, and decided to follow up with
> you.

Something like this:

@r1@
expression e;
@@

N(e)

@script:python s@
e << r1.e;
replacement;
@@

if string_ok e
then cocci.include_match(False)
else coccinelle.replacement = "\"better string\""

@@
expression r1.e;
expression s.replacement;
@@
- N(e)
+ N(replacement)

------------------

You can fill in the definition of string_ok and better string.  I think
the \" will be necessary, because the value of an expression metavariable
at the python level is a string, so there should be a string inside of it
to make it a string expression.

julia

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-02-25  5:23       ` [PATCH v4 2/2] parse-options.c: add style checks for usage-strings Abhradeep Chakraborty via GitGitGadget
  2022-02-25  6:13         ` Junio C Hamano
@ 2022-02-25 15:36         ` Johannes Schindelin
  2022-02-25 16:01           ` Abhradeep Chakraborty
  2022-02-26  1:36           ` Junio C Hamano
  1 sibling, 2 replies; 59+ messages in thread
From: Johannes Schindelin @ 2022-02-25 15:36 UTC (permalink / raw)
  To: Abhradeep Chakraborty via GitGitGadget
  Cc: git, Ævar Arnfjörð Bjarmason, Julia Lawall,
	Eric Sunshine, Junio C Hamano, Abhradeep Chakraborty,
	Abhradeep Chakraborty

Hi Abhradeep,

On Fri, 25 Feb 2022, Abhradeep Chakraborty via GitGitGadget wrote:

> From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
>
> `parse-options.c` doesn't check if the usage strings for option flags
> are following the style guide or not. Style convention says, usage
> strings should not start with capital letter (unless needed) and
> it should not end with `.`.
>
> Add checks to the `parse_options_check()` function to check usage
> strings against the style convention.

As I just pointed out in
https://lore.kernel.org/git/nycvar.QRO.7.76.6.2202251600210.11118@tvgsbejvaqbjf.bet/,
it seems that replacing the static check presented in v1 by a runtime
check needs to be reverted.

In addition to the example I presented, there is another compelling reason
to do so: with the static check, we can detect incorrect usage strings in
all code, even in code that is platform-dependent (such as in
`fsmonitor--daemon`).

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-02-25 15:36         ` Johannes Schindelin
@ 2022-02-25 16:01           ` Abhradeep Chakraborty
  2022-02-26  1:36           ` Junio C Hamano
  1 sibling, 0 replies; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-02-25 16:01 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Abhradeep Chakraborty, Eric Sunshine,
	Ævar Arnfjörð Bjarmason, Junio C Hamano,
	Julia Lawall, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

> As I just pointed out in
> https://lore.kernel.org/git/nycvar.QRO.7.76.6.2202251600210.11118@tvgsbejv=
> aqbjf.bet/,
> it seems that replacing the static check presented in v1 by a runtime
> check needs to be reverted.
>
> In addition to the example I presented, there is another compelling reason
> to do so: with the static check, we can detect incorrect usage strings in
> all code, even in code that is platform-dependent (such as in
> `fsmonitor--daemon`).

First of all, thank you so much for putting so much time to look into
my PR. I appriciate your research about various possible outcomes of this
Patch request.

I saw your mail where you listed some of the disadvantages of the current
version. I also agree with the arguments you provided and it is also true
that one wouldn't find any clue by seeing the output of the `CI` link
you mentioned (i.e. https://github.com/git/git/runs/5312914410?check_suite_focus=true).

Let's see what Junio, Ævar and others say about this.

Thanks :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-25 15:30             ` Johannes Schindelin
@ 2022-02-25 16:16               ` Ævar Arnfjörð Bjarmason
  2022-02-26  4:22                 ` Abhradeep Chakraborty
  0 siblings, 1 reply; 59+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-25 16:16 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Abhradeep Chakraborty, Julia Lawall, git


On Fri, Feb 25 2022, Johannes Schindelin wrote:

> Hi,
>
> On Tue, 22 Feb 2022, Abhradeep Chakraborty wrote:
>
>> Julia Lawall wrote:
>>
>> > Of there are some cases that are useful to do statically, with only local
>> > information, then using Coccinelle could be useful to get the problem out
>> > of the way once and for all.  Coccinelle doesn't support much processing
>> > of strings directly, but you can always write some python code to test the
>> > contents of a string and to create a new one.
>> >
>> > Let me know if you want to try this.  You can also check, eg the demo
>> > demos/pythontococci.cocci to see how to create code in a python script and
>> > then use it in a normal SmPL rule.
>> > ...
>> > If the context that you are interested in is in a called function or is in
>> > the calling context, then Coccinelle might not be the ideal choice.
>> > Coccinelle works on one function at a time, so to do anything
>> > interprocedural, you have to do some hacks.
>>
>> Though in this case, `parse-options.c check` method is better [...]
>
> I fear that this is incorrect.
>
> In general, it is my experience that it is a mistake any time a static
> check is replaced by a runtime check.
>
> I was ready to let it slide in this instance, but in this case I now have
> proof that the `parse-options.c` check is worse than the originally
> suggested `sed` chain.
>
> That concrete proof is in the output of
> https://github.com/git/git/actions/runs/1890665968, where the combination
> of `ac/usage-string-fixups` and `jh/builtin-fsmonitor-part2` causes many,
> many failures, but all of those failures have the same root cause: the
> runtime check.
>
> With the original `check-usage-strings.sh`, the user inspecting any
> failure would see precisely what the issue is, in the `static-analysis`
> job's logs. It would display something like this:
>
> 	HEAD:builtin/fsmonitor--daemon.c:1507:          N_("Max seconds to wait for background daemon startup")),
>
> With v4 of the patch series, it does not spell out anything in
> `static-analysis`. Instead, it causes 8 separate jobs to fail,
> it causes failures not only in `t0012-help.sh` but also in
> `t7519-status-fsmonitor.sh` and in `t7527-builtin-fsmonitor.sh`.
>
> The purpose of t7519 and t7527 is _not_ to verify those usage strings,
> though.
>
> The worst part? Look at the relevant output of t0012 (see
> https://github.com/git/git/runs/5312844492?check_suite_focus=true#step:5:4902):
>
> 	[...]
> 	++ git -C sub fsmonitor--daemon -h
> 	++ exit_code=128
> 	++ test 128 = 129
> 	++ echo 'test_expect_code: command exited with 128, we wanted 129 git -C sub fsmonitor--daemon -h'
> 	test_expect_code: command exited with 128, we wanted 129 git -C sub fsmonitor--daemon -h
> 	++ return 1
> 	error: last command exited with $?=1
> 	not ok 81 - fsmonitor--daemon can handle -h
> 	[...]
>
> Do you see what usage string caused the failure? You can't. And that's
> even by design:
>
> 	(
> 		GIT_CEILING_DIRECTORIES=$(pwd) &&
> 		export GIT_CEILING_DIRECTORIES &&
> 		test_expect_code 129 git -C sub $builtin -h >output 2>&1
> 	) &&
> 	test_i18ngrep usage output
>
> The output is redirected, and since the runtime check added to
> `parse-options.c` causes the exit code to be different from the expected
> one, the output is never shown.
>
> Arguably the most important job of a regression test is to help software
> engineers to diagnose and fix the regression. As quickly and as
> conveniently as possible. That means that it is not enough to point out
> that there is a regression, the output should be as helpful and concise as
> possible to facilitate fixing the problem.
>
> In the above-mentioned case, it was neither as helpful nor as concise as
> possible because in the test case that was supposed to identify the
> problem, the actual error message was swallowed, and instead of causing
> one test failure, it caused a whopping 42 test cases to fail (some of
> which even show the error message, but that's not even the purpose of
> those test cases).
>
> Since the entire point of this here patch series is to help enforce Git's
> rules regarding usage strings, it should expect things like the issue with
> `fsmonitor--daemon` _and_ make it as painless to address such an issue.
>
> I am afraid that I have to NAK the `parse-options.c` approach because v1
> of this patch series did so much better a job.

I think that's a bit of an overreaction to what I think is a solid v2 in
<pull.1147.v2.git.1645545507689.gitgitgadget@gmail.com>, i.e. that we
must go back to v1 because we encountered this issue.

A. I think you're right about the t0012-help.sh output being bad,
   but that's rather easily fixed with something like the [1] below.

   I've run into that a few times, wished it was better, and manually
   grepped or cat'd the "output" file.

   Part of that is ultimately because we're mixing and matching whether this
   "usage" output goes on stdout or stderr in various commands.

B. The fsmonitor--daemon case is worse than most because it's only running on
   OS or Windows, i.e. the error we'd get in various other CI jobs is ifdef'd
   away, even though we could run the parse_options() part there.

   IIRC that's something I commented on in previous rounds of that series...

C. The case of 42 tests failing because of this could be addressed by just having
   t0012-help.sh do these checks if we wanted, although in that case we'd need to
   make sure we deal with other test blind spots. I.e. the
   "GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP" suggestion I had.

D. These sorts of check, by their nature, have an initial period of growing
   pains due to other in-flight topics. Once we move past that it's usually a
   non-issue going forward, as issues will be caught locally before patch
   submission.

   Data in favor of that is various other checks in parse_options_check() being
   mostly a non-issue, e.g. Junio's b6c2a0d45d4 (parse-options: make sure argh
   string does not have SP or _, 2014-03-23).

In this case I don't see how some minor issues when merging this with
"seen" would have us abandon the v1 and commit to a fragile parsing of C
code in shellscript instead, or with some coccinelle check that would
have inherent issues finding the full context we need (passed-down flags
etc.).

1. 

diff --git a/t/t0012-help.sh b/t/t0012-help.sh
index 6c3e1f7159d..5474d463467 100755
--- a/t/t0012-help.sh
+++ b/t/t0012-help.sh
@@ -237,15 +237,24 @@ test_expect_success 'generate builtin list' '
 	git --list-cmds=builtins >builtins
 '
 
+builtin_in_sub () {
+	(
+		GIT_CEILING_DIRECTORIES=$(pwd) &&
+		export GIT_CEILING_DIRECTORIES &&
+		"$@"
+	)
+}
+
+
 while read builtin
 do
-	test_expect_success "$builtin can handle -h" '
-		(
-			GIT_CEILING_DIRECTORIES=$(pwd) &&
-			export GIT_CEILING_DIRECTORIES &&
-			test_expect_code 129 git -C sub $builtin -h >output 2>&1
-		) &&
-		test_i18ngrep usage output
+	test_expect_success "invoking '$builtin -h' yields exit code 129" '
+		builtin_in_sub test_expect_code 129 git -C sub $builtin -h
+	'
+
+	test_expect_success "invoking '$builtin -h' output" '
+		builtin_in_sub test_expect_code 129 git -C sub $builtin -h >output 2>&1 &&
+		grep usage output
 	'
 done <builtins
 

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [cocci] [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-25 15:03           ` [cocci] " Johannes Schindelin
  2022-02-25 15:36             ` Julia Lawall
@ 2022-02-25 16:28             ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 59+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-25 16:28 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Julia Lawall, Abhradeep Chakraborty, git, cocci


On Fri, Feb 25 2022, Johannes Schindelin wrote:

> Hi Julia,
>
> On Tue, 22 Feb 2022, Julia Lawall wrote:
>
>> [I]f there are some cases that are useful to do statically, with only
>> local information, then using Coccinelle could be useful to get the
>> problem out of the way once and for all.  Coccinelle doesn't support
>> much processing of strings directly, but you can always write some
>> python code to test the contents of a string and to create a new one.
>>
>> Let me know if you want to try this.  You can also check, eg the demo
>> demos/pythontococci.cocci to see how to create code in a python script and
>> then use it in a normal SmPL rule.
>>
>> If some context has to be taken into account and the context in the same
>> function, then that can also be done with Coccinelle, eg
>>
>> A
>> ...
>> B
>>
>> matches the case where after an A there is a B on all execution paths
>> (except perhaps those that end in an error exit) and
>>
>> A
>> ... when exists
>> B
>>
>> matches the case where there is a B sometime after executing A, even if
>> that does not always occur.
>>
>> If the context that you are interested in is in a called function or is in
>> the calling context, then Coccinelle might not be the ideal choice.
>> Coccinelle works on one function at a time, so to do anything
>> interprocedural, you have to do some hacks.
>
> Right. The code in question is not actually calling a function, but a
> macro, and passes a literal string to the macro that we would want to
> check statically.
>
> I did have my doubts that it would be easy with Coccinelle, but since Ævar
> seemed so confident, I tried it, struggled, and decided to follow up with
> you.
>
> Thank you for confirming my suspicion!
> Johannes

In case it's not clear from the upthread (and I thought my [1] explained
it well enough) I never thought it would be easy or even possible to do
this particular thing with coccinelle.

I.e. I mentioned in [1]:

    Aside: if we did want to do the "parse C" method the right way to do it
    would be to have a coccinelle script do it

So it's intended as a side-note to explain to a new contributor that
*if* we do end up wanting to parse or transform C we have coccinelle,
and it's a great tool for those cases where such static transformations
are easy. I and others have added some in-tree in the past where
appropriate.

But I then went on to say (and elaborated on later in [2]) that in this
case the right thing to do is runtime checking.

So, I'm sorry if you wasted time on it. In either case it seems we've
ended up in agreement in this case about appropriate uses of coccinelle.

I.e. it's a fantastic tool as a semantic patch engine, but it
understandably has limitations where you'd effectively need it to
execute your program to decide what to do, as is the case with the
parse_options() API and the eventual parse_options_check() etc. doing
assertions depending on flags that got passed down.

1. https://lore.kernel.org/git/220221.86tucsb4oy.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/220222.867d9n83ir.gmgdl@evledraar.gmail.com/

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-02-25  8:08           ` Abhradeep Chakraborty
@ 2022-02-25 17:06             ` Junio C Hamano
  2022-02-26  3:57               ` Abhradeep Chakraborty
  0 siblings, 1 reply; 59+ messages in thread
From: Junio C Hamano @ 2022-02-25 17:06 UTC (permalink / raw)
  To: Abhradeep Chakraborty
  Cc: Eric Sunshine, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, git

Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes:

>> These two calls to optbug() use xstrfmt() to grab allocated pieces
>> of memory and pass it as a parameter to the function, which means
>> the string is leaked without any chance to be freed.
>>
>> Do we care?
>>
>> >  		if (opts->argh &&
>> >  		    strcspn(opts->argh, " _") != strlen(opts->argh))
>> >  			err |= optbug(opts, "multi-word argh should use dash to separate words");
>>
>> The existing use of optbug() we see here does not share such a
>> problem.
>
> hmm, I wanted a formatting function to format (i.e. pass the
> `opt->help` dynamically) the output string. The existing use of
> `optbug()` that you specified has no `%s` formatter; it is a plain
> string. That's why I used `xstrfmt()`. Moreover, it was in Ævar's
> suggestion[1] -
>
>> +		if (opts->help && ends_with(opts->help, "."))
>> +			err |= optbug(opts, xstrfmt("argh should not end with a dot: %s", opts->help));
>
> But I think, you're right. There is some memory leakage here.
> Should I go with plain strings then? (i.e. "help should not end
> with a dot" instead of `xstrfmt("help should not end with a dot:
> %s", opts->help)`)

Sorry that I've given you a trick question, when I know you are
quite new to the community.

I think the right answer to "Do we care?" is "In this case, because
we are about to call exit(), we don't care.  The extra complexity
and code necessary to retain the memory we get from xstrfmt and free
it is not worth it."  It's not like we do this in a loop that iterates
unbounded number of times before the exit() happens (in which case
we should care).

Thanks.





^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-02-25 15:36         ` Johannes Schindelin
  2022-02-25 16:01           ` Abhradeep Chakraborty
@ 2022-02-26  1:36           ` Junio C Hamano
  2022-02-26  6:08             ` Junio C Hamano
  1 sibling, 1 reply; 59+ messages in thread
From: Junio C Hamano @ 2022-02-26  1:36 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Abhradeep Chakraborty via GitGitGadget, git,
	Ævar Arnfjörð Bjarmason, Julia Lawall,
	Eric Sunshine, Abhradeep Chakraborty

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> Add checks to the `parse_options_check()` function to check usage
>> strings against the style convention.
>
> As I just pointed out in
> https://lore.kernel.org/git/nycvar.QRO.7.76.6.2202251600210.11118@tvgsbejvaqbjf.bet/,
> it seems that replacing the static check presented in v1 by a runtime
> check needs to be reverted.

Sorry, but I am not sure how that conclusion follows from a breakage
in a topic in flight that was discovered by the check.

I do not know if a coccinelle based solution is sufficiently easy,
simple and robust enough to encourage us to scrap what has already
been proposed and reviewed, instead of leaving it as a topic for a
future incremental improvement that we can make on top.

> In addition to the example I presented, there is another compelling reason
> to do so: with the static check, we can detect incorrect usage strings in
> all code, even in code that is platform-dependent (such as in
> `fsmonitor--daemon`).

Yes and no.  

I would imagine that large enough platforms that have their own
conditionally compiled #ifdef/#endif block already have CI to build
their conditionally compiled block in practice.  I do not see the
above as a compelling reason to grow and shift the scope of these
two patches.

Thanks.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-02-25 17:06             ` Junio C Hamano
@ 2022-02-26  3:57               ` Abhradeep Chakraborty
  0 siblings, 0 replies; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-02-26  3:57 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty, Johannes Schindelin, Eric Sunshine,
	Ævar Arnfjörð Bjarmason, Julia Lawall, git


Junio C Hamano <gitster@pobox.com> wrote:

> Sorry that I've given you a trick question, when I know you are
> quite new to the community.

There is nothing to say `sorry`. Every review comment is teaching me
new things. E.g. If you didn't ask me this question, I would not go to
the codebase and see the proper handling of `xstrfmt`. So, thanks.

> I think the right answer to "Do we care?" is "In this case, because
> we are about to call exit(), we don't care.  The extra complexity
> and code necessary to retain the memory we get from xstrfmt and free
> it is not worth it."  It's not like we do this in a loop that iterates
> unbounded number of times before the exit() happens (in which case
> we should care).
>
> Thanks.

Got it.

Thanks :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-25 16:16               ` Ævar Arnfjörð Bjarmason
@ 2022-02-26  4:22                 ` Abhradeep Chakraborty
  2022-02-26  8:55                   ` Julia Lawall
  0 siblings, 1 reply; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-02-26  4:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Abhradeep Chakraborty, Johannes Schindelin, Eric Sunshine,
	Junio C Hamano, Julia Lawall, git


Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:

> A. I think you're right about the t0012-help.sh output being bad,
>    but that's rather easily fixed with something like the [1] below.

Do you think the fix you suggested should be a part of this Patch series
or a dedicated patch request is needed for this?

> C. The case of 42 tests failing because of this could be addressed by just having
>    t0012-help.sh do these checks if we wanted, although in that case we'd need to
>    make sure we deal with other test blind spots. I.e. the
>    "GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP" suggestion I had.

Pardon me, I am having problem to understand the 
`"GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP" suggestion I had.` part
here. Could you please explain a little bit?

Thanks :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-02-26  1:36           ` Junio C Hamano
@ 2022-02-26  6:08             ` Junio C Hamano
  2022-02-26  6:57               ` Abhradeep Chakraborty
  0 siblings, 1 reply; 59+ messages in thread
From: Junio C Hamano @ 2022-02-26  6:08 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Abhradeep Chakraborty via GitGitGadget, git,
	Ævar Arnfjörð Bjarmason, Julia Lawall,
	Eric Sunshine, Abhradeep Chakraborty

Junio C Hamano <gitster@pobox.com> writes:

> I would imagine that large enough platforms that have their own
> conditionally compiled #ifdef/#endif block already have CI to build
> their conditionally compiled block in practice.  I do not see the
> above as a compelling reason to grow and shift the scope of these
> two patches.

Let's instead drop [2/2] for now.  I do not want us to go back to
shell script that pretends to know about C language, and I do not
want to block [1/2] by waiting for a replacement.  Fixes in [1/2]
are pretty much uncontroversial ones that can even be fast-tracked
down to 'master'.



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-02-26  6:08             ` Junio C Hamano
@ 2022-02-26  6:57               ` Abhradeep Chakraborty
  2022-02-27 19:15                 ` Junio C Hamano
  0 siblings, 1 reply; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-02-26  6:57 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Eric Sunshine, Julia Lawall, git


Junio C Hamano <gitster@pobox.com> wrote:

> Let's instead drop [2/2] for now.  I do not want us to go back to
> shell script that pretends to know about C language, and I do not
> want to block [1/2] by waiting for a replacement.  Fixes in [1/2]
> are pretty much uncontroversial ones that can even be fast-tracked
> down to 'master'.

Though, as a new contributor, I felt bad about dropping the last
patch, but if you think the last patch request needs more discussion
( which I think is needed) - I also in favour of dropping the last
commit.

Would you do this on your side or I will re-submit it with the first
commit?

Thanks :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] add usage-strings ci check and amend remaining usage strings
  2022-02-26  4:22                 ` Abhradeep Chakraborty
@ 2022-02-26  8:55                   ` Julia Lawall
  0 siblings, 0 replies; 59+ messages in thread
From: Julia Lawall @ 2022-02-26  8:55 UTC (permalink / raw)
  To: Abhradeep Chakraborty
  Cc: Ævar Arnfjörð Bjarmason, Johannes Schindelin,
	Eric Sunshine, Junio C Hamano, git

[-- Attachment #1: Type: text/plain, Size: 1011 bytes --]

Hello,

Since it seems that Coccinelle is not useful for your problem, could you
remove me from the CC list on this discussion?

thanks,
julia

On Sat, 26 Feb 2022, Abhradeep Chakraborty wrote:

>
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
> > A. I think you're right about the t0012-help.sh output being bad,
> >    but that's rather easily fixed with something like the [1] below.
>
> Do you think the fix you suggested should be a part of this Patch series
> or a dedicated patch request is needed for this?
>
> > C. The case of 42 tests failing because of this could be addressed by just having
> >    t0012-help.sh do these checks if we wanted, although in that case we'd need to
> >    make sure we deal with other test blind spots. I.e. the
> >    "GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP" suggestion I had.
>
> Pardon me, I am having problem to understand the
> `"GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP" suggestion I had.` part
> here. Could you please explain a little bit?
>
> Thanks :)
>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-02-26  6:57               ` Abhradeep Chakraborty
@ 2022-02-27 19:15                 ` Junio C Hamano
  2022-02-28  7:39                   ` Abhradeep Chakraborty
  0 siblings, 1 reply; 59+ messages in thread
From: Junio C Hamano @ 2022-02-27 19:15 UTC (permalink / raw)
  To: Abhradeep Chakraborty
  Cc: Ævar Arnfjörð Bjarmason, Johannes Schindelin,
	Eric Sunshine, Julia Lawall, git

Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes:

> Junio C Hamano <gitster@pobox.com> wrote:
>
>> Let's instead drop [2/2] for now.  I do not want us to go back to
>> shell script that pretends to know about C language, and I do not
>> want to block [1/2] by waiting for a replacement.  Fixes in [1/2]
>> are pretty much uncontroversial ones that can even be fast-tracked
>> down to 'master'.
>
> Though, as a new contributor, I felt bad about dropping the last
> patch, but if you think the last patch request needs more discussion
> ( which I think is needed) - I also in favour of dropping the last
> commit.
>
> Would you do this on your side or I will re-submit it with the first
> commit?

Nah, I can just discard the second commit and keep the first one.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-02-27 19:15                 ` Junio C Hamano
@ 2022-02-28  7:39                   ` Abhradeep Chakraborty
  2022-02-28 17:48                     ` Junio C Hamano
  0 siblings, 1 reply; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-02-28  7:39 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Eric Sunshine, git


Junio C Hamano <gitster@pobox.com> wrote:

> Nah, I can just discard the second commit and keep the first one.

Okay, that's great. But one thing I want to ask - How the discussion
for `adding check for usage strings` will be held i.e. Whether the
idea is discarded for now.

If it is not discarded, then how to proceed? Johannes prefers the first
version and Ævar prefers the `add check to parse-options.c` version.

Thanks :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-02-28  7:39                   ` Abhradeep Chakraborty
@ 2022-02-28 17:48                     ` Junio C Hamano
  2022-02-28 19:32                       ` Ævar Arnfjörð Bjarmason
                                         ` (3 more replies)
  0 siblings, 4 replies; 59+ messages in thread
From: Junio C Hamano @ 2022-02-28 17:48 UTC (permalink / raw)
  To: Abhradeep Chakraborty
  Cc: Ævar Arnfjörð Bjarmason, Johannes Schindelin,
	Eric Sunshine, git

Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes:

> Okay, that's great. But one thing I want to ask - How the discussion
> for `adding check for usage strings` will be held i.e. Whether the
> idea is discarded for now.
>
> If it is not discarded, then how to proceed? Johannes prefers the first
> version and Ævar prefers the `add check to parse-options.c` version.

My take on it is that the "first version" that uses an ad-hoc shell
script will not become acceptably robust.  If coccinelle or other
static analyzer can help us check more reliably, that would be great
because we won't incur runtime cost of checking, like the embedded
check we added in the latest version that we are tentatively removing.

I also think Dscho simply overreacted only because the check broke
an in-flight topic that is from his group, which is not universally
built, and the tests in it was written in such a way that the error
output from the embedded check was not immediately available when
run in the CI, making it harder to debug.  None of that is a fault
in the approach of using the embedded check.

If the embedded check were there from the beginning, together with
tons of the existing checks done by parse_options_check(), the
developers themselves of the in-flight topic(s) would have caught
the problem, even before it hit the public CI.  I am very sure Dscho
wouldn't have complained or even noticed that you added a new check
to the parse_options_check().

So from my point of view, plan should be

 (0) I have been assuming that the check we removed tentatively is
     correct and the breakage in in-flight topic caught usage
     strings that were malformed.  If not, we need to tweak it to
     make sure it does not produce false positives.

 (1) Help Microsoft folks fix the in-flight topic with faulty usage
     strings.

 (2) Rethink if parse_options_check() can be made optional at
     runtime, which would (a) allow our test to enable it, and allow
     us to test all code paths that use parse_options() centrally,
     and (b) allow us to bypass the check while the end-user runs
     "git", to avoid overhead of checking the same option[] array,
     which does not change between invocations of "git", over and
     over again all over the world.

     We may add the check back to parse_options_check() after doing
     the above.  There are already tons of "check sanity of what is
     inside option[]" in there, and it would be beneficial if we can
     separate out from parse_options_start() the sanity checking
     code, regardless of this topic.

 (3) While (2) is ongoing, we can let people also explore static
     analysis possibilities.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-02-28 17:48                     ` Junio C Hamano
@ 2022-02-28 19:32                       ` Ævar Arnfjörð Bjarmason
  2022-03-01  6:38                       ` Abhradeep Chakraborty
                                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 59+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-28 19:32 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty, Johannes Schindelin, Eric Sunshine, git


On Mon, Feb 28 2022, Junio C Hamano wrote:

> [...]
> So from my point of view, plan should be
>
>  (0) I have been assuming that the check we removed tentatively is
>      correct and the breakage in in-flight topic caught usage
>      strings that were malformed.  If not, we need to tweak it to
>      make sure it does not produce false positives.
>
>  (1) Help Microsoft folks fix the in-flight topic with faulty usage
>      strings.

Agreed.

>  (2) Rethink if parse_options_check() can be made optional at
>      runtime, which would (a) allow our test to enable it, and allow
>      us to test all code paths that use parse_options() centrally,
>      and (b) allow us to bypass the check while the end-user runs
>      "git", to avoid overhead of checking the same option[] array,
>      which does not change between invocations of "git", over and
>      over again all over the world.
>
>      We may add the check back to parse_options_check() after doing
>      the above.  There are already tons of "check sanity of what is
>      inside option[]" in there, and it would be beneficial if we can
>      separate out from parse_options_start() the sanity checking
>      code, regardless of this topic.

This is a good idea, but while t0012-help.sh catches most of it, it
doesn't cover e.g. sub-commands that call parse_options() in N functions
one after the other.

We could that in t0012-help.sh with (pseudocode):

    for subcmd write verify ...
    do
        test_expect_success '...' 'git commit-graph $subcmd -h'
    done

etc., but that still won't catch *all* of it, and we don't have a way to
spew out "what commands use subcommands".

Hence why we need to run the rest of the test suite, and why these
things aren't some one-off GIT_TEST_ mode or t/helper/*.c code already.

>  (3) While (2) is ongoing, we can let people also explore static
>      analysis possibilities.

I think with in-flight concerns with (0) and (1) addressed what we have
here is really good enough for now, and we could just add it to the
existing parse_options_check() without needing (2) and (3) at this
point.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-02-28 17:48                     ` Junio C Hamano
  2022-02-28 19:32                       ` Ævar Arnfjörð Bjarmason
@ 2022-03-01  6:38                       ` Abhradeep Chakraborty
  2022-03-01 11:12                         ` Junio C Hamano
  2022-03-01 19:37                       ` Johannes Schindelin
  2022-03-01 20:08                       ` [PATCH] parse-options: make parse_options_check() test-only Junio C Hamano
  3 siblings, 1 reply; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-03-01  6:38 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Eric Sunshine, git


Junio C Hamano <gitster@pobox.com> wrote:

> I  also think Dscho simply overreacted only because the check broke
> an in-flight topic that is from his group, which is not universally
> built, and the tests in it was written in such a way that the error
> output from the embedded check was not immediately available when
> run in the CI, making it harder to debug.  None of that is a fault
> in the approach of using the embedded check.
>
> If the embedded check were there from the beginning, together with
> tons of the existing checks done by parse_options_check(), the
> developers themselves of the in-flight topic(s) would have caught
> the problem, even before it hit the public CI.  I am very sure Dscho
> wouldn't have complained or even noticed that you added a new check
> to the parse_options_check().

Hmm, that's true.

>  (2) Rethink if parse_options_check() can be made optional at
>      runtime, which would (a) allow our test to enable it, and allow
>      us to test all code paths that use parse_options() centrally,
>      and (b) allow us to bypass the check while the end-user runs
>      "git", to avoid overhead of checking the same option[] array,
>      which does not change between invocations of "git", over and
>      over again all over the world.
>
>      We may add the check back to parse_options_check() after doing
>      the above.  There are already tons of "check sanity of what is
>      inside option[]" in there, and it would be beneficial if we can
>      separate out from parse_options_start() the sanity checking
>      code, regardless of this topic.
>
>  (3) While (2) is ongoing, we can let people also explore static
>      analysis possibilities.

I agree with you. But I think these two points(specially (2)) deserve
a dedicated discussion/patch thread. Because, the latest version of this
patch series (actually this patch series itself) only cares about the
`usage strings`.

So, I argue you to not discard the last commit for now. As you said `There are
already tons of "check sanity of what is inside option[]"`, integrating
one more sanity check would not affect it. I am saying it not because
I made that commit. The discussion or patch integration of (2) and (3)
may take few weeks (or more than a month may be; I also would like to
take part/contribute to that discussion/PR). I fear that another
set of invalid usage-strings would be added in that time. In that case,
we have to make another commit/PR for correcting those strings - disrupting
the purpose of this first commit you are willing to merge.

As Ævar also said - 

> I think with in-flight concerns with (0) and (1) addressed what we have
> here is really good enough for now, and we could just add it to the
> existing parse_options_check() without needing (2) and (3) at this
> point.

Thanks :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-03-01  6:38                       ` Abhradeep Chakraborty
@ 2022-03-01 11:12                         ` Junio C Hamano
  0 siblings, 0 replies; 59+ messages in thread
From: Junio C Hamano @ 2022-03-01 11:12 UTC (permalink / raw)
  To: Abhradeep Chakraborty
  Cc: Ævar Arnfjörð Bjarmason, Johannes Schindelin,
	Eric Sunshine, git

Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes:

>>  (2) Rethink if parse_options_check() can be made optional at
>> ...
>>  (3) While (2) is ongoing, we can let people also explore static
>>      analysis possibilities.
>
> I agree with you. But I think these two points(specially (2)) deserve
> a dedicated discussion/patch thread. Because, the latest version of this
> patch series (actually this patch series itself) only cares about the
> `usage strings`.

Yes, absolutely.  So applying [2/2] in haste is not a good idea at
all.  Before we accumulate more cruft on top, we should stop and
think if the approach we are taking is sensible to begin with, or
we'll make an already bad situation even worse.


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-02-28 17:48                     ` Junio C Hamano
  2022-02-28 19:32                       ` Ævar Arnfjörð Bjarmason
  2022-03-01  6:38                       ` Abhradeep Chakraborty
@ 2022-03-01 19:37                       ` Johannes Schindelin
  2022-03-03 17:34                         ` Abhradeep Chakraborty
  2022-03-01 20:08                       ` [PATCH] parse-options: make parse_options_check() test-only Junio C Hamano
  3 siblings, 1 reply; 59+ messages in thread
From: Johannes Schindelin @ 2022-03-01 19:37 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty, Ævar Arnfjörð Bjarmason,
	Eric Sunshine, git

[-- Attachment #1: Type: text/plain, Size: 4926 bytes --]

Hi Junio,

On Mon, 28 Feb 2022, Junio C Hamano wrote:

> Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes:
>
> > Okay, that's great. But one thing I want to ask - How the discussion
> > for `adding check for usage strings` will be held i.e. Whether the
> > idea is discarded for now.
> >
> > If it is not discarded, then how to proceed? Johannes prefers the first
> > version and Ævar prefers the `add check to parse-options.c` version.
>
> My take on it is that the "first version" that uses an ad-hoc shell
> script will not become acceptably robust.

It is unfortunate that the challenge had been characterized as "parse C",
when in reality we are talking about highly idiomatic code. It's not like
we accept arbitrary input in the `OPT_...()` lines. We _really_ want the
option usage string to be a string that is enclosed in `N_()`.

Additionally, this is about Git's own code, not arbitrary C code provided
by users. That makes that shell script more on par with `t/chainlint.sed`
than with `contrib/coccinelle/*`.

Having said that...

> If coccinelle or other static analyzer can help us check more reliably,
> that would be great because we won't incur runtime cost of checking,
> like the embedded check we added in the latest version that we are
> tentatively removing.

I think that Julia gave us enough to work with, so we can (ab-)use
Coccinelle for static usage string checks, and we should probably do that,
too.

> I also think Dscho simply overreacted only because the check broke
> an in-flight topic that is from his group, which is not universally
> built, and the tests in it was written in such a way that the error
> output from the embedded check was not immediately available when
> run in the CI, making it harder to debug.  None of that is a fault
> in the approach of using the embedded check.

No, I would have reacted the same way if I had seen the failures in any
other topic, with an equally trivial fix that blooms into 42 separate test
case failures.

This explosion made me realize _why_ I found the suggestion to patch
`parse_options()` iffy in the first place: it replaces a static check with
a runtime check, which is almost always something that is regretted later.

And since Abhradeep is a new contributor, I found it important to steer
the direction toward sound advice that they can use over and over again
over the course of their career: whenever possible, prefer static checks
over runtime ones.

> If the embedded check were there from the beginning, together with
> tons of the existing checks done by parse_options_check(), the
> developers themselves of the in-flight topic(s) would have caught
> the problem, even before it hit the public CI.  I am very sure Dscho
> wouldn't have complained or even noticed that you added a new check
> to the parse_options_check().

Indeed, if no static check had been proposed first, I would not have
caught on to the unfortunate suggestion to use a runtime check _instead_.

> So from my point of view, plan should be
>
>  (0) I have been assuming that the check we removed tentatively is
>      correct and the breakage in in-flight topic caught usage
>      strings that were malformed.  If not, we need to tweak it to
>      make sure it does not produce false positives.
>
>  (1) Help Microsoft folks fix the in-flight topic with faulty usage
>      strings.

You're so sweet, but I already did that in parallel.

>
>  (2) Rethink if parse_options_check() can be made optional at
>      runtime, which would (a) allow our test to enable it, and allow
>      us to test all code paths that use parse_options() centrally,
>      and (b) allow us to bypass the check while the end-user runs
>      "git", to avoid overhead of checking the same option[] array,
>      which does not change between invocations of "git", over and
>      over again all over the world.
>
>      We may add the check back to parse_options_check() after doing
>      the above.  There are already tons of "check sanity of what is
>      inside option[]" in there, and it would be beneficial if we can
>      separate out from parse_options_start() the sanity checking
>      code, regardless of this topic.
>
>  (3) While (2) is ongoing, we can let people also explore static
>      analysis possibilities.

Of course, if we can convince Coccinelle (together with Python) to give us
the static check, we might very well be able to port more of
`parse_options_check()` from runtime checks to static ones, which would be
a clear win.

If that is possible, we could save ourselves a lot of time by skipping (2)
altogether.

And as I said, Julia's advice looked really good. If only I wasn't
desperately short on time, I would have given it a try because it sounds
not only fun but also very, very useful in Git's context.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH] parse-options: make parse_options_check() test-only
  2022-02-28 17:48                     ` Junio C Hamano
                                         ` (2 preceding siblings ...)
  2022-03-01 19:37                       ` Johannes Schindelin
@ 2022-03-01 20:08                       ` Junio C Hamano
  2022-03-01 21:57                         ` Ævar Arnfjörð Bjarmason
  3 siblings, 1 reply; 59+ messages in thread
From: Junio C Hamano @ 2022-03-01 20:08 UTC (permalink / raw)
  To: git

The array of options given to the parse-options API is sanity
checked for reuse of a single-letter option for multiple entries and
other programmer mistakes by calling parse_options_check() from
parse_options_start().  This allows our developers to catch silly
mistakes early, but all callers of parse-options API pays this cost.
Once the set of options in an array is validated and passes this
check, until a programmer modifies the array, there is no way for it
to fail the check, which is wasteful.

Introduce the GIT_TEST_PARSE_OPTIONS_CHECK environment variable and
make the sanity check only when it is set to true.  Set it in
t/test-lib.sh so that our tests will continue to catch buggy options
arrays.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

    >  (2) Rethink if parse_options_check() can be made optional at
    >      runtime, which would (a) allow our test to enable it, and allow
    >      us to test all code paths that use parse_options() centrally,
    >      and (b) allow us to bypass the check while the end-user runs
    >      "git", to avoid overhead of checking the same option[] array,
    >      which does not change between invocations of "git", over and
    >      over again all over the world.
    >
    >      We may add the check back to parse_options_check() after doing
    >      the above.  There are already tons of "check sanity of what is
    >      inside option[]" in there, and it would be beneficial if we can
    >      separate out from parse_options_start() the sanity checking
    >      code, regardless of this topic.

    This looked too easy and there may be some pitfalls, but I am
    hoping that we will know soon enough by floating a weather
    balloon like this.

 parse-options.c | 12 +++++++++++-
 t/README        |  5 +++++
 t/test-lib.sh   |  3 +++
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/parse-options.c b/parse-options.c
index 6e57744fd2..02cfe3f2cd 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -439,6 +439,14 @@ static void check_typos(const char *arg, const struct option *options)
 	}
 }
 
+/*
+ * Check the sanity of contents of opts[] array to find programmer
+ * mistakes (like duplicated short options).
+ *
+ * This function is supposed to be no-op when it returns without
+ * dying, making a call from parse_options_start_1() to it optional
+ * in end-user builds.
+ */
 static void parse_options_check(const struct option *opts)
 {
 	int err = 0;
@@ -523,7 +531,9 @@ static void parse_options_start_1(struct parse_opt_ctx_t *ctx,
 	if ((flags & PARSE_OPT_ONE_SHOT) &&
 	    (flags & PARSE_OPT_KEEP_ARGV0))
 		BUG("Can't keep argv0 if you don't have it");
-	parse_options_check(options);
+
+	if (git_env_bool("GIT_TEST_PARSE_OPTIONS_CHECK", 0))
+		parse_options_check(options);
 }
 
 void parse_options_start(struct parse_opt_ctx_t *ctx,
diff --git a/t/README b/t/README
index f48e0542cd..b7285531f2 100644
--- a/t/README
+++ b/t/README
@@ -472,6 +472,11 @@ a test and then fails then the whole test run will abort. This can help to make
 sure the expected tests are executed and not silently skipped when their
 dependency breaks or is simply not present in a new environment.
 
+GIT_TEST_PARSE_OPTIONS_CHECK=<boolean>, when true, makes all options
+array passed to the parse-options API to be sanity checked.  This
+environment variable is set to true by test-lib.sh unless it is set.
+
+
 Naming Tests
 ------------
 
diff --git a/t/test-lib.sh b/t/test-lib.sh
index e4716b0b86..805f495fd4 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -474,6 +474,9 @@ export GIT_DEFAULT_HASH
 GIT_TEST_MERGE_ALGORITHM="${GIT_TEST_MERGE_ALGORITHM:-ort}"
 export GIT_TEST_MERGE_ALGORITHM
 
+: ${GIT_TEST_PARSE_OPTIONS_CHECK:=1}
+export GIT_TEST_PARSE_OPTIONS_CHECK
+
 # Tests using GIT_TRACE typically don't want <timestamp> <file>:<line> output
 GIT_TRACE_BARE=1
 export GIT_TRACE_BARE
-- 
2.35.1-354-g715d08a9e5


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH] parse-options: make parse_options_check() test-only
  2022-03-01 20:08                       ` [PATCH] parse-options: make parse_options_check() test-only Junio C Hamano
@ 2022-03-01 21:57                         ` Ævar Arnfjörð Bjarmason
  2022-03-01 22:18                           ` Junio C Hamano
  0 siblings, 1 reply; 59+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-01 21:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git


On Tue, Mar 01 2022, Junio C Hamano wrote:

> The array of options given to the parse-options API is sanity
> checked for reuse of a single-letter option for multiple entries and
> other programmer mistakes by calling parse_options_check() from
> parse_options_start().  This allows our developers to catch silly
> mistakes early, but all callers of parse-options API pays this cost.
> Once the set of options in an array is validated and passes this
> check, until a programmer modifies the array, there is no way for it
> to fail the check, which is wasteful.

That's not true due to the "git rev-parse --parseopt" interface. I'd be
happy to deprecate it, but I think the last time I brought it up you
were opposed, i.e. it's documented as plumbing in "git-rev-parse", and
it's easy to have it hit some of these BUG()'s.

I see the benifit of Johannes's suggestion of checking this once (but
with t0012-help.sh etc. we're nowhere near being able to do that).

Now this runs for the whole test suite, so our tests will have the the
same behavior.

So it's just an optimization? Isn't it premature, if you run
parse_options_check() in a loop how many checks/sec can we do? I haven't
tested, but I'm betting it's a *lot*.

So aren't we shaving microseconds off the runtime here?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] parse-options: make parse_options_check() test-only
  2022-03-01 21:57                         ` Ævar Arnfjörð Bjarmason
@ 2022-03-01 22:18                           ` Junio C Hamano
  2022-03-02 10:52                             ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 59+ messages in thread
From: Junio C Hamano @ 2022-03-01 22:18 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Tue, Mar 01 2022, Junio C Hamano wrote:
>
>> The array of options given to the parse-options API is sanity
>> checked for reuse of a single-letter option for multiple entries and
>> other programmer mistakes by calling parse_options_check() from
>> parse_options_start().  This allows our developers to catch silly
>> mistakes early, but all callers of parse-options API pays this cost.
>> Once the set of options in an array is validated and passes this
>> check, until a programmer modifies the array, there is no way for it
>> to fail the check, which is wasteful.
>
> That's not true due to the "git rev-parse --parseopt" interface. I'd be

Meaning that a parse-options array can be fed by "rev-parse --parseopt"
and having the sanity check enabled does help the use case?  Even there,
I would say that once the script writer finishes developing the script
that uses "rev-parse --parseopt", setting the parseopt input in stone,
there is no need to check the same thing over and over again.  Am I
mistaken?  Does "rev-parse --parseopt" that is fed the same input
sometimes trigger the sanity check and sometimes not?

> I see the benifit of Johannes's suggestion of checking this once (but
> with t0012-help.sh etc. we're nowhere near being able to do that).
>
> Now this runs for the whole test suite, so our tests will have the the
> same behavior.

The code for sanity check is there ONLY to help those who develop
while they develop, and it is logical to enable it during our tests.
There is no reason to trigger the sanity check in the end-user
environment, no?

> So aren't we shaving microseconds off the runtime here?

No, the problem I have with the runtime check is more at the
conceptual level.  Those who remove assert() by setting _NDEBUG
would not be doing so to save nanoseconds, either.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] parse-options: make parse_options_check() test-only
  2022-03-01 22:18                           ` Junio C Hamano
@ 2022-03-02 10:52                             ` Ævar Arnfjörð Bjarmason
  2022-03-02 18:59                               ` Junio C Hamano
  0 siblings, 1 reply; 59+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-02 10:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git


On Tue, Mar 01 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Tue, Mar 01 2022, Junio C Hamano wrote:
>>
>>> The array of options given to the parse-options API is sanity
>>> checked for reuse of a single-letter option for multiple entries and
>>> other programmer mistakes by calling parse_options_check() from
>>> parse_options_start().  This allows our developers to catch silly
>>> mistakes early, but all callers of parse-options API pays this cost.
>>> Once the set of options in an array is validated and passes this
>>> check, until a programmer modifies the array, there is no way for it
>>> to fail the check, which is wasteful.
>>
>> That's not true due to the "git rev-parse --parseopt" interface. I'd be
>
> Meaning that a parse-options array can be fed by "rev-parse --parseopt"
> and having the sanity check enabled does help the use case?  Even there,
> I would say that once the script writer finishes developing the script
> that uses "rev-parse --parseopt", setting the parseopt input in stone,
> there is no need to check the same thing over and over again.  Am I
> mistaken?  Does "rev-parse --parseopt" that is fed the same input
> sometimes trigger the sanity check and sometimes not?

If we're declaring that "git rev-parse --parseopt" is something that was
only ever intended for in-tree usage sure, that should hold true.

I.e. "git rev-parse" is documented as plumbing, and we document
--parseopt as a generic option parsing mechanism you can use in
shellscripts.

So out-of-tree users wouldn't guard against
GIT_TEST_PARSE_OPTIONS_CHECK, and I wouldn't be surprised if we could
e.g. segfault on some subsequent code if some of the sanity checks
aren't happening anymore.

No, I'd be quite happy if we declared that it's for our use only, and
could remove it when the last in-tree *.sh user went away. there's a bit
of complexity in parse_options() required only for its use....

>> I see the benifit of Johannes's suggestion of checking this once (but
>> with t0012-help.sh etc. we're nowhere near being able to do that).
>>
>> Now this runs for the whole test suite, so our tests will have the the
>> same behavior.
>
> The code for sanity check is there ONLY to help those who develop
> while they develop, and it is logical to enable it during our tests.
> There is no reason to trigger the sanity check in the end-user
> environment, no?

I don't see the benefit of skipping it. Your commit message mentions
"but all callers of parse-options API pays this cost". As a quick & dumb
perf test I tried:
	
	diff --git a/parse-options.c b/parse-options.c
	index 6e57744fd22..cabea35e8b1 100644
	--- a/parse-options.c
	+++ b/parse-options.c
	@@ -523,7 +523,10 @@ static void parse_options_start_1(struct parse_opt_ctx_t *ctx,
	        if ((flags & PARSE_OPT_ONE_SHOT) &&
	            (flags & PARSE_OPT_KEEP_ARGV0))
	                BUG("Can't keep argv0 if you don't have it");
	-       parse_options_check(options);
	+       while (1) {
	+               printf(".");
	+               parse_options_check(options);
	+       }
	 }
	 
	 void parse_options_start(struct parse_opt_ctx_t *ctx,

And:

    ./git [am|rebase] | pv >/dev/null

Get around 4MiB/s. I.e. we can do this check ~4 million times/sec on my
computer, with -O3, with -O0 -g it's ~3MiB/s.

So the performance cost is trivial & not worth worrying about.

>> So aren't we shaving microseconds off the runtime here?
>
> No, the problem I have with the runtime check is more at the
> conceptual level.  Those who remove assert() by setting _NDEBUG
> would not be doing so to save nanoseconds, either.

I think the trade-off of not having to worry about the runtime
v.s. "development build" checks is one we've done well with BUG(),
i.e. not to have it be an assert().

E.g. in this case we have parse_options_concat(), so you can dynamically
construct the options to be checked.

I happen to have looked in detail at all of that code in the past, and I
don't *think* it's doing something "actually dynamic". I.e. it should be
the same when the tests run and when git runs in the wild.

But having to know and check that when using or changing the API is just
more state to keep in your head.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] parse-options: make parse_options_check() test-only
  2022-03-02 10:52                             ` Ævar Arnfjörð Bjarmason
@ 2022-03-02 18:59                               ` Junio C Hamano
  2022-03-02 19:17                                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 59+ messages in thread
From: Junio C Hamano @ 2022-03-02 18:59 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> Meaning that a parse-options array can be fed by "rev-parse --parseopt"
>> and having the sanity check enabled does help the use case?  Even there,
>> I would say that once the script writer finishes developing the script
>> that uses "rev-parse --parseopt", setting the parseopt input in stone,
>> there is no need to check the same thing over and over again.  Am I
>> mistaken?  Does "rev-parse --parseopt" that is fed the same input
>> sometimes trigger the sanity check and sometimes not?
>
> If we're declaring that "git rev-parse --parseopt" is something that was
> only ever intended for in-tree usage sure, that should hold true.

> So out-of-tree users wouldn't guard against
> GIT_TEST_PARSE_OPTIONS_CHECK, and I wouldn't be surprised if we could
> e.g. segfault on some subsequent code if some of the sanity checks
> aren't happening anymore.
> ...
> No, I'd be quite happy if we declared that it's for our use only, and
> could remove it when the last in-tree *.sh user went away. there's a bit
> of complexity in parse_options() required only for its use....

I do not see any need for such a declaration.  We are not changing
the behaviour of "git rev-parse --parseopt" plumbing command at all
for those who feed valid input to it.

"rev-parse --parseopt" users can keep using their scripts just the
same as before, debugging their scripts to catch silly mistakes like
duplicated short options may become slightly harder, but they still
have a way to ask for the same debugging support available.

Yes, I am saying that is perfectly fine, and both in-tree and
out-of-tree users have a way to reinstate the sanity checks.  I also
do not mind if your proposal were one of these:

 * introduce --parseopt-with-sanity-check to "rev-parse" and arrange
   the parse_options_check() call to be made when the command was
   invoked with it; or

 * introduce --parse-opt-without-sanity-check to "rev-parse", and
   arrange the parse_options_check() call to be still made when
   "--parse-opt" is used.  Those who finished developing their
   scripts can rewrite their --parse-opt to "without" version for
   conceptual cleanliness.

> So the performance cost is trivial & not worth worrying about.

I already said I am not worried about it, didn't I?  These numbers
do not matter in this discussion.


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH] parse-options: make parse_options_check() test-only
  2022-03-02 18:59                               ` Junio C Hamano
@ 2022-03-02 19:17                                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 59+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-02 19:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git


On Wed, Mar 02 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> [...]
>> So the performance cost is trivial & not worth worrying about.
>
> I already said I am not worried about it, didn't I?  These numbers
> do not matter in this discussion.

Sorry, but I really don't see the point then.

You'd like to keep "git rev-parse --parseopt", but now if you feed bad
input to it you'll get worse error messages from it, and it's not for a
performance benefit then why? Why would we have worse error reporting
without any upside?

Another common case would be locally hacking a command that uses
parse_options(), having it do the wrong thing for some cryptic reason
we'd catch in parse_options_check().

Then eventually remember to turn on this GIT_TEST_* knob (i.e.  if
testing via the command-line/debugger instead of the test suite). I for
one do that a lot when working on the parse_options()-using commands
in-tree, if this land I'll probably remember to add this knob to my
.bashrc, but everyone else will find out the hard way...

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-03-01 19:37                       ` Johannes Schindelin
@ 2022-03-03 17:34                         ` Abhradeep Chakraborty
  2022-03-03 22:30                           ` Junio C Hamano
  0 siblings, 1 reply; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-03-03 17:34 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Abhradeep Chakraborty, Ævar Arnfjörð Bjarmason,
	Junio C Hamano, Eric Sunshine, git


Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

> And since Abhradeep is a new contributor, I found it important to steer
> the direction toward sound advice that they can use over and over again
> over the course of their career: whenever possible, prefer static checks
> over runtime ones.

Thanks Johannes for the advice. I will always remember it ^^

> Of course, if we can convince Coccinelle (together with Python) to give us
> the static check, we might very well be able to port more of
> `parse_options_check()` from runtime checks to static ones, which would be
> a clear win.
>
> If that is possible, we could save ourselves a lot of time by skipping (2)
> altogether.
>
> And as I said, Julia's advice looked really good. If only I wasn't
> desperately short on time, I would have given it a try because it sounds
> not only fun but also very, very useful in Git's context.

Since Junio and you both have an interest in Coccinelle, if you allow,
I want to look into it.

Thanks :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-03-03 17:34                         ` Abhradeep Chakraborty
@ 2022-03-03 22:30                           ` Junio C Hamano
  2022-03-04 14:21                             ` Abhradeep Chakraborty
  0 siblings, 1 reply; 59+ messages in thread
From: Junio C Hamano @ 2022-03-03 22:30 UTC (permalink / raw)
  To: Abhradeep Chakraborty
  Cc: Johannes Schindelin, Ævar Arnfjörð Bjarmason,
	Eric Sunshine, git

Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
>
>> And since Abhradeep is a new contributor, I found it important to steer
>> the direction toward sound advice that they can use over and over again
>> over the course of their career: whenever possible, prefer static checks
>> over runtime ones.
>
> Thanks Johannes for the advice. I will always remember it ^^

Yup, if we can have static and dynamic checks of the same quality,
static checks are always better alternative.  In this case, runtime
check would probably be an expedite solution suitable for a shorter
term to fill the gap, as a static check with the same quality as it
would probably need some time to develop.

> Since Junio and you both have an interest in Coccinelle, if you allow,
> I want to look into it.

I do not have any particular interest.  If it is a tool fit for the
task, it would be good to use it, that's all ;-)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-03-03 22:30                           ` Junio C Hamano
@ 2022-03-04 14:21                             ` Abhradeep Chakraborty
  2022-03-07 16:12                               ` Johannes Schindelin
  0 siblings, 1 reply; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-03-04 14:21 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Abhradeep Chakraborty, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Eric Sunshine, git


Junio C Hamano <<gitster@pobox.com> wrote:

> Yup, if we can have static and dynamic checks of the same quality,
> static checks are always better alternative.  In this case, runtime
> check would probably be an expedite solution suitable for a shorter
> term to fill the gap, as a static check with the same quality as it
> would probably need some time to develop.

Got it!

> I do not have any particular interest.  If it is a tool fit for the
> task, it would be good to use it, that's all ;-)

Okay, then I would like to research if that is a good fit. Johannes
is pretty confident about it though.

Thanks :)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-03-04 14:21                             ` Abhradeep Chakraborty
@ 2022-03-07 16:12                               ` Johannes Schindelin
  2022-03-08  5:44                                 ` Abhradeep Chakraborty
  0 siblings, 1 reply; 59+ messages in thread
From: Johannes Schindelin @ 2022-03-07 16:12 UTC (permalink / raw)
  To: Abhradeep Chakraborty
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Eric Sunshine, git

Hi,

On Fri, 4 Mar 2022, Abhradeep Chakraborty wrote:

> Junio C Hamano <<gitster@pobox.com> wrote:
>
> > Yup, if we can have static and dynamic checks of the same quality,
> > static checks are always better alternative.  In this case, runtime
> > check would probably be an expedite solution suitable for a shorter
> > term to fill the gap, as a static check with the same quality as it
> > would probably need some time to develop.
>
> Got it!

While the runtime check would address the concern in the short run, paving
the path for future static checks revolving around the same area will pay
off quite happily.

> > I do not have any particular interest.  If it is a tool fit for the
> > task, it would be good to use it, that's all ;-)
>
> Okay, then I would like to research if that is a good fit. Johannes
> is pretty confident about it though.

Yes, he is.

And he wishes he had the time to work on it himself because it sounds like
a really fun (if challenging) project.

In other words: If you ever get stuck somewhere along the lines, please do
push up a work-in-progress branch and reach out here so that I or others
can help.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/2] parse-options.c: add style checks for usage-strings
  2022-03-07 16:12                               ` Johannes Schindelin
@ 2022-03-08  5:44                                 ` Abhradeep Chakraborty
  0 siblings, 0 replies; 59+ messages in thread
From: Abhradeep Chakraborty @ 2022-03-08  5:44 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Abhradeep Chakraborty, Ævar Arnfjörð Bjarmason,
	Junio C Hamano, Eric Sunshine, git


Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

> Yes, he is.

> And he wishes he had the time to work on it himself because it sounds like
> a really fun (if challenging) project.
>
> In other words: If you ever get stuck somewhere along the lines, please do
> push up a work-in-progress branch and reach out here so that I or others
> can help.

Thank you so much ^^

^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2022-03-08  5:45 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-16 17:02 [PATCH] add usage-strings ci check and amend remaining usage strings Abhradeep Chakraborty via GitGitGadget
2022-02-21 14:51 ` Abhradeep Chakraborty
2022-02-21 15:39 ` Ævar Arnfjörð Bjarmason
2022-02-21 17:15   ` Junio C Hamano
2022-02-21 17:33   ` Abhradeep Chakraborty
2022-02-21 18:52     ` Ævar Arnfjörð Bjarmason
2022-02-22 10:57     ` Johannes Schindelin
2022-02-22 12:37       ` Ævar Arnfjörð Bjarmason
2022-02-22 12:37         ` [cocci] " Ævar Arnfjörð Bjarmason
2022-02-22 13:42         ` Julia Lawall
2022-02-22 14:03           ` Abhradeep Chakraborty
2022-02-22 15:47           ` Abhradeep Chakraborty
2022-02-25 15:30             ` Johannes Schindelin
2022-02-25 16:16               ` Ævar Arnfjörð Bjarmason
2022-02-26  4:22                 ` Abhradeep Chakraborty
2022-02-26  8:55                   ` Julia Lawall
2022-02-25 15:03           ` [cocci] " Johannes Schindelin
2022-02-25 15:36             ` Julia Lawall
2022-02-25 16:28             ` Ævar Arnfjörð Bjarmason
2022-02-22 10:25   ` Abhradeep Chakraborty
2022-02-22 15:58 ` [PATCH v2] add usage-strings " Abhradeep Chakraborty via GitGitGadget
2022-02-22 17:16   ` Eric Sunshine
2022-02-23 11:59     ` Abhradeep Chakraborty
2022-02-23 21:17     ` Junio C Hamano
2022-02-23 21:20       ` Eric Sunshine
2022-02-24  6:26       ` Abhradeep Chakraborty
2022-02-23 14:27   ` [PATCH v3 0/2] add usage-strings ci " Abhradeep Chakraborty via GitGitGadget
2022-02-23 14:27     ` [PATCH v3 1/2] amend remaining usage strings according to style guide Abhra303 via GitGitGadget
2022-02-23 14:27     ` [PATCH v3 2/2] parse-options.c: add style checks for usage-strings Abhradeep Chakraborty via GitGitGadget
2022-02-25  5:23     ` [PATCH v4 0/2] add usage-strings ci check and amend remaining usage strings Abhradeep Chakraborty via GitGitGadget
2022-02-25  5:23       ` [PATCH v4 1/2] amend remaining usage strings according to style guide Abhradeep Chakraborty via GitGitGadget
2022-02-25  5:23       ` [PATCH v4 2/2] parse-options.c: add style checks for usage-strings Abhradeep Chakraborty via GitGitGadget
2022-02-25  6:13         ` Junio C Hamano
2022-02-25  8:08           ` Abhradeep Chakraborty
2022-02-25 17:06             ` Junio C Hamano
2022-02-26  3:57               ` Abhradeep Chakraborty
2022-02-25 15:36         ` Johannes Schindelin
2022-02-25 16:01           ` Abhradeep Chakraborty
2022-02-26  1:36           ` Junio C Hamano
2022-02-26  6:08             ` Junio C Hamano
2022-02-26  6:57               ` Abhradeep Chakraborty
2022-02-27 19:15                 ` Junio C Hamano
2022-02-28  7:39                   ` Abhradeep Chakraborty
2022-02-28 17:48                     ` Junio C Hamano
2022-02-28 19:32                       ` Ævar Arnfjörð Bjarmason
2022-03-01  6:38                       ` Abhradeep Chakraborty
2022-03-01 11:12                         ` Junio C Hamano
2022-03-01 19:37                       ` Johannes Schindelin
2022-03-03 17:34                         ` Abhradeep Chakraborty
2022-03-03 22:30                           ` Junio C Hamano
2022-03-04 14:21                             ` Abhradeep Chakraborty
2022-03-07 16:12                               ` Johannes Schindelin
2022-03-08  5:44                                 ` Abhradeep Chakraborty
2022-03-01 20:08                       ` [PATCH] parse-options: make parse_options_check() test-only Junio C Hamano
2022-03-01 21:57                         ` Ævar Arnfjörð Bjarmason
2022-03-01 22:18                           ` Junio C Hamano
2022-03-02 10:52                             ` Ævar Arnfjörð Bjarmason
2022-03-02 18:59                               ` Junio C Hamano
2022-03-02 19:17                                 ` Ævar Arnfjörð Bjarmason

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.