git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] clone/fetch: anonymize URLs in the reflog
@ 2020-06-01 19:20 Johannes Schindelin via GitGitGadget
  2020-06-01 21:47 ` Jeff King
  2020-06-04 20:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
  0 siblings, 2 replies; 5+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2020-06-01 19:20 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Even if we strongly discourage putting credentials into the URLs passed
via the command-line, there _is_ support for that, and users _do_ do
that.

Let's scrub them before writing them to the reflog.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    Anonymize URLs in the reflog
    
    This came up in an internal audit, but we do not consider this to be a
    big deal: the reflog is local and not really shared with anybody.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-797%2Fdscho%2Fanonymize-clone-reflog-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-797/dscho/anonymize-clone-reflog-v1
Pull-Request: https://github.com/git/git/pull/797

 builtin/clone.c            | 10 ++++++----
 builtin/fetch.c            |  9 +++++++--
 t/t5541-http-push-smart.sh | 15 +++++++++++++++
 3 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 1ad26f4d8c8..5fe637a6702 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -938,7 +938,7 @@ static int path_exists(const char *path)
 int cmd_clone(int argc, const char **argv, const char *prefix)
 {
 	int is_bundle = 0, is_local;
-	const char *repo_name, *repo, *work_tree, *git_dir;
+	const char *repo_name, *repo, *display_repo, *work_tree, *git_dir;
 	char *path, *dir;
 	int dest_exists;
 	const struct ref *refs, *remote_head;
@@ -993,11 +993,13 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 
 	path = get_repo_path(repo_name, &is_bundle);
 	if (path)
-		repo = absolute_pathdup(repo_name);
+		display_repo = repo = absolute_pathdup(repo_name);
 	else if (!strchr(repo_name, ':'))
 		die(_("repository '%s' does not exist"), repo_name);
-	else
+	else {
 		repo = repo_name;
+		display_repo = transport_anonymize_url(repo);
+	}
 
 	/* no need to be strict, transport_set_option() will validate it again */
 	if (option_depth && atoi(option_depth) < 1)
@@ -1014,7 +1016,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		die(_("destination path '%s' already exists and is not "
 			"an empty directory."), dir);
 
-	strbuf_addf(&reflog_msg, "clone: from %s", repo);
+	strbuf_addf(&reflog_msg, "clone: from %s", display_repo);
 
 	if (option_bare)
 		work_tree = NULL;
diff --git a/builtin/fetch.c b/builtin/fetch.c
index bf6bab80fab..d58b7572114 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1765,8 +1765,13 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 
 	/* Record the command line for the reflog */
 	strbuf_addstr(&default_rla, "fetch");
-	for (i = 1; i < argc; i++)
-		strbuf_addf(&default_rla, " %s", argv[i]);
+	for (i = 1; i < argc; i++) {
+		/* This handles non-URLs gracefully */
+		char *anon = transport_anonymize_url(argv[i]);
+
+		strbuf_addf(&default_rla, " %s", anon);
+		free(anon);
+	}
 
 	fetch_config_from_gitmodules(&submodule_fetch_jobs_config,
 				     &recurse_submodules);
diff --git a/t/t5541-http-push-smart.sh b/t/t5541-http-push-smart.sh
index 23be8ce92d6..2d60381a5e7 100755
--- a/t/t5541-http-push-smart.sh
+++ b/t/t5541-http-push-smart.sh
@@ -456,6 +456,21 @@ test_expect_success 'push status output scrubs password' '
 	grep "^To $HTTPD_URL/smart/test_repo.git" status
 '
 
+test_expect_success 'clone/fetch scrubs password from reflogs' '
+	cd "$ROOT_PATH" &&
+	git clone "$HTTPD_URL_USER_PASS/smart/test_repo.git" \
+		reflog-test &&
+	cd reflog-test &&
+	test_commit prepare-for-force-fetch &&
+	git switch -c away &&
+	git fetch "$HTTPD_URL_USER_PASS/smart/test_repo.git" \
+		+master:master &&
+	# should have been scrubbed down to vanilla URL
+	git log -g master >reflog &&
+	grep "$HTTPD_URL" reflog &&
+	! grep "$HTTPD_URL_USER_PASS" reflog
+'
+
 test_expect_success 'colorize errors/hints' '
 	cd "$ROOT_PATH"/test_repo_clone &&
 	test_must_fail git -c color.transport=always -c color.advice=always \

base-commit: af6b65d45ef179ed52087e80cb089f6b2349f4ec
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] clone/fetch: anonymize URLs in the reflog
  2020-06-01 19:20 [PATCH] clone/fetch: anonymize URLs in the reflog Johannes Schindelin via GitGitGadget
@ 2020-06-01 21:47 ` Jeff King
  2020-06-02 16:55   ` Junio C Hamano
  2020-06-04 20:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
  1 sibling, 1 reply; 5+ messages in thread
From: Jeff King @ 2020-06-01 21:47 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin

On Mon, Jun 01, 2020 at 07:20:02PM +0000, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> Even if we strongly discourage putting credentials into the URLs passed
> via the command-line, there _is_ support for that, and users _do_ do
> that.
> 
> Let's scrub them before writing them to the reflog.

Good idea.

>     This came up in an internal audit, but we do not consider this to be a
>     big deal: the reflog is local and not really shared with anybody.

Agreed.

>  builtin/clone.c            | 10 ++++++----
>  builtin/fetch.c            |  9 +++++++--
>  t/t5541-http-push-smart.sh | 15 +++++++++++++++

The patch itself looks very neatly done.

> @@ -993,11 +993,13 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  
>  	path = get_repo_path(repo_name, &is_bundle);
>  	if (path)
> -		repo = absolute_pathdup(repo_name);
> +		display_repo = repo = absolute_pathdup(repo_name);
>  	else if (!strchr(repo_name, ':'))
>  		die(_("repository '%s' does not exist"), repo_name);
> -	else
> +	else {
>  		repo = repo_name;
> +		display_repo = transport_anonymize_url(repo);
> +	}

Not introduced by your patch, but I had to read this a few times to make
sure we always end up with repo and display_repo set. IMHO it would be
easier to read as:

  if (this) {
     repo = ...;
     display_repo = ...;
  } else if (that) {
     repo = ...;
     display_repo = ...;
  } else {
     die(...);
  }

instead of sticking the die() in the middle.  Maybe just personal
preference, though. :)

> +	# should have been scrubbed down to vanilla URL
> +	git log -g master >reflog &&
> +	grep "$HTTPD_URL" reflog &&
> +	! grep "$HTTPD_URL_USER_PASS" reflog
> +'

And you make sure we retain the username. Nice.

-Peff

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] clone/fetch: anonymize URLs in the reflog
  2020-06-01 21:47 ` Jeff King
@ 2020-06-02 16:55   ` Junio C Hamano
  0 siblings, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2020-06-02 16:55 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Schindelin via GitGitGadget, git, Johannes Schindelin

Jeff King <peff@peff.net> writes:

> On Mon, Jun 01, 2020 at 07:20:02PM +0000, Johannes Schindelin via GitGitGadget wrote:
>
>> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>> 
>> Even if we strongly discourage putting credentials into the URLs passed
>> via the command-line, there _is_ support for that, and users _do_ do
>> that.
>> 
>> Let's scrub them before writing them to the reflog.
>
> Good idea.
>
>>     This came up in an internal audit, but we do not consider this to be a
>>     big deal: the reflog is local and not really shared with anybody.
>
> Agreed.

Nice.

>>  builtin/clone.c            | 10 ++++++----
>>  builtin/fetch.c            |  9 +++++++--
>>  t/t5541-http-push-smart.sh | 15 +++++++++++++++
>
> The patch itself looks very neatly done.
>
>> @@ -993,11 +993,13 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>>  
>>  	path = get_repo_path(repo_name, &is_bundle);
>>  	if (path)
>> -		repo = absolute_pathdup(repo_name);
>> +		display_repo = repo = absolute_pathdup(repo_name);
>>  	else if (!strchr(repo_name, ':'))
>>  		die(_("repository '%s' does not exist"), repo_name);
>> -	else
>> +	else {
>>  		repo = repo_name;
>> +		display_repo = transport_anonymize_url(repo);
>> +	}
>
> Not introduced by your patch, but I had to read this a few times to make
> sure we always end up with repo and display_repo set. IMHO it would be
> easier to read as:
>
>   if (this) {
>      repo = ...;
>      display_repo = ...;
>   } else if (that) {
>      repo = ...;
>      display_repo = ...;
>   } else {
>      die(...);
>   }
>
> instead of sticking the die() in the middle.  Maybe just personal
> preference, though. :)

For a if/elseif cascade of few-liner blocks each, I do not think it
would matter, but if a block were larger, having the die() case at
the beginning or at the end would indeed make it easier to spot any
anomalies.

>> +	# should have been scrubbed down to vanilla URL
>> +	git log -g master >reflog &&
>> +	grep "$HTTPD_URL" reflog &&
>> +	! grep "$HTTPD_URL_USER_PASS" reflog
>> +'
>
> And you make sure we retain the username. Nice.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2] clone/fetch: anonymize URLs in the reflog
  2020-06-01 19:20 [PATCH] clone/fetch: anonymize URLs in the reflog Johannes Schindelin via GitGitGadget
  2020-06-01 21:47 ` Jeff King
@ 2020-06-04 20:08 ` Johannes Schindelin via GitGitGadget
  2020-06-04 20:30   ` Junio C Hamano
  1 sibling, 1 reply; 5+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2020-06-04 20:08 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Even if we strongly discourage putting credentials into the URLs passed
via the command-line, there _is_ support for that, and users _do_ do
that.

Let's scrub them before writing them to the reflog.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    Anonymize URLs in the reflog
    
    This came up in an internal audit, but we do not consider this to be a
    big deal: the reflog is local and not really shared with anybody.
    
    Changes since v1:
    
     * Changed the if...else if...else cadence to move the die() to the last
       arm
     * Stopped the memory leak of display_repo (allocated by 
       transport_anonymize_url())

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-797%2Fdscho%2Fanonymize-clone-reflog-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-797/dscho/anonymize-clone-reflog-v2
Pull-Request: https://github.com/git/git/pull/797

Range-diff vs v1:

 1:  11c0d47c95e ! 1:  933a7353847 clone/fetch: anonymize URLs in the reflog
     @@ Commit message
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
       ## builtin/clone.c ##
     -@@ builtin/clone.c: static int path_exists(const char *path)
     - int cmd_clone(int argc, const char **argv, const char *prefix)
     +@@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
       {
       	int is_bundle = 0, is_local;
     --	const char *repo_name, *repo, *work_tree, *git_dir;
     -+	const char *repo_name, *repo, *display_repo, *work_tree, *git_dir;
     - 	char *path, *dir;
     + 	const char *repo_name, *repo, *work_tree, *git_dir;
     +-	char *path, *dir;
     ++	char *path, *dir, *display_repo = NULL;
       	int dest_exists;
       	const struct ref *refs, *remote_head;
     + 	const struct ref *remote_head_points_at;
      @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
     - 
       	path = get_repo_path(repo_name, &is_bundle);
       	if (path)
     --		repo = absolute_pathdup(repo_name);
     -+		display_repo = repo = absolute_pathdup(repo_name);
     - 	else if (!strchr(repo_name, ':'))
     - 		die(_("repository '%s' does not exist"), repo_name);
     + 		repo = absolute_pathdup(repo_name);
     +-	else if (!strchr(repo_name, ':'))
     +-		die(_("repository '%s' does not exist"), repo_name);
      -	else
     -+	else {
     ++	else if (strchr(repo_name, ':')) {
       		repo = repo_name;
      +		display_repo = transport_anonymize_url(repo);
     -+	}
     ++	} else
     ++		die(_("repository '%s' does not exist"), repo_name);
       
       	/* no need to be strict, transport_set_option() will validate it again */
       	if (option_depth && atoi(option_depth) < 1)
     @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
       			"an empty directory."), dir);
       
      -	strbuf_addf(&reflog_msg, "clone: from %s", repo);
     -+	strbuf_addf(&reflog_msg, "clone: from %s", display_repo);
     ++	strbuf_addf(&reflog_msg, "clone: from %s",
     ++		    display_repo ? display_repo : repo);
     ++	free(display_repo);
       
       	if (option_bare)
       		work_tree = NULL;


 builtin/clone.c            | 13 ++++++++-----
 builtin/fetch.c            |  9 +++++++--
 t/t5541-http-push-smart.sh | 15 +++++++++++++++
 3 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 1ad26f4d8c8..002d23ab0a2 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -939,7 +939,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 {
 	int is_bundle = 0, is_local;
 	const char *repo_name, *repo, *work_tree, *git_dir;
-	char *path, *dir;
+	char *path, *dir, *display_repo = NULL;
 	int dest_exists;
 	const struct ref *refs, *remote_head;
 	const struct ref *remote_head_points_at;
@@ -994,10 +994,11 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	path = get_repo_path(repo_name, &is_bundle);
 	if (path)
 		repo = absolute_pathdup(repo_name);
-	else if (!strchr(repo_name, ':'))
-		die(_("repository '%s' does not exist"), repo_name);
-	else
+	else if (strchr(repo_name, ':')) {
 		repo = repo_name;
+		display_repo = transport_anonymize_url(repo);
+	} else
+		die(_("repository '%s' does not exist"), repo_name);
 
 	/* no need to be strict, transport_set_option() will validate it again */
 	if (option_depth && atoi(option_depth) < 1)
@@ -1014,7 +1015,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		die(_("destination path '%s' already exists and is not "
 			"an empty directory."), dir);
 
-	strbuf_addf(&reflog_msg, "clone: from %s", repo);
+	strbuf_addf(&reflog_msg, "clone: from %s",
+		    display_repo ? display_repo : repo);
+	free(display_repo);
 
 	if (option_bare)
 		work_tree = NULL;
diff --git a/builtin/fetch.c b/builtin/fetch.c
index bf6bab80fab..d58b7572114 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1765,8 +1765,13 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 
 	/* Record the command line for the reflog */
 	strbuf_addstr(&default_rla, "fetch");
-	for (i = 1; i < argc; i++)
-		strbuf_addf(&default_rla, " %s", argv[i]);
+	for (i = 1; i < argc; i++) {
+		/* This handles non-URLs gracefully */
+		char *anon = transport_anonymize_url(argv[i]);
+
+		strbuf_addf(&default_rla, " %s", anon);
+		free(anon);
+	}
 
 	fetch_config_from_gitmodules(&submodule_fetch_jobs_config,
 				     &recurse_submodules);
diff --git a/t/t5541-http-push-smart.sh b/t/t5541-http-push-smart.sh
index 23be8ce92d6..2d60381a5e7 100755
--- a/t/t5541-http-push-smart.sh
+++ b/t/t5541-http-push-smart.sh
@@ -456,6 +456,21 @@ test_expect_success 'push status output scrubs password' '
 	grep "^To $HTTPD_URL/smart/test_repo.git" status
 '
 
+test_expect_success 'clone/fetch scrubs password from reflogs' '
+	cd "$ROOT_PATH" &&
+	git clone "$HTTPD_URL_USER_PASS/smart/test_repo.git" \
+		reflog-test &&
+	cd reflog-test &&
+	test_commit prepare-for-force-fetch &&
+	git switch -c away &&
+	git fetch "$HTTPD_URL_USER_PASS/smart/test_repo.git" \
+		+master:master &&
+	# should have been scrubbed down to vanilla URL
+	git log -g master >reflog &&
+	grep "$HTTPD_URL" reflog &&
+	! grep "$HTTPD_URL_USER_PASS" reflog
+'
+
 test_expect_success 'colorize errors/hints' '
 	cd "$ROOT_PATH"/test_repo_clone &&
 	test_must_fail git -c color.transport=always -c color.advice=always \

base-commit: af6b65d45ef179ed52087e80cb089f6b2349f4ec
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] clone/fetch: anonymize URLs in the reflog
  2020-06-04 20:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
@ 2020-06-04 20:30   ` Junio C Hamano
  0 siblings, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2020-06-04 20:30 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> diff --git a/builtin/clone.c b/builtin/clone.c
> index 1ad26f4d8c8..002d23ab0a2 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -939,7 +939,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  {
>  	int is_bundle = 0, is_local;
>  	const char *repo_name, *repo, *work_tree, *git_dir;
> -	char *path, *dir;
> +	char *path, *dir, *display_repo = NULL;
>  	int dest_exists;
>  	const struct ref *refs, *remote_head;
>  	const struct ref *remote_head_points_at;
> @@ -994,10 +994,11 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	path = get_repo_path(repo_name, &is_bundle);
>  	if (path)
>  		repo = absolute_pathdup(repo_name);
> -	else if (!strchr(repo_name, ':'))
> -		die(_("repository '%s' does not exist"), repo_name);
> -	else
> +	else if (strchr(repo_name, ':')) {
>  		repo = repo_name;
> +		display_repo = transport_anonymize_url(repo);
> +	} else
> +		die(_("repository '%s' does not exist"), repo_name);
>  
>  	/* no need to be strict, transport_set_option() will validate it again */
>  	if (option_depth && atoi(option_depth) < 1)
> @@ -1014,7 +1015,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  		die(_("destination path '%s' already exists and is not "
>  			"an empty directory."), dir);
>  
> -	strbuf_addf(&reflog_msg, "clone: from %s", repo);
> +	strbuf_addf(&reflog_msg, "clone: from %s",
> +		    display_repo ? display_repo : repo);
> +	free(display_repo);

The new patch is easier to see because display_repo becomes non NULL
only when anonymization was necessary and done.  Makes sense.

Will queue.  Thanks.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-06-04 20:30 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-01 19:20 [PATCH] clone/fetch: anonymize URLs in the reflog Johannes Schindelin via GitGitGadget
2020-06-01 21:47 ` Jeff King
2020-06-02 16:55   ` Junio C Hamano
2020-06-04 20:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
2020-06-04 20:30   ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).