From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Jeff King <peff@peff.net>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Martin Langhoff" <martin.langhoff@gmail.com>,
	"Git Mailing List" <git@vger.kernel.org>
Subject: Re: [PATCH 1/3] transport_anonymize_url(): support retaining username
Date: Mon, 20 May 2019 18:36:13 +0200 (CEST)
Message-ID: <nycvar.QRO.7.76.6.1905201834450.46@tvgsbejvaqbjf.bet>
In-Reply-To: <20190519051031.GA19434@sigill.intra.peff.net>

Hi Peff,

On Sun, 19 May 2019, Jeff King wrote:

> When we anonymize URLs to show in messages, we strip out both the
> username and password (if any). But there are also contexts where we
> should strip out the password (to avoid leaking it) but retain the
> username.
>
> Let's generalize transport_anonymize_url() to support both cases. We'll
> give it a new name since the password-only mode isn't really
> "anonymizing", but keep the old name as a synonym to avoid disrupting
> existing callers.
>
> Note that there are actually three places we parse URLs, and this
> functionality _could_ go into any of them:
>
>   - transport_anonymize_url(), which we modify here
>
>   - the urlmatch.c code parses a URL into its constituent parts, from
>     which we could easily remove the elements we want to drop and
>     re-format it as a single URL. But its parsing also normalizes
>     elements (e.g., downcasing hostnames).  This isn't wrong, but it's
>     more friendly if we can leave the rest of the URL untouched.

I have not looked into it at all, but I seem to vaguely remember that the
result of this code might be used to look up `url.<url>.insteadOf`
settings, where the middle part *is* case-sensitive.
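
To make the concern concrete, here is a minimal sketch, not Git's actual
code. It assumes that the rewrite lookup compares the configured base
against the URL byte for byte, and both function names are made up for
illustration. Under that assumption, a rule keyed on a mixed-case host
stops matching once the host has been downcased:

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for an insteadOf-style lookup: a plain,
 * case-sensitive prefix comparison (an assumption, see above).
 */
int has_prefix_sketch(const char *url, const char *base)
{
	assert(url != NULL && base != NULL);
	return strncmp(url, base, strlen(base)) == 0;
}

/*
 * Hypothetical normalizer: return a newly allocated copy of `url` with
 * everything between "://" and the next '/' lowercased. It assumes the
 * URL carries no userinfo. The caller frees the result.
 */
char *downcase_host_sketch(const char *url)
{
	size_t len;
	char *out, *p;

	assert(url != NULL);
	len = strlen(url);
	out = malloc(len + 1);
	if (!out)
		return NULL;
	memcpy(out, url, len + 1);

	p = strstr(out, "://");
	if (!p)
		return out; /* no scheme: leave the copy untouched */
	for (p += 3; *p && *p != '/'; p++)
		*p = (char)tolower((unsigned char)*p);
	return out;
}
```

With the rule base "https://Example.COM/", the URL
"https://Example.COM/repo.git" matches as given, but its downcased form
"https://example.com/repo.git" does not.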

>   - credential_from_url() parses a URL and decodes the specific
>     elements, but it's hard to convert it back into a regular URL. It
>     treats "host:port" as a single unit, meaning it needs to be
>     re-encoded specially (since a colon would otherwise end up
>     percent-encoded).
>
> Since transport_anonymize_url() seemed closest to what we want here, I
> used that as the base.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I think it would be beneficial to unify these three cases under a single
> parser, but it seemed like too big a rabbit hole for this topic. Of the
> three, the urlmatch one seems the most mature. I think if we could
> simply separate the normalization from the parsing/decoding, the others
> could build on top of it. It might also require some careful thinking
> about how pseudo-urls like ssh "host:path" interact.

In light of what I mentioned above, I am not sure that we should go there
in the first place...

Thanks,
Dscho

> I won't call that a #leftoverbits, because it's more of a feast. :)
>
>  transport.c | 21 ++++++++++++++-------
>  transport.h | 11 ++++++++++-
>  2 files changed, 24 insertions(+), 8 deletions(-)
>
> diff --git a/transport.c b/transport.c
> index f1fcd2c4b0..ba61e57295 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -1335,11 +1335,7 @@ int transport_disconnect(struct transport *transport)
>  	return ret;
>  }
>
> -/*
> - * Strip username (and password) from a URL and return
> - * it in a newly allocated string.
> - */
> -char *transport_anonymize_url(const char *url)
> +char *transport_strip_url(const char *url, int strip_user)
>  {
>  	char *scheme_prefix, *anon_part;
>  	size_t anon_len, prefix_len = 0;
> @@ -1348,7 +1344,10 @@ char *transport_anonymize_url(const char *url)
>  	if (url_is_local_not_ssh(url) || !anon_part)
>  		goto literal_copy;
>
> -	anon_len = strlen(++anon_part);
> +	anon_len = strlen(anon_part);
> +	if (strip_user)
> +		anon_part++;
> +
>  	scheme_prefix = strstr(url, "://");
>  	if (!scheme_prefix) {
>  		if (!strchr(anon_part, ':'))
> @@ -1373,7 +1372,15 @@ char *transport_anonymize_url(const char *url)
>  		cp = strchr(scheme_prefix + 3, '/');
>  		if (cp && cp < anon_part)
>  			goto literal_copy;
> -		prefix_len = scheme_prefix - url + 3;
> +
> +		if (strip_user)
> +			prefix_len = scheme_prefix - url + 3;
> +		else {
> +			cp = strchr(scheme_prefix + 3, ':');
> +			if (!cp || cp > anon_part)
> +				goto literal_copy; /* username only */
> +			prefix_len = cp - url;
> +		}
>  	}
>  	return xstrfmt("%.*s%.*s", (int)prefix_len, url,
>  		       (int)anon_len, anon_part);
> diff --git a/transport.h b/transport.h
> index 06e06d3d89..6d8c99ac91 100644
> --- a/transport.h
> +++ b/transport.h
> @@ -243,10 +243,19 @@ const struct ref *transport_get_remote_refs(struct transport *transport,
>  int transport_fetch_refs(struct transport *transport, struct ref *refs);
>  void transport_unlock_pack(struct transport *transport);
>  int transport_disconnect(struct transport *transport);
> -char *transport_anonymize_url(const char *url);
>  void transport_take_over(struct transport *transport,
>  			 struct child_process *child);
>
> +/*
> + * Strip password and optionally username from a URL and return
> + * it in a newly allocated string (even if nothing was stripped).
> + */
> +char *transport_strip_url(const char *url, int strip_username);
> +static inline char *transport_anonymize_url(const char *url)
> +{
> +	return transport_strip_url(url, 1);
> +}
> +
>  int transport_connect(struct transport *transport, const char *name,
>  		      const char *exec, int fd[2]);
>
> --
> 2.22.0.rc0.583.g23d90da2b3
>
>
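
For reference, the password-only mode described in the commit message
can be sketched in isolation. This is a simplified, hypothetical helper
(the name strip_password_sketch() is invented here), not the code from
the patch: it handles only "scheme://" URLs, ignores the ssh "host:path"
form, and treats the last '@' in the authority as the end of the
userinfo.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not Git's implementation: return a newly
 * allocated copy of `url` with the password removed and the username
 * kept. URLs without a scheme, without userinfo, or without a password
 * are copied verbatim. The caller frees the result.
 */
char *strip_password_sketch(const char *url)
{
	const char *scheme, *auth, *auth_end, *at, *colon;
	size_t prefix_len, rest_len;
	char *out;

	assert(url != NULL);

	scheme = strstr(url, "://");
	if (!scheme)
		goto literal_copy;

	/* The authority ends at the first '/' after "://", or at the end. */
	auth = scheme + 3;
	auth_end = strchr(auth, '/');
	if (!auth_end)
		auth_end = auth + strlen(auth);

	/* The userinfo ends at the last '@' inside the authority. */
	at = NULL;
	for (const char *p = auth; p < auth_end; p++)
		if (*p == '@')
			at = p;
	if (!at)
		goto literal_copy;

	/* No ':' before the '@' means there is a username only. */
	colon = memchr(auth, ':', (size_t)(at - auth));
	if (!colon)
		goto literal_copy;

	prefix_len = (size_t)(colon - url); /* "scheme://user" */
	rest_len = strlen(at);              /* "@host/..." */
	out = malloc(prefix_len + rest_len + 1);
	if (!out)
		return NULL;
	memcpy(out, url, prefix_len);
	memcpy(out + prefix_len, at, rest_len + 1); /* copies the NUL too */
	return out;

literal_copy:
	out = malloc(strlen(url) + 1);
	if (out)
		strcpy(out, url);
	return out;
}
```

Given "https://user:secret@example.com/repo.git", the sketch returns
"https://user@example.com/repo.git". A URL such as
"https://user@example.com:8080/repo.git" comes back unchanged, because
its only colon belongs to the port rather than to the userinfo.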
