All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] fetch: Strip usernames from url's before storing them
@ 2009-04-15 12:16 Andreas Ericsson
  2009-04-15 12:30 ` Michael J Gruber
  2009-04-15 13:18 ` Johannes Sixt
  0 siblings, 2 replies; 13+ messages in thread
From: Andreas Ericsson @ 2009-04-15 12:16 UTC (permalink / raw)
  To: git; +Cc: Andreas Ericsson

When pulling from a remote, the full URL including username
is by default added to the commit message. Since it adds
very little value but could be used by malicious people to
glean valid usernames (with matching hostnames), we're far
better off just stripping the username before storing the
remote URL locally.

Signed-off-by: Andreas Ericsson <ae@op5.se>
---
 builtin-fetch.c |   48 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/builtin-fetch.c b/builtin-fetch.c
index 3c998ea..47fba00 100644
--- a/builtin-fetch.c
+++ b/builtin-fetch.c
@@ -289,7 +289,48 @@ static int update_local_ref(struct ref *ref,
 	}
 }
 
-static int store_updated_refs(const char *url, const char *remote_name,
+/*
+ * strip username information from the url
+ * This will allocate a new string, or return its argument
+ * if no stripping is necessary.
+ *
+ * The url's we want to catch are the following:
+ *   ssh://[user@]host.xz[:port]/path/to/repo.git/
+ *   [user@]host.xz:/path/to/repo.git/
+ *   http[s]://[user[:password]@]host.xz/path/to/repo.git
+ *
+ * Although git doesn't currently support giving the password
+ * to http url's on the command-line, it's easier to catch
+ * that case too than it is to cater for it specially.
+ */
+static char *anonymize_url(const char *url)
+{
+	char *anon_url;
+	const char *at_sign = strchr(url, '@');
+	size_t prefix_len = 0;
+
+	if (!at_sign)
+		return strdup(url);
+
+	if (!prefixcmp(url, "ssh://"))
+		prefix_len = strlen("ssh://");
+	else if (!prefixcmp(url, "http://"))
+		prefix_len = strlen("http://");
+	else if (!prefixcmp(url, "https://"))
+		prefix_len = strlen("https://");
+	else if (!strchr(at_sign + 1, ':'))
+		return strdup(url);
+
+	anon_url = xcalloc(1, 1 + prefix_len +
+			   ((unsigned long)at_sign - (unsigned long)url));
+	if (prefix_len)
+		memcpy(anon_url, url, prefix_len);
+	memcpy(anon_url + prefix_len, at_sign + 1, strlen(at_sign + 1));
+
+	return anon_url;
+}
+
+static int store_updated_refs(const char *raw_url, const char *remote_name,
 		struct ref *ref_map)
 {
 	FILE *fp;
@@ -298,11 +339,13 @@ static int store_updated_refs(const char *url, const char *remote_name,
 	char note[1024];
 	const char *what, *kind;
 	struct ref *rm;
-	char *filename = git_path("FETCH_HEAD");
+	char *url, *filename = git_path("FETCH_HEAD");
 
 	fp = fopen(filename, "a");
 	if (!fp)
 		return error("cannot open %s: %s\n", filename, strerror(errno));
+
+	url = anonymize_url(raw_url);
 	for (rm = ref_map; rm; rm = rm->next) {
 		struct ref *ref = NULL;
 
@@ -376,6 +419,7 @@ static int store_updated_refs(const char *url, const char *remote_name,
 				fprintf(stderr, " %s\n", note);
 		}
 	}
+	free(url);
 	fclose(fp);
 	if (rc & 2)
 		error("some local refs could not be updated; try running\n"
-- 
1.6.3.rc0.2.g7cd31

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] fetch: Strip usernames from url's before storing them
  2009-04-15 12:16 [PATCH] fetch: Strip usernames from url's before storing them Andreas Ericsson
@ 2009-04-15 12:30 ` Michael J Gruber
  2009-04-15 14:01   ` Andreas Ericsson
  2009-04-15 13:18 ` Johannes Sixt
  1 sibling, 1 reply; 13+ messages in thread
From: Michael J Gruber @ 2009-04-15 12:30 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: git

Andreas Ericsson venit, vidit, dixit 15.04.2009 14:16:
> When pulling from a remote, the full URL including username
> is by default added to the commit message. Since it adds
> very little value but could be used by malicious people to
> glean valid usernames (with matching hostnames), we're far
> better off just stripping the username before storing the
> remote URL locally.

Uhm, this is for non-fast-forwards when pull uses "merge" and creates a
merge commit, right?
Fetch does not create commit messages, and pull does not either if it
rebases. So maybe the commit message could make it clearer for lesser
git-educated people such as myself ;)

Michael
> Signed-off-by: Andreas Ericsson <ae@op5.se>
> ---
>  builtin-fetch.c |   48 ++++++++++++++++++++++++++++++++++++++++++++++--
>  1 files changed, 46 insertions(+), 2 deletions(-)
> 
> diff --git a/builtin-fetch.c b/builtin-fetch.c
> index 3c998ea..47fba00 100644
> --- a/builtin-fetch.c
> +++ b/builtin-fetch.c
> @@ -289,7 +289,48 @@ static int update_local_ref(struct ref *ref,
>  	}
>  }
>  
> -static int store_updated_refs(const char *url, const char *remote_name,
> +/*
> + * strip username information from the url
> + * This will allocate a new string, or return its argument
> + * if no stripping is necessary.
> + *
> + * The url's we want to catch are the following:
> + *   ssh://[user@]host.xz[:port]/path/to/repo.git/
> + *   [user@]host.xz:/path/to/repo.git/
> + *   http[s]://[user[:password]@]host.xz/path/to/repo.git
> + *
> + * Although git doesn't currently support giving the password
> + * to http url's on the command-line, it's easier to catch
> + * that case too than it is to cater for it specially.
> + */
> +static char *anonymize_url(const char *url)
> +{
> +	char *anon_url;
> +	const char *at_sign = strchr(url, '@');
> +	size_t prefix_len = 0;
> +
> +	if (!at_sign)
> +		return strdup(url);
> +
> +	if (!prefixcmp(url, "ssh://"))
> +		prefix_len = strlen("ssh://");
> +	else if (!prefixcmp(url, "http://"))
> +		prefix_len = strlen("http://");
> +	else if (!prefixcmp(url, "https://"))
> +		prefix_len = strlen("https://");
> +	else if (!strchr(at_sign + 1, ':'))
> +		return strdup(url);
> +
> +	anon_url = xcalloc(1, 1 + prefix_len +
> +			   ((unsigned long)at_sign - (unsigned long)url));
> +	if (prefix_len)
> +		memcpy(anon_url, url, prefix_len);
> +	memcpy(anon_url + prefix_len, at_sign + 1, strlen(at_sign + 1));
> +
> +	return anon_url;
> +}
> +
> +static int store_updated_refs(const char *raw_url, const char *remote_name,
>  		struct ref *ref_map)
>  {
>  	FILE *fp;
> @@ -298,11 +339,13 @@ static int store_updated_refs(const char *url, const char *remote_name,
>  	char note[1024];
>  	const char *what, *kind;
>  	struct ref *rm;
> -	char *filename = git_path("FETCH_HEAD");
> +	char *url, *filename = git_path("FETCH_HEAD");
>  
>  	fp = fopen(filename, "a");
>  	if (!fp)
>  		return error("cannot open %s: %s\n", filename, strerror(errno));
> +
> +	url = anonymize_url(raw_url);
>  	for (rm = ref_map; rm; rm = rm->next) {
>  		struct ref *ref = NULL;
>  
> @@ -376,6 +419,7 @@ static int store_updated_refs(const char *url, const char *remote_name,
>  				fprintf(stderr, " %s\n", note);
>  		}
>  	}
> +	free(url);
>  	fclose(fp);
>  	if (rc & 2)
>  		error("some local refs could not be updated; try running\n"

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] fetch: Strip usernames from url's before storing them
  2009-04-15 12:16 [PATCH] fetch: Strip usernames from url's before storing them Andreas Ericsson
  2009-04-15 12:30 ` Michael J Gruber
@ 2009-04-15 13:18 ` Johannes Sixt
  2009-04-15 14:14   ` Andreas Ericsson
  1 sibling, 1 reply; 13+ messages in thread
From: Johannes Sixt @ 2009-04-15 13:18 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: git

Andreas Ericsson schrieb:
> +/*
> + * strip username information from the url
> + * This will allocate a new string, or return its argument
> + * if no stripping is necessary.
> + *
> + * The url's we want to catch are the following:
> + *   ssh://[user@]host.xz[:port]/path/to/repo.git/
> + *   [user@]host.xz:/path/to/repo.git/
> + *   http[s]://[user[:password]@]host.xz/path/to/repo.git
> + *
> + * Although git doesn't currently support giving the password
> + * to http url's on the command-line, it's easier to catch
> + * that case too than it is to cater for it specially.
> + */
> +static char *anonymize_url(const char *url)
> +{
> +	char *anon_url;
> +	const char *at_sign = strchr(url, '@');
> +	size_t prefix_len = 0;
> +
> +	if (!at_sign)

	if (!at_sign || has_dos_drive_prefix(url))

or even better move this function to transport.c and use is_local().

> +		return strdup(url);
> +
> +	if (!prefixcmp(url, "ssh://"))
> +		prefix_len = strlen("ssh://");
> +	else if (!prefixcmp(url, "http://"))
> +		prefix_len = strlen("http://");
> +	else if (!prefixcmp(url, "https://"))
> +		prefix_len = strlen("https://");
> +	else if (!strchr(at_sign + 1, ':'))
> +		return strdup(url);
> +
> +	anon_url = xcalloc(1, 1 + prefix_len +
> +			   ((unsigned long)at_sign - (unsigned long)url));
> +	if (prefix_len)
> +		memcpy(anon_url, url, prefix_len);
> +	memcpy(anon_url + prefix_len, at_sign + 1, strlen(at_sign + 1));
> +
> +	return anon_url;
> +}

-- Hannes

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] fetch: Strip usernames from url's before storing them
  2009-04-15 12:30 ` Michael J Gruber
@ 2009-04-15 14:01   ` Andreas Ericsson
  2009-04-15 17:19     ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Ericsson @ 2009-04-15 14:01 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 2390 bytes --]

Michael J Gruber wrote:
> Andreas Ericsson venit, vidit, dixit 15.04.2009 14:16:
>> When pulling from a remote, the full URL including username
>> is by default added to the commit message. Since it adds
>> very little value but could be used by malicious people to
>> glean valid usernames (with matching hostnames), we're far
>> better off just stripping the username before storing the
>> remote URL locally.
> 
> Uhm, this is for non-fast-forwards when pull uses "merge" and creates a
> merge commit, right?
> Fetch does not create commit messages, and pull does not either if it
> rebases. So maybe the commit message could make it clearer for lesser
> git-educated people such as myself ;)
> 

Yes and no. This alters what gets written to .git/FETCH_HEAD, but since
what's written there only ever turns up in the history in the form of a
commit-message, you're essentially right.

The reason for this patch is that we published some repositories publicly
a week or two ago and one such malicious person started attacking all our
public servers with the usernames found in the commit messages. In our
case, this isn't such a big issue since all of the servers just fake an
SSH daemon when connected to from outside our internal network, but we
shouldn't take too lightly on casual information disclosure. It *could*
have been a problem and, if nothing else, this patch will probably save
some diskspace if it can prevent others being targeted by the same kind
of brute-force attack as we were.

Junio, this is based off of master, but applies cleanly to maint as well.
I'd actually prefer it to go on maint than master. The usernames in the
url's provide no real value but are potentially dangerous to disclose.

Attached is the micro-program I wrote to test the function itself.
Since I couldn't quite figure out how to set up a remote repository with
password protection and then fetch from it in a way that was generic
enough to go into the test-suite I didn't bother with that, but issuing
a git-pull shows that FETCH_HEAD and the commit message gets the correct
text.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

[-- Attachment #2: anon-url.c --]
[-- Type: text/x-csrc, Size: 2332 bytes --]

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define prefixcmp(haystack, needle) strncmp(haystack, needle, strlen(needle))
#define xcalloc(n, size) calloc(n, size)
#define xstrdup(str) strdup(str)

static char *anonymize_url(const char *url)
{
	char *anon_url;
	const char *at_sign = strchr(url, '@');
	size_t len, prefix_len = 0;

	if (!at_sign)
		return xstrdup(url);

	if (!prefixcmp(url, "ssh://"))
		prefix_len = strlen("ssh://");
	else if (!prefixcmp(url, "http://"))
		prefix_len = strlen("http://");
	else if (!prefixcmp(url, "https://"))
		prefix_len = strlen("https://");
	else if (!strchr(at_sign + 1, ':'))
		return xstrdup(url);

	len = prefix_len + strlen(at_sign + 1);
	anon_url = xcalloc(1, 1 + prefix_len + strlen(at_sign + 1));
	if (prefix_len)
		memcpy(anon_url, url, prefix_len);
	memcpy(anon_url + prefix_len, at_sign + 1, strlen(at_sign + 1));

	return anon_url;
}

int main(int argc, char **argv)
{
	int errors = 0;
	struct {
		char *raw;
		char *correct;
	}
	urls[] = {
		{ "rsync://host.xz/path/to/repo.git/", NULL, },
		{ "http://host.xz:port/path/to/repo.git/", NULL, },
		{ "https://host.xz:port/path/to/repo.git/", NULL,},
		{ "git://host.xz:port/path/to/repo.git/", NULL, },
		{ "git://host.xz:port/~user/path/to/repo.git/", NULL, },
		{
			"http://user@host.xz:port/path/to/repo.git/",
				"http://host.xz:port/path/to/repo.git/",
		},
		{
			"https://user@host.xz:port/path/to/repo.git/",
				"https://host.xz:port/path/to/repo.git/",
		},
		{
			"ssh://user@host.xz:port/path/to/repo.git/",
				"ssh://host.xz:port/path/to/repo.git/",
		},
		{
			"ssh://user@host.xz/path/to/repo.git/",
				"ssh://host.xz/path/to/repo.git/",
		},
		{
			"ssh://user@host.xz/~user/path/to/repo.git/",
				"ssh://host.xz/~user/path/to/repo.git/",
		},
		{
			"user@host.xz:/path/to/repo.git/",
				"host.xz:/path/to/repo.git/",
		},
		{
			"user@host.xz:~user/path/to/repo.git/",
				"host.xz:~user/path/to/repo.git/",
		},
		{ NULL, NULL },
	};
	int i;

	for (i = 0; urls[i].raw; i++) {
		char *anon_url = anonymize_url(urls[i].raw);
		if (!strcmp(anon_url, urls[i].correct ? urls[i].correct : urls[i].raw))
			continue;

		errors++;
		printf("raw    : %s\nanon   : %s\n", urls[i].raw, anon_url);
		printf("correct: %s\n", urls[i].correct);
	}

	printf("There were %d errors\n", errors);
	return 0;
}

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] fetch: Strip usernames from url's before storing them
  2009-04-15 13:18 ` Johannes Sixt
@ 2009-04-15 14:14   ` Andreas Ericsson
  2009-04-15 14:30     ` [PATCH v2] " Andreas Ericsson
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Ericsson @ 2009-04-15 14:14 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git

Johannes Sixt wrote:
> Andreas Ericsson schrieb:
>> +/*
>> + * strip username information from the url
>> + * This will allocate a new string, or return its argument
>> + * if no stripping is necessary.
>> + *
>> + * The url's we want to catch are the following:
>> + *   ssh://[user@]host.xz[:port]/path/to/repo.git/
>> + *   [user@]host.xz:/path/to/repo.git/
>> + *   http[s]://[user[:password]@]host.xz/path/to/repo.git
>> + *
>> + * Although git doesn't currently support giving the password
>> + * to http url's on the command-line, it's easier to catch
>> + * that case too than it is to cater for it specially.
>> + */
>> +static char *anonymize_url(const char *url)
>> +{
>> +	char *anon_url;
>> +	const char *at_sign = strchr(url, '@');
>> +	size_t prefix_len = 0;
>> +
>> +	if (!at_sign)
> 
> 	if (!at_sign || has_dos_drive_prefix(url))
> 
> or even better move this function to transport.c and use is_local().

Good idea, even though it really shouldn't matter in practice due
to the last if() below. Then again, there's no telling what some
silly IDE might take into its head to name paths ;-)

I also see now I botched the job and sent the un-amended patch.
v2 incoming.

> 
>> +		return strdup(url);
>> +
>> +	if (!prefixcmp(url, "ssh://"))
>> +		prefix_len = strlen("ssh://");
>> +	else if (!prefixcmp(url, "http://"))
>> +		prefix_len = strlen("http://");
>> +	else if (!prefixcmp(url, "https://"))
>> +		prefix_len = strlen("https://");
>> +	else if (!strchr(at_sign + 1, ':'))
>> +		return strdup(url);
>> +

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2] fetch: Strip usernames from url's before storing them
  2009-04-15 14:14   ` Andreas Ericsson
@ 2009-04-15 14:30     ` Andreas Ericsson
  2009-04-15 17:19       ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Ericsson @ 2009-04-15 14:30 UTC (permalink / raw)
  To: gitster; +Cc: git, Andreas Ericsson

When pulling from a remote, the full URL including username
is by default added to the commit message. Since it adds
very little value but could be used by malicious people to
glean valid usernames (with matching hostnames), we're far
better off just stripping the username before storing the
remote URL locally.

Note that this patch has no lasting visible effect when
"git pull" does not create a merge commit. It simply
alters what gets written to .git/FETCH_HEAD, which is used
by "git merge" to automagically create its' messages.

Signed-off-by: Andreas Ericsson <ae@op5.se>
---

This incorporates the changes suggested by both J6t and
Michael Gruber, as well as the properly functioning
version of the patch (the last one had an off-by-lots
for some url's, as I failed at --amend'ing it).

 builtin-fetch.c |    7 +++++--
 transport.c     |   40 ++++++++++++++++++++++++++++++++++++++++
 transport.h     |    1 +
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/builtin-fetch.c b/builtin-fetch.c
index 3c998ea..0bb290b 100644
--- a/builtin-fetch.c
+++ b/builtin-fetch.c
@@ -289,7 +289,7 @@ static int update_local_ref(struct ref *ref,
 	}
 }
 
-static int store_updated_refs(const char *url, const char *remote_name,
+static int store_updated_refs(const char *raw_url, const char *remote_name,
 		struct ref *ref_map)
 {
 	FILE *fp;
@@ -298,11 +298,13 @@ static int store_updated_refs(const char *url, const char *remote_name,
 	char note[1024];
 	const char *what, *kind;
 	struct ref *rm;
-	char *filename = git_path("FETCH_HEAD");
+	char *url, *filename = git_path("FETCH_HEAD");
 
 	fp = fopen(filename, "a");
 	if (!fp)
 		return error("cannot open %s: %s\n", filename, strerror(errno));
+
+	url = transport_anonymize_url(raw_url);
 	for (rm = ref_map; rm; rm = rm->next) {
 		struct ref *ref = NULL;
 
@@ -376,6 +378,7 @@ static int store_updated_refs(const char *url, const char *remote_name,
 				fprintf(stderr, " %s\n", note);
 		}
 	}
+	free(url);
 	fclose(fp);
 	if (rc & 2)
 		error("some local refs could not be updated; try running\n"
diff --git a/transport.c b/transport.c
index 3dfb03c..9e6dc5e 100644
--- a/transport.c
+++ b/transport.c
@@ -1083,3 +1083,43 @@ int transport_disconnect(struct transport *transport)
 	free(transport);
 	return ret;
 }
+
+/*
+ * Strip username information from the url and return it in a
+ * newly allocated string which the caller has to free.
+ *
+ * The url's we want to catch are the following:
+ *   ssh://[user@]host.xz[:port]/path/to/repo.git/
+ *   [user@]host.xz:/path/to/repo.git/
+ *   http[s]://[user[:password]@]host.xz/path/to/repo.git
+ *
+ * Although git doesn't currently support giving the password
+ * to http url's on the command-line, it's easier to catch
+ * that case too than it is to cater for it specially.
+ */
+char *transport_anonymize_url(const char *url)
+{
+	char *anon_url;
+	const char *at_sign = strchr(url, '@');
+	size_t len, prefix_len = 0;
+
+	if (is_local(url) || !at_sign)
+		return xstrdup(url);
+
+	if (!prefixcmp(url, "ssh://"))
+		prefix_len = strlen("ssh://");
+	else if (!prefixcmp(url, "http://"))
+		prefix_len = strlen("http://");
+	else if (!prefixcmp(url, "https://"))
+		prefix_len = strlen("https://");
+	else if (!strchr(at_sign + 1, ':'))
+		return xstrdup(url);
+
+	len = prefix_len + strlen(at_sign + 1);
+	anon_url = xcalloc(1, 1 + prefix_len + strlen(at_sign + 1));
+	if (prefix_len)
+		memcpy(anon_url, url, prefix_len);
+	memcpy(anon_url + prefix_len, at_sign + 1, strlen(at_sign + 1));
+
+	return anon_url;
+}
diff --git a/transport.h b/transport.h
index b1c2252..27bfc52 100644
--- a/transport.h
+++ b/transport.h
@@ -74,5 +74,6 @@ const struct ref *transport_get_remote_refs(struct transport *transport);
 int transport_fetch_refs(struct transport *transport, const struct ref *refs);
 void transport_unlock_pack(struct transport *transport);
 int transport_disconnect(struct transport *transport);
+char *transport_anonymize_url(const char *url);
 
 #endif
-- 
1.6.3.rc0.2.g743353.dirty

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] fetch: Strip usernames from url's before storing them
  2009-04-15 14:01   ` Andreas Ericsson
@ 2009-04-15 17:19     ` Junio C Hamano
  2009-04-15 18:08       ` Andreas Ericsson
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2009-04-15 17:19 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Michael J Gruber, git

Andreas Ericsson <ae@op5.se> writes:

> The reason for this patch is that we published some repositories publicly
> a week or two ago and one such malicious person started attacking all our
> public servers with the usernames found in the commit messages.

Interesting.  Do you also worry about the names on committer and author
lines?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] fetch: Strip usernames from url's before storing them
  2009-04-15 14:30     ` [PATCH v2] " Andreas Ericsson
@ 2009-04-15 17:19       ` Junio C Hamano
  2009-04-15 20:45         ` Andreas Ericsson
  2009-04-17  8:20         ` [PATCH v3] " Andreas Ericsson
  0 siblings, 2 replies; 13+ messages in thread
From: Junio C Hamano @ 2009-04-15 17:19 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: git

Andreas Ericsson <ae@op5.se> writes:

> When pulling from a remote, the full URL including username
> is by default added to the commit message. Since it adds
> very little value but could be used by malicious people to
> glean valid usernames (with matching hostnames), we're far
> better off just stripping the username before storing the
> remote URL locally.

Sounds like a sensible thing to do.

> +/*
> + * Strip username information from the url and return it in a
> + * newly allocated string which the caller has to free.
> + *
> + * The url's we want to catch are the following:
> + *   ssh://[user@]host.xz[:port]/path/to/repo.git/
> + *   [user@]host.xz:/path/to/repo.git/
> + *   http[s]://[user[:password]@]host.xz/path/to/repo.git

If this is a valid URL:

	scheme://host.xz/path@with@at@sign.git/

we do not want to mistakenly trigger this logic.

I do not know if rsync://me@there/path is supported, but we should
generalize to support any scheme://me@there/path to keep the code simpler.
You do not do anything special based on the URL scheme other than learning
how long the scheme:// part is to copy it anyway.  Perhaps like...

char *transport_anonymize_url(const char *url)
{
	char *anon_url, *scheme_prefix, *anon_part;
	size_t len, prefix_len = 0;

	anon_part = strchr(url, '@');
	if (is_local(url) || !anon_part)
		goto literal_copy;

	anon_part++;
	scheme_prefix = strstr(url, "://");
	if (scheme_prefix) {
		const char *cp;
		/* make sure scheme is reasonable */
		for (cp = url; cp < scheme_prefix; cp++) {
			switch (*cp) { /* RFC 1738 2.1 */
			case '+':
			case '.':
			case '-':
				break; /* ok */
			default:
				if (isalnum(*cp))
					break;
				/* it isn't */
				goto literal_copy;
			}
		}
		/* @ past the first slash does not count */
		cp = strchr(scheme_prefix + 3, '/');
		if (cp < anon_part)
			goto literal_copy;
		prefix_len = scheme_prefix - url + 3;
	}
	else if (!strchr(anon_part, ':'))
		/* cannot be "me@there:/path/name" */
		goto literal_copy;
	len = prefix_len + strlen(anon_part);
	anon_url = xmalloc(len + 1);
	memcpy(anon_url, url, prefix_len);
	memcpy(anon_url + prefix_len, anon_part, strlen(anon_part));
	return anon_url;
 literal_copy:
	return xstrdup(url);
}

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] fetch: Strip usernames from url's before storing them
  2009-04-15 17:19     ` Junio C Hamano
@ 2009-04-15 18:08       ` Andreas Ericsson
  0 siblings, 0 replies; 13+ messages in thread
From: Andreas Ericsson @ 2009-04-15 18:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael J Gruber, git

Junio C Hamano wrote:
> Andreas Ericsson <ae@op5.se> writes:
> 
>> The reason for this patch is that we published some repositories publicly
>> a week or two ago and one such malicious person started attacking all our
>> public servers with the usernames found in the commit messages.
> 
> Interesting.  Do you also worry about the names on committer and author
> lines?

We don't refuse anyone who's allowed to push by file-permissions. Perhaps
we should, but we don't. This was discovered as a nasty after-shock, and
"unfortunately" a bunch of people are already working with the commits
exposed by the code. Since we're not really affected at all by the bad
parts of the code, we've decided not to bother rewriting history. We'd
rather keep life simple for our contributors (we're not as lively a
community as git, so we can't afford to lose half a dozen just to protect
ourselves; It's better to just alter those usernames and keep going with
the history we've got).

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] fetch: Strip usernames from url's before storing them
  2009-04-15 17:19       ` Junio C Hamano
@ 2009-04-15 20:45         ` Andreas Ericsson
  2009-04-17  8:20         ` [PATCH v3] " Andreas Ericsson
  1 sibling, 0 replies; 13+ messages in thread
From: Andreas Ericsson @ 2009-04-15 20:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Thanks for the feedback. Many appreciated.

Junio C Hamano wrote:
> Andreas Ericsson <ae@op5.se> writes:
> 
>> +/*
>> + * Strip username information from the url and return it in a
>> + * newly allocated string which the caller has to free.
>> + *
>> + * The url's we want to catch are the following:
>> + *   ssh://[user@]host.xz[:port]/path/to/repo.git/
>> + *   [user@]host.xz:/path/to/repo.git/
>> + *   http[s]://[user[:password]@]host.xz/path/to/repo.git
> 
> If this is a valid URL:
> 
> 	scheme://host.xz/path@with@at@sign.git/
> 
> we do not want to mistakenly trigger this logic.
> 

I'm guessing, and I'm slightly inebriated so don't yell too hard at me if
you think your mail will hit me in the morning ;-)

It *is* valid for "bare ssh" url's, but that would only trigger the fourth
(and last) case of the if() else if() thing which would return a strdup().
It won't have a scheme, and it won't have a colon after the @-sign.

An url with a scheme can't contain a colon *before* the at-sign if it does
contain a username, and it won't contain a colon *after* the @-sign if it
*does* contain a username. This is a guess, but I wrote that part of the
transport code initially, so I've got a decent inkling of what git accepts
in the first place (although it might not be strong enough in the face of
the various RFC's, I know).

> I do not know if rsync://me@there/path is supported, but we should
> generalize to support any scheme://me@there/path to keep the code simpler.
> You do not do anything special based on the URL scheme other than learning
> how long the scheme:// part is to copy it anyway.

I've never seen rsync://user@ style url's before. It's trivial to add support
for it if it's valid though, but for the reasons above, I'm not too fussed
about trying to support URL's we're extremely unlikely to see in real life.
The last else if() really does catch a lot.

>  Perhaps like...
> 
> char *transport_anonymize_url(const char *url)
> {
> 	char *anon_url, *scheme_prefix, *anon_part;
> 	size_t len, prefix_len = 0;
> 
> 	anon_part = strchr(url, '@');
> 	if (is_local(url) || !anon_part)
> 		goto literal_copy;
> 
> 	anon_part++;
> 	scheme_prefix = strstr(url, "://");
> 	if (scheme_prefix) {
> 		const char *cp;
> 		/* make sure scheme is reasonable */
> 		for (cp = url; cp < scheme_prefix; cp++) {
> 			switch (*cp) { /* RFC 1738 2.1 */
> 			case '+':
> 			case '.':
> 			case '-':
> 				break; /* ok */

Isn't %xx also a valid escape sequence for unknown chars in url's, according
to the aforementioned RFC?


> 			default:
> 				if (isalnum(*cp))
> 					break;
> 				/* it isn't */
> 				goto literal_copy;
> 			}
> 		}
> 		/* @ past the first slash does not count */
> 		cp = strchr(scheme_prefix + 3, '/');
> 		if (cp < anon_part)
> 			goto literal_copy;
> 		prefix_len = scheme_prefix - url + 3;
> 	}
> 	else if (!strchr(anon_part, ':'))
> 		/* cannot be "me@there:/path/name" */

Nice, but the comment is misleading to the simple (or drunk) mind.
It can't contain an @-sign at all due to the check above, which I'd
be happier if it mentions (me being both atm, I'm inclined to think
in rather straight lines).

Otherwise I really like it.

> 		goto literal_copy;
> 	len = prefix_len + strlen(anon_part);
> 	anon_url = xmalloc(len + 1);
> 	memcpy(anon_url, url, prefix_len);
> 	memcpy(anon_url + prefix_len, anon_part, strlen(anon_part));
> 	return anon_url;
>  literal_copy:
> 	return xstrdup(url);
> }


This looks sensible and fairly generic, and I'll happily defer to a
more capable and less drunk programmer any time of the day. I'll copy
it into anon-url.c tomorrow and see how it pans out with some of the
weirder URL's you gave today.

If anyone's got some tricky url's they want me to try, email me
before 08:00 GMT tomorrow and I'll give 'em a whirl with multiple
algo's.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3] fetch: Strip usernames from url's before storing them
  2009-04-15 17:19       ` Junio C Hamano
  2009-04-15 20:45         ` Andreas Ericsson
@ 2009-04-17  8:20         ` Andreas Ericsson
  2009-04-20  7:39           ` Andreas Ericsson
  1 sibling, 1 reply; 13+ messages in thread
From: Andreas Ericsson @ 2009-04-17  8:20 UTC (permalink / raw)
  To: gitster; +Cc: git, Andreas Ericsson

When pulling from a remote, the full URL including username
is by default added to the commit message. Since it adds
very little value but could be used by malicious people to
glean valid usernames (with matching hostnames), we're far
better off just stripping the username before storing the
remote URL locally.

Note that this patch has no lasting visible effect when
"git pull" does not create a merge commit. It simply
alters what gets written to .git/FETCH_HEAD, which is used
by "git merge" to automagically create its messages.

Signed-off-by: Andreas Ericsson <ae@op5.se>
---

I made some minor modifications to your function, Junio.
* use xcalloc() instead of malloc() to make sure the string
  is nul-terminated.
* take strlen() of anon_part instead of calculating the whole
  thing once, as we use that measurement twice.
* moved handling of !scheme_prefix && !is_bare_ssh_url(url)
  up top, so both conditions can be seen at once on my fairly
  cramped editor.

 builtin-fetch.c |    7 +++++--
 transport.c     |   48 ++++++++++++++++++++++++++++++++++++++++++++++++
 transport.h     |    1 +
 3 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/builtin-fetch.c b/builtin-fetch.c
index 3c998ea..0bb290b 100644
--- a/builtin-fetch.c
+++ b/builtin-fetch.c
@@ -289,7 +289,7 @@ static int update_local_ref(struct ref *ref,
 	}
 }
 
-static int store_updated_refs(const char *url, const char *remote_name,
+static int store_updated_refs(const char *raw_url, const char *remote_name,
 		struct ref *ref_map)
 {
 	FILE *fp;
@@ -298,11 +298,13 @@ static int store_updated_refs(const char *url, const char *remote_name,
 	char note[1024];
 	const char *what, *kind;
 	struct ref *rm;
-	char *filename = git_path("FETCH_HEAD");
+	char *url, *filename = git_path("FETCH_HEAD");
 
 	fp = fopen(filename, "a");
 	if (!fp)
 		return error("cannot open %s: %s\n", filename, strerror(errno));
+
+	url = transport_anonymize_url(raw_url);
 	for (rm = ref_map; rm; rm = rm->next) {
 		struct ref *ref = NULL;
 
@@ -376,6 +378,7 @@ static int store_updated_refs(const char *url, const char *remote_name,
 				fprintf(stderr, " %s\n", note);
 		}
 	}
+	free(url);
 	fclose(fp);
 	if (rc & 2)
 		error("some local refs could not be updated; try running\n"
diff --git a/transport.c b/transport.c
index 3dfb03c..38c12e7 100644
--- a/transport.c
+++ b/transport.c
@@ -1083,3 +1083,51 @@ int transport_disconnect(struct transport *transport)
 	free(transport);
 	return ret;
 }
+
+/*
+ * Strip username (and password) from an url and return
+ * it in a newly allocated string.
+ */
+static char *transport_anonymize_url(const char *url)
+{
+	char *anon_url, *scheme_prefix, *anon_part;
+	size_t anon_len, prefix_len = 0;
+
+	anon_part = strchr(url, '@');
+	if (is_local(url) || !anon_part)
+		goto literal_copy;
+
+	anon_len = strlen(++anon_part);
+	scheme_prefix = strstr(url, "://");
+	if (!scheme_prefix) {
+		if (!strchr(anon_part, ':'))
+			/* cannot be "me@there:/path/name" */
+			goto literal_copy;
+	} else {
+		const char *cp;
+		/* make sure scheme is reasonable */
+		for (cp = url; cp < scheme_prefix; cp++) {
+			switch (*cp) {
+				/* RFC 1738 2.1 */
+			case '+': case '.': case '-':
+				break; /* ok */
+			default:
+				if (isalnum(*cp))
+					break;
+				/* it isn't */
+				goto literal_copy;
+			}
+		}
+		/* @ past the first slash does not count */
+		cp = strchr(scheme_prefix + 3, '/');
+		if (cp && cp < anon_part)
+			goto literal_copy;
+		prefix_len = scheme_prefix - url + 3;
+	}
+	anon_url = xcalloc(1, 1 + prefix_len + anon_len);
+	memcpy(anon_url, url, prefix_len);
+	memcpy(anon_url + prefix_len, anon_part, anon_len);
+	return anon_url;
+	literal_copy:
+	return xstrdup(url);
+}
diff --git a/transport.h b/transport.h
index b1c2252..27bfc52 100644
--- a/transport.h
+++ b/transport.h
@@ -74,5 +74,6 @@ const struct ref *transport_get_remote_refs(struct transport *transport);
 int transport_fetch_refs(struct transport *transport, const struct ref *refs);
 void transport_unlock_pack(struct transport *transport);
 int transport_disconnect(struct transport *transport);
+char *transport_anonymize_url(const char *url);
 
 #endif
-- 
1.6.3.rc0.2.g743353.dirty

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] fetch: Strip usernames from url's before storing them
  2009-04-17  8:20         ` [PATCH v3] " Andreas Ericsson
@ 2009-04-20  7:39           ` Andreas Ericsson
  2009-04-20  8:36             ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Ericsson @ 2009-04-20  7:39 UTC (permalink / raw)
  To: gitster; +Cc: git

We've used this patch in production the past couple of days.
All tests pass and it works just fine. Any issues with it I
should fix, or did it just slip through?

Andreas Ericsson wrote:
> When pulling from a remote, the full URL including username
> is by default added to the commit message. Since it adds
> very little value but could be used by malicious people to
> glean valid usernames (with matching hostnames), we're far
> better off just stripping the username before storing the
> remote URL locally.
> 
> Note that this patch has no lasting visible effect when
> "git pull" does not create a merge commit. It simply
> alters what gets written to .git/FETCH_HEAD, which is used
> by "git merge" to automagically create its messages.
> 
> Signed-off-by: Andreas Ericsson <ae@op5.se>
> ---
> 
> I made some minor modifications to your function, Junio.
> * use xcalloc() instead of malloc() to make sure the string
>   is nul-terminated.
> * take strlen() of anon_part instead of calculating the whole
>   thing once, as we use that measurement twice.
> * moved handling of !scheme_prefix && !is_bare_ssh_url(url)
>   up top, so both conditions can be seen at once on my fairly
>   cramped editor.
> 
>  builtin-fetch.c |    7 +++++--
>  transport.c     |   48 ++++++++++++++++++++++++++++++++++++++++++++++++
>  transport.h     |    1 +
>  3 files changed, 54 insertions(+), 2 deletions(-)
> 
> diff --git a/builtin-fetch.c b/builtin-fetch.c
> index 3c998ea..0bb290b 100644
> --- a/builtin-fetch.c
> +++ b/builtin-fetch.c
> @@ -289,7 +289,7 @@ static int update_local_ref(struct ref *ref,
>  	}
>  }
>  
> -static int store_updated_refs(const char *url, const char *remote_name,
> +static int store_updated_refs(const char *raw_url, const char *remote_name,
>  		struct ref *ref_map)
>  {
>  	FILE *fp;
> @@ -298,11 +298,13 @@ static int store_updated_refs(const char *url, const char *remote_name,
>  	char note[1024];
>  	const char *what, *kind;
>  	struct ref *rm;
> -	char *filename = git_path("FETCH_HEAD");
> +	char *url, *filename = git_path("FETCH_HEAD");
>  
>  	fp = fopen(filename, "a");
>  	if (!fp)
>  		return error("cannot open %s: %s\n", filename, strerror(errno));
> +
> +	url = transport_anonymize_url(raw_url);
>  	for (rm = ref_map; rm; rm = rm->next) {
>  		struct ref *ref = NULL;
>  
> @@ -376,6 +378,7 @@ static int store_updated_refs(const char *url, const char *remote_name,
>  				fprintf(stderr, " %s\n", note);
>  		}
>  	}
> +	free(url);
>  	fclose(fp);
>  	if (rc & 2)
>  		error("some local refs could not be updated; try running\n"
> diff --git a/transport.c b/transport.c
> index 3dfb03c..38c12e7 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -1083,3 +1083,51 @@ int transport_disconnect(struct transport *transport)
>  	free(transport);
>  	return ret;
>  }
> +
> +/*
> + * Strip username (and password) from an url and return
> + * it in a newly allocated string.
> + */
> +static char *transport_anonymize_url(const char *url)
> +{
> +	char *anon_url, *scheme_prefix, *anon_part;
> +	size_t anon_len, prefix_len = 0;
> +
> +	anon_part = strchr(url, '@');
> +	if (is_local(url) || !anon_part)
> +		goto literal_copy;
> +
> +	anon_len = strlen(++anon_part);
> +	scheme_prefix = strstr(url, "://");
> +	if (!scheme_prefix) {
> +		if (!strchr(anon_part, ':'))
> +			/* cannot be "me@there:/path/name" */
> +			goto literal_copy;
> +	} else {
> +		const char *cp;
> +		/* make sure scheme is reasonable */
> +		for (cp = url; cp < scheme_prefix; cp++) {
> +			switch (*cp) {
> +				/* RFC 1738 2.1 */
> +			case '+': case '.': case '-':
> +				break; /* ok */
> +			default:
> +				if (isalnum(*cp))
> +					break;
> +				/* it isn't */
> +				goto literal_copy;
> +			}
> +		}
> +		/* @ past the first slash does not count */
> +		cp = strchr(scheme_prefix + 3, '/');
> +		if (cp && cp < anon_part)
> +			goto literal_copy;
> +		prefix_len = scheme_prefix - url + 3;
> +	}
> +	anon_url = xcalloc(1, 1 + prefix_len + anon_len);
> +	memcpy(anon_url, url, prefix_len);
> +	memcpy(anon_url + prefix_len, anon_part, anon_len);
> +	return anon_url;
> +	literal_copy:
> +	return xstrdup(url);
> +}
> diff --git a/transport.h b/transport.h
> index b1c2252..27bfc52 100644
> --- a/transport.h
> +++ b/transport.h
> @@ -74,5 +74,6 @@ const struct ref *transport_get_remote_refs(struct transport *transport);
>  int transport_fetch_refs(struct transport *transport, const struct ref *refs);
>  void transport_unlock_pack(struct transport *transport);
>  int transport_disconnect(struct transport *transport);
> +char *transport_anonymize_url(const char *url);
>  
>  #endif


-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] fetch: Strip usernames from url's before storing them
  2009-04-20  7:39           ` Andreas Ericsson
@ 2009-04-20  8:36             ` Junio C Hamano
  0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2009-04-20  8:36 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: git

Andreas Ericsson <ae@op5.se> writes:

> We've used this patch in production the past couple of days.
> All tests pass and it works just fine. Any issues with it I
> should fix, or did it just slip through?

The latter, and a bit of my slowing down on handling new feature patches
during the pre-release freeze.

Will queue.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2009-04-20  8:38 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-04-15 12:16 [PATCH] fetch: Strip usernames from url's before storing them Andreas Ericsson
2009-04-15 12:30 ` Michael J Gruber
2009-04-15 14:01   ` Andreas Ericsson
2009-04-15 17:19     ` Junio C Hamano
2009-04-15 18:08       ` Andreas Ericsson
2009-04-15 13:18 ` Johannes Sixt
2009-04-15 14:14   ` Andreas Ericsson
2009-04-15 14:30     ` [PATCH v2] " Andreas Ericsson
2009-04-15 17:19       ` Junio C Hamano
2009-04-15 20:45         ` Andreas Ericsson
2009-04-17  8:20         ` [PATCH v3] " Andreas Ericsson
2009-04-20  7:39           ` Andreas Ericsson
2009-04-20  8:36             ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.