[PATCH v7 0/4] config: add support for http.<url>.* settings

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH v7 0/4] config: add support for http.<url>.* settings
@ 2013-07-22  3:18 Kyle J. McKay
  2013-07-22  3:18 ` [PATCH v7 1/4] " Kyle J. McKay
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Kyle J. McKay @ 2013-07-22  3:18 UTC (permalink / raw)
  To: git
  Cc: David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Jeff King, Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

NOTE: This patch requires the following preparatory change:

 f1ff763 http.c: fix parsing of http.sslCertPasswordProtected variable

which is currently in pu.

This patch series adds support for http.<url>.* settings.  The patch is
organized as a series of improvements on the functionality:

1/4 - adds basic textual matching support
2/4 - adds URL normalization before matching
3/4 - adds a test for the URL normalization function
4/4 - adds any user matching

With-Feedback-From-jh: Junio C Hamano <gitster@pobox.com>

Differences from v6:

1/4 - Updated from v6's 1/4:

* Added an http_init memset for http_option_max_matched_len (feedback-jh)

2/4 - Updated from v6's 2/4:

* Reduce use of strchr (feedback-jh)
* Correctly handle characters with high bit set during normalization
* Detect some additional invalid scheme names

3/4 - Updated from v6's 3/4:

* Add several additional tests for high bit set characters and invalid schemes

4/4 - Updated from v6's 4/4:

* Added an http_init memset for http_option_user_matched (feedback-jh)

Applicable comments from earlier cover:

To better support matching URLs that are equivalent but spelled differently, a
url_normalize function has been added.  Currently this patch leaves it in
http.c as http_options_url_normalize as I am unclear whether it should go into
url.{h,c} at this time since only http.c uses it.

Since the url_normalize function's behavior is non-trivial, it is presented as
a separate patch on top of the basic http.<url>.* settings support.  A new test
for it has also been included as a separate patch.  I am unclear on the proper
number for this test, but have gone ahead and put it with the other http tests
since this patch series places the url_normalize function into http.c.

Kyle J. McKay (4):
  config: add support for http.<url>.* settings
  config: improve support for http.<url>.* settings
  tests: add new test for the url_normalize function
  config: allow http.<url>.* any user matching

 .gitignore               |   1 +
 Documentation/config.txt |  19 ++
 Makefile                 |   5 +
 http.c                   | 666 +++++++++++++++++++++++++++++++++++++++++++++--
 t/t5200-url-normalize.sh | 182 +++++++++++++
 t/t5200/README           |   9 +
 t/t5200/url-1            |   1 +
 t/t5200/url-10           |   1 +
 t/t5200/url-11           |   1 +
 t/t5200/url-2            |   1 +
 t/t5200/url-3            |   1 +
 t/t5200/url-4            |   1 +
 t/t5200/url-5            |   1 +
 t/t5200/url-6            |   1 +
 t/t5200/url-7            |   1 +
 t/t5200/url-8            |   1 +
 t/t5200/url-9            |   1 +
 test-url-normalize.c     |  62 +++++
 18 files changed, 938 insertions(+), 17 deletions(-)
 create mode 100755 t/t5200-url-normalize.sh
 create mode 100644 t/t5200/README
 create mode 100644 t/t5200/url-1
 create mode 100644 t/t5200/url-10
 create mode 100644 t/t5200/url-11
 create mode 100644 t/t5200/url-2
 create mode 100644 t/t5200/url-3
 create mode 100644 t/t5200/url-4
 create mode 100644 t/t5200/url-5
 create mode 100644 t/t5200/url-6
 create mode 100644 t/t5200/url-7
 create mode 100644 t/t5200/url-8
 create mode 100644 t/t5200/url-9
 create mode 100644 test-url-normalize.c

-- 
1.8.3

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v7 1/4] config: add support for http.<url>.* settings
  2013-07-22  3:18 [PATCH v7 0/4] config: add support for http.<url>.* settings Kyle J. McKay
@ 2013-07-22  3:18 ` Kyle J. McKay
  2013-07-22  3:18 ` [PATCH v7 2/4] config: improve " Kyle J. McKay
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Kyle J. McKay @ 2013-07-22  3:18 UTC (permalink / raw)
  To: git
  Cc: David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Jeff King, Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

The credentials configuration values already support url-specific
configuration items in the form credential.<url>.*.  This patch
adds similar support for http configuration values.

The <url> value is considered a match to a url if the <url> value
is either an exact match or a prefix of the url which ends on a
path component boundary ('/').  So "https://example.com/test" will
match "https://example.com/test" and "https://example.com/test/too"
but not "https://example.com/testextra".

Longer matches take precedence over shorter matches with
environment variable settings taking precedence over all.

With this configuration:

[http]
        useragent = other-agent
        noEPSV = true
[http "https://example.com"]
        useragent = example-agent
        sslVerify = false
[http "https://example.com/path"]
        useragent = path-agent

The "https://other.example.com/" url will have useragent
"other-agent" and sslVerify will be on.

The "https://example.com/" url will have useragent
"example-agent" and sslVerify will be off.

The "https://example.com/path/sub" url will have useragent
"path-agent" and sslVerify will be off.

All three of the examples will have noEPSV enabled.

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
---
 Documentation/config.txt |  15 +++++
 http.c                   | 170 ++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 168 insertions(+), 17 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6e53fc5..41cab91 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1513,6 +1513,21 @@ http.useragent::
 	of common USER_AGENT strings (but not including those like git/1.7.1).
 	Can be overridden by the 'GIT_HTTP_USER_AGENT' environment variable.
 
+http.<url>.*::
+	Any of the http.* options above can be applied selectively to some urls.
+	For example "http.https://example.com.useragent" would set the user
+	agent only for https connections to example.com.  The <url> value
+	matches a url if it is an exact match or a prefix of the url matching
+	at a "/" boundary.  Longer <url> matches take precedence over shorter
+	ones with the environment variable settings taking precedence over all.
+	Note that <url> must match the url passed to git exactly (other than
+	possibly being a prefix).  This means any user, password and/or port
+	setting that appears in a url as well as any %XX escapes that are
+	present must also appear in <url> to have a successful match.  The urls
+	that are matched against are those given directly to git commands.  In
+	other words, use exactly the same url that was passed to git (possibly
+	shortened) for the <url> value of the config setting.
+
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
 	does not care per se, but this information is necessary e.g. when
diff --git a/http.c b/http.c
index 37986f8..1531ffa 100644
--- a/http.c
+++ b/http.c
@@ -30,6 +30,34 @@ static CURL *curl_default;
 
 char curl_errorstr[CURL_ERROR_SIZE];
 
+enum http_option_type {
+	OPT_POST_BUFFER,
+	OPT_MIN_SESSIONS,
+	OPT_SSL_VERIFY,
+	OPT_SSL_TRY,
+	OPT_SSL_CERT,
+	OPT_SSL_CAINFO,
+	OPT_LOW_SPEED,
+	OPT_LOW_TIME,
+	OPT_NO_EPSV,
+	OPT_HTTP_PROXY,
+	OPT_COOKIE_FILE,
+	OPT_USER_AGENT,
+	OPT_PASSWD_REQ,
+#ifdef USE_CURL_MULTI
+	OPT_MAX_REQUESTS,
+#endif
+#if LIBCURL_VERSION_NUM >= 0x070903
+	OPT_SSL_KEY,
+#endif
+#if LIBCURL_VERSION_NUM >= 0x070908
+	OPT_SSL_CAPATH,
+#endif
+	OPT_MAX
+};
+
+static size_t http_option_max_matched_len[OPT_MAX];
+
 static int curl_ssl_verify = -1;
 static int curl_ssl_try;
 static const char *ssl_cert;
@@ -141,33 +169,121 @@ static void process_curl_messages(void)
 }
 #endif
 
+static size_t http_options_url_match_prefix(const char *url,
+					    const char *url_prefix,
+					    size_t url_prefix_len)
+{
+	/*
+	 * url_prefix matches url if url_prefix is an exact match for url or it
+	 * is a prefix of url and the match ends on a path component boundary.
+	 * Both url and url_prefix are considered to have an implicit '/' on the
+	 * end for matching purposes if they do not already.
+	 *
+	 * url must be NUL terminated.  url_prefix_len is the length of
+	 * url_prefix which need not be NUL terminated.
+	 *
+	 * The return value is the length of the match in characters (excluding
+	 * any final '/') or 0 for no match.  Passing "/" as url_prefix will
+	 * always cause 0 to be returned.
+	 */
+	size_t url_len;
+	if (url_prefix_len && url_prefix[url_prefix_len - 1] == '/')
+		url_prefix_len--;
+	if (!url_prefix_len || strncmp(url, url_prefix, url_prefix_len))
+		return 0;
+	url_len = strlen(url);
+	if ((url_len == url_prefix_len) || (url[url_prefix_len] == '/'))
+		return url_prefix_len;
+	return 0;
+}
+
+static int new_match_is_shorter(size_t matchlen, enum http_option_type opt)
+{
+	/*
+	 * Compare matchlen to the last matched length of option opt and
+	 * return true if matchlen is shorter than the last matched length
+	 * (meaning the config setting should be ignored).  Upon seeing the
+	 * _same_ key (i.e. new key has the same match length which is therefore
+	 * not shorter) the new setting will override the previous setting.
+	 * Otherwise return false and record matchlen as the current last
+	 * matched length of option opt.
+	 */
+	if (matchlen < http_option_max_matched_len[opt])
+		return 1;
+	http_option_max_matched_len[opt] = matchlen;
+	return 0;
+}
+
 static int http_options(const char *var, const char *value, void *cb)
 {
-	if (!strcmp("http.sslverify", var)) {
+	const char *url = cb;
+	const char *key, *dot;
+	size_t matchlen = 0;
+
+	key = skip_prefix(var, "http.");
+	if (!key)
+		return git_default_config(var, value, cb);
+
+	/*
+	 * If the part following the leading "http." contains a '.' then check
+	 * to see if the part between "http." and the last '.' is a match to
+	 * url.  If it's not then ignore the setting.  Otherwise set key to
+	 * point to the option which is the part after the final '.' and
+	 * use key in subsequent comparisons to determine the option type.
+	 */
+	dot = strrchr(key, '.');
+	if (dot) {
+		matchlen = http_options_url_match_prefix(url, key, dot - key);
+		if (!matchlen)
+			return 0;
+		key = dot + 1;
+	}
+
+	if (!strcmp("sslverify", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_VERIFY))
+			return 0;
 		curl_ssl_verify = git_config_bool(var, value);
 		return 0;
 	}
-	if (!strcmp("http.sslcert", var))
+	if (!strcmp("sslcert", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_CERT))
+			return 0;
 		return git_config_string(&ssl_cert, var, value);
+	}
 #if LIBCURL_VERSION_NUM >= 0x070903
-	if (!strcmp("http.sslkey", var))
+	if (!strcmp("sslkey", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_KEY))
+			return 0;
 		return git_config_string(&ssl_key, var, value);
+	}
 #endif
 #if LIBCURL_VERSION_NUM >= 0x070908
-	if (!strcmp("http.sslcapath", var))
+	if (!strcmp("sslcapath", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_CAPATH))
+			return 0;
 		return git_config_string(&ssl_capath, var, value);
+	}
 #endif
-	if (!strcmp("http.sslcainfo", var))
+	if (!strcmp("sslcainfo", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_CAINFO))
+			return 0;
 		return git_config_string(&ssl_cainfo, var, value);
-	if (!strcmp("http.sslcertpasswordprotected", var)) {
+	}
+	if (!strcmp("sslcertpasswordprotected", key)) {
+		if (new_match_is_shorter(matchlen, OPT_PASSWD_REQ))
+			return 0;
 		ssl_cert_password_required = git_config_bool(var, value);
 		return 0;
 	}
-	if (!strcmp("http.ssltry", var)) {
+	if (!strcmp("ssltry", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_TRY))
+			return 0;
 		curl_ssl_try = git_config_bool(var, value);
 		return 0;
 	}
-	if (!strcmp("http.minsessions", var)) {
+	if (!strcmp("minsessions", key)) {
+		if (new_match_is_shorter(matchlen, OPT_MIN_SESSIONS))
+			return 0;
 		min_curl_sessions = git_config_int(var, value);
 #ifndef USE_CURL_MULTI
 		if (min_curl_sessions > 1)
@@ -176,39 +292,58 @@ static int http_options(const char *var, const char *value, void *cb)
 		return 0;
 	}
 #ifdef USE_CURL_MULTI
-	if (!strcmp("http.maxrequests", var)) {
+	if (!strcmp("maxrequests", key)) {
+		if (new_match_is_shorter(matchlen, OPT_MAX_REQUESTS))
+			return 0;
 		max_requests = git_config_int(var, value);
 		return 0;
 	}
 #endif
-	if (!strcmp("http.lowspeedlimit", var)) {
+	if (!strcmp("lowspeedlimit", key)) {
+		if (new_match_is_shorter(matchlen, OPT_LOW_SPEED))
+			return 0;
 		curl_low_speed_limit = (long)git_config_int(var, value);
 		return 0;
 	}
-	if (!strcmp("http.lowspeedtime", var)) {
+	if (!strcmp("lowspeedtime", key)) {
+		if (new_match_is_shorter(matchlen, OPT_LOW_TIME))
+			return 0;
 		curl_low_speed_time = (long)git_config_int(var, value);
 		return 0;
 	}
 
-	if (!strcmp("http.noepsv", var)) {
+	if (!strcmp("noepsv", key)) {
+		if (new_match_is_shorter(matchlen, OPT_NO_EPSV))
+			return 0;
 		curl_ftp_no_epsv = git_config_bool(var, value);
 		return 0;
 	}
-	if (!strcmp("http.proxy", var))
+	if (!strcmp("proxy", key)) {
+		if (new_match_is_shorter(matchlen, OPT_HTTP_PROXY))
+			return 0;
 		return git_config_string(&curl_http_proxy, var, value);
+	}
 
-	if (!strcmp("http.cookiefile", var))
+	if (!strcmp("cookiefile", key)) {
+		if (new_match_is_shorter(matchlen, OPT_COOKIE_FILE))
+			return 0;
 		return git_config_string(&curl_cookie_file, var, value);
+	}
 
-	if (!strcmp("http.postbuffer", var)) {
+	if (!strcmp("postbuffer", key)) {
+		if (new_match_is_shorter(matchlen, OPT_POST_BUFFER))
+			return 0;
 		http_post_buffer = git_config_int(var, value);
 		if (http_post_buffer < LARGE_PACKET_MAX)
 			http_post_buffer = LARGE_PACKET_MAX;
 		return 0;
 	}
 
-	if (!strcmp("http.useragent", var))
+	if (!strcmp("useragent", key)) {
+		if (new_match_is_shorter(matchlen, OPT_USER_AGENT))
+			return 0;
 		return git_config_string(&user_agent, var, value);
+	}
 
 	/* Fall back on the default ones */
 	return git_default_config(var, value, cb);
@@ -337,7 +472,8 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 
 	http_is_verbose = 0;
 
-	git_config(http_options, NULL);
+	memset(http_option_max_matched_len, 0, sizeof(http_option_max_matched_len));
+	git_config(http_options, (void *)url);
 
 	curl_global_init(CURL_GLOBAL_ALL);
 
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v7 2/4] config: improve support for http.<url>.* settings
  2013-07-22  3:18 [PATCH v7 0/4] config: add support for http.<url>.* settings Kyle J. McKay
  2013-07-22  3:18 ` [PATCH v7 1/4] " Kyle J. McKay
@ 2013-07-22  3:18 ` Kyle J. McKay
  2013-07-22  3:18 ` [PATCH v7 3/4] tests: add new test for the url_normalize function Kyle J. McKay
  2013-07-22  3:18 ` [PATCH v7 4/4] config: allow http.<url>.* any user matching Kyle J. McKay
  3 siblings, 0 replies; 11+ messages in thread
From: Kyle J. McKay @ 2013-07-22  3:18 UTC (permalink / raw)
  To: git
  Cc: David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Jeff King, Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

Improve on the http.<url>.* url matching behavior by first
normalizing the urls before they are compared.

With this change, for example, the following configuration
section:

[http "https://example.com/path"]
        useragent = example-agent
        sslVerify = false

will properly match a "HTTPS://example.COM/p%61th" url which
is equivalent.

The normalization rules are based on RFC 3986 and should
result in any two equivalent urls being a match.

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
---
 Documentation/config.txt |  19 ++-
 http.c                   | 311 ++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 318 insertions(+), 12 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 41cab91..e461f32 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1517,16 +1517,15 @@ http.<url>.*::
 	Any of the http.* options above can be applied selectively to some urls.
 	For example "http.https://example.com.useragent" would set the user
 	agent only for https connections to example.com.  The <url> value
-	matches a url if it is an exact match or a prefix of the url matching
-	at a "/" boundary.  Longer <url> matches take precedence over shorter
-	ones with the environment variable settings taking precedence over all.
-	Note that <url> must match the url passed to git exactly (other than
-	possibly being a prefix).  This means any user, password and/or port
-	setting that appears in a url as well as any %XX escapes that are
-	present must also appear in <url> to have a successful match.  The urls
-	that are matched against are those given directly to git commands.  In
-	other words, use exactly the same url that was passed to git (possibly
-	shortened) for the <url> value of the config setting.
+	matches a url if it is an exact match or if it is a prefix of the url
+	matching at a "/" boundary.  Longer <url> matches take precedence over
+	shorter ones with the environment variable settings taking precedence
+	over all.  The urls are normalized before testing for a match.  Note,
+	however, that any user, password and/or port setting that appears in a
+	url must also match that part of <url> to have a successful match.  The
+	urls that are matched against are those given directly to git commands.
+	This means any urls visited as a result of a redirection do not
+	participate in matching.
 
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
diff --git a/http.c b/http.c
index 1531ffa..29e119c 100644
--- a/http.c
+++ b/http.c
@@ -169,6 +169,300 @@ static void process_curl_messages(void)
 }
 #endif
 
+#define URL_ALPHA "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
+#define URL_DIGIT "0123456789"
+#define URL_ALPHADIGIT URL_ALPHA URL_DIGIT
+#define URL_SCHEME_CHARS URL_ALPHADIGIT "+.-"
+#define URL_HOST_CHARS URL_ALPHADIGIT ".-[:]" /* IPv6 literals need [:] */
+#define URL_UNSAFE_CHARS " <>\"%{}|\\^`" /* plus 0x00-0x1F,0x7F-0xFF */
+#define URL_GEN_RESERVED ":/?#[]@"
+#define URL_SUB_RESERVED "!$&'()*+,;="
+#define URL_RESERVED URL_GEN_RESERVED URL_SUB_RESERVED /* only allowed delims */
+
+static int append_normalized_escapes(struct strbuf *buf,
+				     const char *from,
+				     size_t from_len,
+				     const char *esc_extra,
+				     const char *esc_ok)
+{
+	/*
+	 * Append to strbuf 'buf' characters from string 'from' with length
+	 * 'from_len' while unescaping characters that do not need to be escaped
+	 * and escaping characters that do.  The set of characters to escape
+	 * (the complement of which is unescaped) starts out as the RFC 3986
+	 * unsafe characters (0x00-0x1F,0x7F-0xFF," <>\"#%{}|\\^`").  If
+	 * 'esc_extra' is not NULL, those additional characters will also always
+	 * be escaped.  If 'esc_ok' is not NULL, those characters will be left
+	 * escaped if found that way, but will not be unescaped otherwise (used
+	 * for delimiters).  If a %-escape sequence is encountered that is not
+	 * followed by 2 hexadecimal digits, the sequence is invalid and
+	 * false (0) will be returned.  Otherwise true (1) will be returned for
+	 * success.
+	 *
+	 * Note that all %-escape sequences will be normalized to UPPERCASE
+	 * as indicated in RFC 3986.  Unless included in esc_extra or esc_ok
+	 * alphanumerics and "-._~" will always be unescaped as per RFC 3986.
+	 */
+
+	while (from_len) {
+		int ch = *from++;
+		int was_esc = 0;
+
+		from_len--;
+		if (ch == '%') {
+			if (from_len < 2 ||
+			    !isxdigit((unsigned char)from[0]) ||
+			    !isxdigit((unsigned char)from[1]))
+				return 0;
+			ch = hexval_table[(unsigned char)*from++] << 4;
+			ch |= hexval_table[(unsigned char)*from++];
+			from_len -= 2;
+			was_esc = 1;
+		}
+		if ((unsigned char)ch <= 0x1F || (unsigned char)ch >= 0x7F ||
+		    strchr(URL_UNSAFE_CHARS, ch) ||
+		    (esc_extra && strchr(esc_extra, ch)) ||
+		    (was_esc && strchr(esc_ok, ch)))
+			strbuf_addf(buf, "%%%02X", (unsigned char)ch);
+		else
+			strbuf_addch(buf, ch);
+	}
+
+	return 1;
+}
+
+static char *http_options_url_normalize(const char *url)
+{
+	/*
+	 * Normalize NUL-terminated url using the following rules:
+	 *
+	 * 1. Case-insensitive parts of url will be converted to lower case
+	 * 2. %-encoded characters that do not need to be will be unencoded
+	 * 3. Characters that are not %-encoded and must be will be encoded
+	 * 4. All %-encodings will be converted to upper case hexadecimal
+	 * 5. Leading 0s are removed from port numbers
+	 * 6. If the default port for the scheme is given it will be removed
+	 * 7. A path part (including empty) not starting with '/' has one added
+	 * 8. Any dot segments (. or ..) in the path are resolved and removed
+	 * 9. IPv6 host literals are allowed (but not normalized or validated)
+	 *
+	 * The rules are based on information in RFC 3986.
+	 *
+	 * Please note this function requires a full URL including a scheme
+	 * and host part (except for file: URLs which may have an empty host).
+	 *
+	 * The return value is a newly allocated string that must be freed
+	 * or NULL if the url is not valid.
+	 *
+	 * This is NOT a URL validation function.  Full URL validation is NOT
+	 * performed.  Some invalid host names are passed through this function
+	 * undetected.  However, most all other problems that make a URL invalid
+	 * will be detected (including a missing host for non file: URLs).
+	 */
+
+	size_t url_len = strlen(url);
+	struct strbuf norm;
+	size_t spanned;
+	const char *slash_ptr, *at_ptr, *colon_ptr, *path_start;
+	int found_host = 0;
+
+
+	/*
+	 * Copy lowercased scheme and :// suffix, %-escapes are not allowed
+	 * First character of scheme must be URL_ALPHA
+	 */
+	spanned = strspn(url, URL_SCHEME_CHARS);
+	if (!spanned || !isalpha(url[0]) || spanned + 3 > url_len ||
+	    url[spanned] != ':' || url[spanned+1] != '/' || url[spanned+2] != '/')
+		return NULL; /* Bad scheme and/or missing "://" part */
+	strbuf_init(&norm, url_len);
+	spanned += 3;
+	url_len -= spanned;
+	while (spanned--)
+		strbuf_addch(&norm, tolower(*url++));
+
+
+	/*
+	 * Copy any username:password if present normalizing %-escapes
+	 */
+	at_ptr = strchr(url, '@');
+	slash_ptr = url + strcspn(url, "/?#");
+	if (at_ptr && at_ptr < slash_ptr) {
+		if (at_ptr > url) {
+			if (!append_normalized_escapes(&norm, url, at_ptr - url,
+						       "", URL_RESERVED)) {
+				strbuf_release(&norm);
+				return NULL;
+			}
+		}
+		strbuf_addch(&norm, '@');
+		url_len -= (++at_ptr - url);
+		url = at_ptr;
+	}
+
+
+	/*
+	 * Copy the host part excluding any port part, no %-escapes allowed
+	 */
+	if (!url_len || strchr(":/?#", *url)) {
+		/* Missing host invalid for all URL schemes except file */
+		if (strncmp(norm.buf, "file:", 5)) {
+			strbuf_release(&norm);
+			return NULL;
+		}
+	} else {
+		found_host = 1;
+	}
+	colon_ptr = slash_ptr - 1;
+	while (colon_ptr > url && *colon_ptr != ':' && *colon_ptr != ']')
+		colon_ptr--;
+	if (*colon_ptr != ':') {
+		colon_ptr = slash_ptr;
+	} else if (!found_host && colon_ptr < slash_ptr && colon_ptr + 1 != slash_ptr) {
+		/* file: URLs may not have a port number */
+		strbuf_release(&norm);
+		return NULL;
+	}
+	spanned = strspn(url, URL_HOST_CHARS);
+	if (spanned < colon_ptr - url) {
+		/* Host name has invalid characters */
+		strbuf_release(&norm);
+		return NULL;
+	}
+	while (url < colon_ptr) {
+		strbuf_addch(&norm, tolower(*url++));
+		url_len--;
+	}
+
+
+	/*
+	 * Check the port part and copy if not the default (after removing any
+	 * leading 0s); no %-escapes allowed
+	 */
+	if (colon_ptr < slash_ptr) {
+		/* skip the ':' and leading 0s but not the last one if all 0s */
+		url++;
+		url += strspn(url, "0");
+		if (url == slash_ptr && url[-1] == '0')
+			url--;
+		if (url == slash_ptr) {
+			/* Skip ":" port with no number, it's same as default */
+		} else if (slash_ptr - url == 2 &&
+			   !strncmp(norm.buf, "http:", 5) &&
+			   !strncmp(url, "80", 2)) {
+			/* Skip http :80 as it's the default */
+		} else if (slash_ptr - url == 3 &&
+			   !strncmp(norm.buf, "https:", 6) &&
+			   !strncmp(url, "443", 3)) {
+			/* Skip https :443 as it's the default */
+		} else {
+			/*
+			 * Port number must be all digits with leading 0s removed
+			 * and since all the protocols we deal with have a 16-bit
+			 * port number it must also be in the range 1..65535
+			 * 0 is not allowed because that means "next available"
+			 * on just about every system and therefore cannot be used
+			 */
+			unsigned long pnum = 0;
+			spanned = strspn(url, URL_DIGIT);
+			if (spanned < slash_ptr - url) {
+				/* port number has invalid characters */
+				strbuf_release(&norm);
+				return NULL;
+			}
+			if (slash_ptr - url <= 5)
+				pnum = strtoul(url, NULL, 10);
+			if (pnum == 0 || pnum > 65535) {
+				/* port number not in range 1..65535 */
+				strbuf_release(&norm);
+				return NULL;
+			}
+			strbuf_addch(&norm, ':');
+			strbuf_add(&norm, url, slash_ptr - url);
+		}
+		url_len -= slash_ptr - colon_ptr;
+		url = slash_ptr;
+	}
+
+
+	/*
+	 * Now copy the path resolving any . and .. segments being careful not
+	 * to corrupt the URL by unescaping any delimiters, but do add an
+	 * initial '/' if it's missing and do normalize any %-escape sequences.
+	 */
+	path_start = norm.buf + norm.len;
+	strbuf_addch(&norm, '/');
+	if (*url == '/') {
+		url++;
+		url_len--;
+	}
+	for (;;) {
+		const char *seg_start = norm.buf + norm.len;
+		const char *next_slash = url + strcspn(url, "/?#");
+		int skip_add_slash = 0;
+		/*
+		 * RFC 3689 indicates that any . or .. segments should be
+		 * unescaped before being checked for.
+		 */
+		if (!append_normalized_escapes(&norm, url, next_slash - url, "",
+					       URL_RESERVED)) {
+			strbuf_release(&norm);
+			return NULL;
+		}
+		if (!strcmp(seg_start, ".")) {
+			/* ignore a . segment; be careful not to remove initial '/' */
+			if (seg_start == path_start + 1) {
+				strbuf_setlen(&norm, norm.len - 1);
+				skip_add_slash = 1;
+			} else {
+				strbuf_setlen(&norm, norm.len - 2);
+			}
+		} else if (!strcmp(seg_start, "..")) {
+			/*
+			 * ignore a .. segment and remove the previous segment;
+			 * be careful not to remove initial '/' from path
+			 */
+			const char *prev_slash = norm.buf + norm.len - 3;
+			if (prev_slash == path_start) {
+				/* invalid .. because no previous segment to remove */
+				strbuf_release(&norm);
+				return NULL;
+			}
+			while (*--prev_slash != '/') {}
+			if (prev_slash == path_start) {
+				strbuf_setlen(&norm, prev_slash - norm.buf + 1);
+				skip_add_slash = 1;
+			} else {
+				strbuf_setlen(&norm, prev_slash - norm.buf);
+			}
+		}
+		url_len -= next_slash - url;
+		url = next_slash;
+		/* if the next char is not '/' done with the path */
+		if (*url != '/')
+			break;
+		url++;
+		url_len--;
+		if (!skip_add_slash)
+			strbuf_addch(&norm, '/');
+	}
+
+
+	/*
+	 * Now simply copy the rest, if any, only normalizing %-escapes and
+	 * being careful not to corrupt the URL by unescaping any delimiters.
+	 */
+	if (*url) {
+		if (!append_normalized_escapes(&norm, url, url_len, "", URL_RESERVED)) {
+			strbuf_release(&norm);
+			return NULL;
+		}
+	}
+
+
+	return strbuf_detach(&norm, NULL);
+}
+
 static size_t http_options_url_match_prefix(const char *url,
 					    const char *url_prefix,
 					    size_t url_prefix_len)
@@ -185,8 +479,13 @@ static size_t http_options_url_match_prefix(const char *url,
 	 * The return value is the length of the match in characters (excluding
 	 * any final '/') or 0 for no match.  Passing "/" as url_prefix will
 	 * always cause 0 to be returned.
+	 *
+	 * Passing NULL as url and/or url_prefix will always cause 0 to be
+	 * returned without causing any faults.
 	 */
 	size_t url_len;
+	if (!url || !url_prefix)
+		return 0;
 	if (url_prefix_len && url_prefix[url_prefix_len - 1] == '/')
 		url_prefix_len--;
 	if (!url_prefix_len || strncmp(url, url_prefix, url_prefix_len))
@@ -233,7 +532,13 @@ static int http_options(const char *var, const char *value, void *cb)
 	 */
 	dot = strrchr(key, '.');
 	if (dot) {
-		matchlen = http_options_url_match_prefix(url, key, dot - key);
+		char *config_url = xmemdupz(key, dot - key);
+		char *norm_url = http_options_url_normalize(config_url);
+		free(config_url);
+		if (!norm_url)
+			return 0;
+		matchlen = http_options_url_match_prefix(url, norm_url, strlen(norm_url));
+		free(norm_url);
 		if (!matchlen)
 			return 0;
 		key = dot + 1;
@@ -469,11 +774,13 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
 	char *low_speed_time;
+	char *norm_url = http_options_url_normalize(url);
 
 	http_is_verbose = 0;
 
 	memset(http_option_max_matched_len, 0, sizeof(http_option_max_matched_len));
-	git_config(http_options, (void *)url);
+	git_config(http_options, norm_url);
+	free(norm_url);
 
 	curl_global_init(CURL_GLOBAL_ALL);
 
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v7 3/4] tests: add new test for the url_normalize function
  2013-07-22  3:18 [PATCH v7 0/4] config: add support for http.<url>.* settings Kyle J. McKay
  2013-07-22  3:18 ` [PATCH v7 1/4] " Kyle J. McKay
  2013-07-22  3:18 ` [PATCH v7 2/4] config: improve " Kyle J. McKay
@ 2013-07-22  3:18 ` Kyle J. McKay
  2013-07-22  5:15   ` Junio C Hamano
  2013-07-22  6:45   ` Junio C Hamano
  2013-07-22  3:18 ` [PATCH v7 4/4] config: allow http.<url>.* any user matching Kyle J. McKay
  3 siblings, 2 replies; 11+ messages in thread
From: Kyle J. McKay @ 2013-07-22  3:18 UTC (permalink / raw)
  To: git
  Cc: David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Jeff King, Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

In order to perform sane URL matching for http.<url>.* options,
http.c normalizes URLs before performing matches.

A new test-url-normalize test program is introduced along with
a new t5200-url-normalize.sh script to run the tests.

Since the url_normalize function currently lives in http.c this
test will be skipped if NO_CURL is defined since http.c is skipped
in that case.

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
---
 .gitignore               |   1 +
 Makefile                 |   5 ++
 t/t5200-url-normalize.sh | 182 +++++++++++++++++++++++++++++++++++++++++++++++
 t/t5200/README           |   9 +++
 t/t5200/url-1            |   1 +
 t/t5200/url-10           |   1 +
 t/t5200/url-11           |   1 +
 t/t5200/url-2            |   1 +
 t/t5200/url-3            |   1 +
 t/t5200/url-4            |   1 +
 t/t5200/url-5            |   1 +
 t/t5200/url-6            |   1 +
 t/t5200/url-7            |   1 +
 t/t5200/url-8            |   1 +
 t/t5200/url-9            |   1 +
 test-url-normalize.c     |  61 ++++++++++++++++
 16 files changed, 269 insertions(+)
 create mode 100755 t/t5200-url-normalize.sh
 create mode 100644 t/t5200/README
 create mode 100644 t/t5200/url-1
 create mode 100644 t/t5200/url-10
 create mode 100644 t/t5200/url-11
 create mode 100644 t/t5200/url-2
 create mode 100644 t/t5200/url-3
 create mode 100644 t/t5200/url-4
 create mode 100644 t/t5200/url-5
 create mode 100644 t/t5200/url-6
 create mode 100644 t/t5200/url-7
 create mode 100644 t/t5200/url-8
 create mode 100644 t/t5200/url-9
 create mode 100644 test-url-normalize.c

diff --git a/.gitignore b/.gitignore
index 6669bf0..cd97e16 100644
--- a/.gitignore
+++ b/.gitignore
@@ -198,6 +198,7 @@
 /test-string-list
 /test-subprocess
 /test-svn-fe
+/test-url-normalize
 /test-wildmatch
 /common-cmds.h
 *.tar.gz
diff --git a/Makefile b/Makefile
index 0f931a2..f83879c 100644
--- a/Makefile
+++ b/Makefile
@@ -567,6 +567,7 @@ TEST_PROGRAMS_NEED_X += test-sigchain
 TEST_PROGRAMS_NEED_X += test-string-list
 TEST_PROGRAMS_NEED_X += test-subprocess
 TEST_PROGRAMS_NEED_X += test-svn-fe
+TEST_PROGRAMS_NEED_X += test-url-normalize
 TEST_PROGRAMS_NEED_X += test-wildmatch
 
 TEST_PROGRAMS = $(patsubst %,%$X,$(TEST_PROGRAMS_NEED_X))
@@ -2235,6 +2236,10 @@ test-parse-options$X: parse-options.o parse-options-cb.o
 
 test-svn-fe$X: vcs-svn/lib.a
 
+test-url-normalize$X: test-url-normalize.o GIT-LDFLAGS $(GITLIBS)
+	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
+		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
+
 .PRECIOUS: $(TEST_OBJS)
 
 test-%$X: test-%.o GIT-LDFLAGS $(GITLIBS)
diff --git a/t/t5200-url-normalize.sh b/t/t5200-url-normalize.sh
new file mode 100755
index 0000000..12ac411
--- /dev/null
+++ b/t/t5200-url-normalize.sh
@@ -0,0 +1,182 @@
+#!/bin/sh
+
+test_description='url normalization'
+. ./test-lib.sh
+
+if test -n "$NO_CURL"; then
+	skip_all='skipping test, git built without http support'
+	test_done
+fi
+
+# The base name of the test url files
+tu="$TEST_DIRECTORY/t5200/url"
+
+# Note that only file: URLs should be allowed without a host
+
+test_expect_success 'url scheme' '
+	! test-url-normalize "" &&
+	! test-url-normalize "_" &&
+	! test-url-normalize "scheme" &&
+	! test-url-normalize "scheme:" &&
+	! test-url-normalize "scheme:/" &&
+	! test-url-normalize "scheme://" &&
+	! test-url-normalize "file" &&
+	! test-url-normalize "file:" &&
+	! test-url-normalize "file:/" &&
+	test-url-normalize "file://" &&
+	! test-url-normalize "://acme.co" &&
+	! test-url-normalize "x_test://acme.co" &&
+	! test-url-normalize "-test://acme.co" &&
+	! test-url-normalize "0test://acme.co" &&
+	! test-url-normalize "+test://acme.co" &&
+	! test-url-normalize ".test://acme.co" &&
+	! test-url-normalize "schem%6e://" &&
+	test-url-normalize "x-Test+v1.0://acme.co" &&
+	test "$(test-url-normalize -p "AbCdeF://x.Y")" = "abcdef://x.y/"
+'
+
+test_expect_success 'url authority' '
+	! test-url-normalize "scheme://user:pass@" &&
+	! test-url-normalize "scheme://?" &&
+	! test-url-normalize "scheme://#" &&
+	! test-url-normalize "scheme:///" &&
+	! test-url-normalize "scheme://:" &&
+	! test-url-normalize "scheme://:555" &&
+	test-url-normalize "file://user:pass@" &&
+	test-url-normalize "file://?" &&
+	test-url-normalize "file://#" &&
+	test-url-normalize "file:///" &&
+	test-url-normalize "file://:" &&
+	! test-url-normalize "file://:555" &&
+	test-url-normalize "scheme://user:pass@host" &&
+	test-url-normalize "scheme://@host" &&
+	test-url-normalize "scheme://%00@host" &&
+	! test-url-normalize "scheme://%%@host" &&
+	! test-url-normalize "scheme://host_" &&
+	test-url-normalize "scheme://user:pass@host/" &&
+	test-url-normalize "scheme://@host/" &&
+	test-url-normalize "scheme://host/" &&
+	test-url-normalize "scheme://host?x" &&
+	test-url-normalize "scheme://host#x" &&
+	test-url-normalize "scheme://host/@" &&
+	test-url-normalize "scheme://host?@x" &&
+	test-url-normalize "scheme://host#@x" &&
+	test-url-normalize "scheme://[::1]" &&
+	test-url-normalize "scheme://[::1]/" &&
+	! test-url-normalize "scheme://hos%41/" &&
+	test-url-normalize "scheme://[invalid....:/" &&
+	test-url-normalize "scheme://invalid....:]/" &&
+	! test-url-normalize "scheme://invalid....:[/" &&
+	! test-url-normalize "scheme://invalid....:["
+'
+
+test_expect_success 'url port checks' '
+	test-url-normalize "xyz://q@some.host:" &&
+	test-url-normalize "xyz://q@some.host:456/" &&
+	! test-url-normalize "xyz://q@some.host:0" &&
+	! test-url-normalize "xyz://q@some.host:0000000" &&
+	test-url-normalize "xyz://q@some.host:0000001?" &&
+	test-url-normalize "xyz://q@some.host:065535#" &&
+	test-url-normalize "xyz://q@some.host:65535" &&
+	! test-url-normalize "xyz://q@some.host:65536" &&
+	! test-url-normalize "xyz://q@some.host:99999" &&
+	! test-url-normalize "xyz://q@some.host:100000" &&
+	! test-url-normalize "xyz://q@some.host:100001" &&
+	test-url-normalize "http://q@some.host:80" &&
+	test-url-normalize "https://q@some.host:443" &&
+	test-url-normalize "http://q@some.host:80/" &&
+	test-url-normalize "https://q@some.host:443?" &&
+	! test-url-normalize "http://q@:8008" &&
+	! test-url-normalize "http://:8080" &&
+	! test-url-normalize "http://:" &&
+	test-url-normalize "xyz://q@some.host:456/" &&
+	test-url-normalize "xyz://[::1]:456/" &&
+	test-url-normalize "xyz://[::1]:/" &&
+	! test-url-normalize "xyz://[::1]:000/" &&
+	! test-url-normalize "xyz://[::1]:0%300/" &&
+	! test-url-normalize "xyz://[::1]:0x80/" &&
+	! test-url-normalize "xyz://[::1]:4294967297/" &&
+	! test-url-normalize "xyz://[::1]:030f/"
+'
+
+test_expect_success 'url port normalization' '
+	test "$(test-url-normalize -p "http://x:800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:0800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:00000800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:065535")" = "http://x:65535/" &&
+	test "$(test-url-normalize -p "http://x:1")" = "http://x:1/" &&
+	test "$(test-url-normalize -p "http://x:80")" = "http://x/" &&
+	test "$(test-url-normalize -p "http://x:080")" = "http://x/" &&
+	test "$(test-url-normalize -p "http://x:000000080")" = "http://x/" &&
+	test "$(test-url-normalize -p "https://x:443")" = "https://x/" &&
+	test "$(test-url-normalize -p "https://x:0443")" = "https://x/" &&
+	test "$(test-url-normalize -p "https://x:000000443")" = "https://x/"
+'
+
+test_expect_success 'url general escapes' '
+	! test-url-normalize "http://x.y?%fg" &&
+	test "$(test-url-normalize -p "X://W/%7e%41^%3a")" = "x://w/~A%5E%3A" &&
+	test "$(test-url-normalize -p "X://W/:/?#[]@")" = "x://w/:/?#[]@" &&
+	test "$(test-url-normalize -p "X://W/$&()*+,;=")" = "x://w/$&()*+,;=" &&
+	test "$(test-url-normalize -p "X://W/'\''")" = "x://w/'\''" &&
+	test "$(test-url-normalize -p "X://W?'\!'")" = "x://w/?'\!'"
+';#'
+
+test_expect_success 'url high-bit escapes' '
+	test "$(test-url-normalize -p "$(cat "$tu-1")")" = "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12" &&
+	test "$(test-url-normalize -p "$(cat "$tu-2")")" = "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F" &&
+	test "$(test-url-normalize -p "$(cat "$tu-3")")" = "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F" &&
+	test "$(test-url-normalize -p "$(cat "$tu-4")")" = "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F" &&
+	test "$(test-url-normalize -p "$(cat "$tu-5")")" = "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-6")")" = "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-7")")" = "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-8")")" = "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-9")")" = "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-10")")" = "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-11")")" = "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD"
+'
+
+test_expect_success 'url username/password escapes' '
+	test "$(test-url-normalize -p "x://%41%62(^):%70+d@foo")" = "x://Ab(%5E):p+d@foo/"
+'
+
+test_expect_success 'url normalized lengths' '
+	test "$(test-url-normalize -l "Http://%4d%65:%4d^%70@The.Host")" = 25 &&
+	test "$(test-url-normalize -l "http://%41:%42@x.y/%61/")" = 17 &&
+	test "$(test-url-normalize -l "http://@x.y/^")" = 15
+'
+
+test_expect_success 'url . and .. segments' '
+	test "$(test-url-normalize -p "x://y/.")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/./")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/a/.")" = "x://y/a" &&
+	test "$(test-url-normalize -p "x://y/a/./")" = "x://y/a/" &&
+	test "$(test-url-normalize -p "x://y/.?")" = "x://y/?" &&
+	test "$(test-url-normalize -p "x://y/./?")" = "x://y/?" &&
+	test "$(test-url-normalize -p "x://y/a/.?")" = "x://y/a?" &&
+	test "$(test-url-normalize -p "x://y/a/./?")" = "x://y/a/?" &&
+	test "$(test-url-normalize -p "x://y/a/./b/.././../c")" = "x://y/c" &&
+	test "$(test-url-normalize -p "x://y/a/./b/../.././c/")" = "x://y/c/" &&
+	test "$(test-url-normalize -p "x://y/a/./b/.././../c/././.././.")" = "x://y/" &&
+	! test-url-normalize "x://y/a/./b/.././../c/././.././.." &&
+	test "$(test-url-normalize -p "x://y/a/./?/././..")" = "x://y/a/?/././.." &&
+	test "$(test-url-normalize -p "x://y/%2e/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/%2E/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/a/%2e./")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/b/.%2E/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/c/%2e%2E/")" = "x://y/"
+'
+
+# http://@foo specifies an empty user name but does not specify a password
+# http://foo  specifies neither a user name nor a password
+# So they should not be equivalent
+test_expect_success 'url equivalents' '
+	test-url-normalize "httP://x" "Http://X/" &&
+	test-url-normalize "Http://%4d%65:%4d^%70@The.Host" "hTTP://Me:%4D^p@the.HOST:80/" &&
+	! test-url-normalize "https://@x.y/^" "httpS://x.y:443/^" &&
+	test-url-normalize "https://@x.y/^" "httpS://@x.y:0443/^" &&
+	test-url-normalize "https://@x.y/^/../abc" "httpS://@x.y:0443/abc" &&
+	test-url-normalize "https://@x.y/^/.." "httpS://@x.y:0443/"
+'
+
+test_done
diff --git a/t/t5200/README b/t/t5200/README
new file mode 100644
index 0000000..b908fe9
--- /dev/null
+++ b/t/t5200/README
@@ -0,0 +1,9 @@
+The data files in this directory contain URLs with characters in
+the range 0x01-0x1f and 0x7f-0xff to test the proper normalization
+of unprintable characters.
+
+A select few characters in the 0x01-0x1f range are skipped to
+help avoid problems running the test itself.
+
+They are in test files in this directory rather than being
+embedded in the test script for portability.
diff --git a/t/t5200/url-1 b/t/t5200/url-1
new file mode 100644
index 0000000..519019c
--- /dev/null
+++ b/t/t5200/url-1
@@ -0,0 +1 @@
+x://q/\x01\x02\x03\x04\x05\x06\a\b\x0e\x0f\x10\x11\x12
diff --git a/t/t5200/url-10 b/t/t5200/url-10
new file mode 100644
index 0000000..b9965de
--- /dev/null
+++ b/t/t5200/url-10
@@ -0,0 +1 @@
+x://q/ðñòóôõö÷øùúûüýþÿ
diff --git a/t/t5200/url-11 b/t/t5200/url-11
new file mode 100644
index 0000000..f0a50f1
--- /dev/null
+++ b/t/t5200/url-11
@@ -0,0 +1 @@
+x://q/Â€ß¿à €ï¿½ð€€ð¯¿½
diff --git a/t/t5200/url-2 b/t/t5200/url-2
new file mode 100644
index 0000000..43334b0
--- /dev/null
+++ b/t/t5200/url-2
@@ -0,0 +1 @@
+x://q/\x13\x14\x15\x16\x17\x18\x19^[\x1c\x1d\x1e\x1f\x7f
diff --git a/t/t5200/url-3 b/t/t5200/url-3
new file mode 100644
index 0000000..7378c7b
--- /dev/null
+++ b/t/t5200/url-3
@@ -0,0 +1 @@
+x://q/€‚ƒ„…†‡ˆ‰Š‹ŒŽ
diff --git a/t/t5200/url-4 b/t/t5200/url-4
new file mode 100644
index 0000000..220b198
--- /dev/null
+++ b/t/t5200/url-4
@@ -0,0 +1 @@
+x://q/‘’“”•–—˜™š›œžŸ
diff --git a/t/t5200/url-5 b/t/t5200/url-5
new file mode 100644
index 0000000..1ccd927
--- /dev/null
+++ b/t/t5200/url-5
@@ -0,0 +1 @@
+x://q/ ¡¢£¤¥¦§¨©ª«¬®¯
diff --git a/t/t5200/url-6 b/t/t5200/url-6
new file mode 100644
index 0000000..e8283aa
--- /dev/null
+++ b/t/t5200/url-6
@@ -0,0 +1 @@
+x://q/°±²³´µ¶·¸¹º»¼½¾¿
diff --git a/t/t5200/url-7 b/t/t5200/url-7
new file mode 100644
index 0000000..fa7c10b
--- /dev/null
+++ b/t/t5200/url-7
@@ -0,0 +1 @@
+x://q/ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏ
diff --git a/t/t5200/url-8 b/t/t5200/url-8
new file mode 100644
index 0000000..79a0ba8
--- /dev/null
+++ b/t/t5200/url-8
@@ -0,0 +1 @@
+x://q/ÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞß
diff --git a/t/t5200/url-9 b/t/t5200/url-9
new file mode 100644
index 0000000..8b44bec
--- /dev/null
+++ b/t/t5200/url-9
@@ -0,0 +1 @@
+x://q/àáâãäåæçèéêëìíîï
diff --git a/test-url-normalize.c b/test-url-normalize.c
new file mode 100644
index 0000000..d68312d
--- /dev/null
+++ b/test-url-normalize.c
@@ -0,0 +1,61 @@
+#ifdef NO_CURL
+
+int main()
+{
+	return 125;
+}
+
+#else /* !NO_CURL */
+
+#include "http.c"
+
+#define url_normalize(u) http_options_url_normalize(u)
+
+int main(int argc, char **argv)
+{
+	const char *usage = "test-url-normalize [-p | -l] <url1> | <url1> <url2>";
+	char *url1, *url2;
+	int opt_p = 0, opt_l = 0;
+
+	/*
+	 * For one url, succeed if url_normalize succeeds on it, fail otherwise.
+	 * For two urls, succeed only if url_normalize succeeds on both and
+	 * the results compare equal with strcmp.  If -p is given (one url only)
+	 * and url_normalize succeeds, print the result followed by "\n".  If
+	 * -l is given (one url only) and url_normalize succeeds, print the
+	 * returned length in decimal followed by "\n".
+	 */
+
+	if (argc > 1 && !strcmp(argv[1], "-p")) {
+		opt_p = 1;
+		argc--;
+		argv++;
+	} else if (argc > 1 && !strcmp(argv[1], "-l")) {
+		opt_l = 1;
+		argc--;
+		argv++;
+	}
+
+	if (argc < 2 || argc > 3)
+		die(usage);
+
+	if (argc == 2) {
+		url1 = url_normalize(argv[1]);
+		if (!url1)
+			return 1;
+		if (opt_p)
+			printf("%s\n", url1);
+		if (opt_l)
+			printf("%u\n", (unsigned)strlen(url1));
+		return 0;
+	}
+
+	if (opt_p || opt_l)
+		die(usage);
+
+	url1 = url_normalize(argv[1]);
+	url2 = url_normalize(argv[2]);
+	return (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
+}
+
+#endif /* !NO_CURL */
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v7 4/4] config: allow http.<url>.* any user matching
  2013-07-22  3:18 [PATCH v7 0/4] config: add support for http.<url>.* settings Kyle J. McKay
                   ` (2 preceding siblings ...)
  2013-07-22  3:18 ` [PATCH v7 3/4] tests: add new test for the url_normalize function Kyle J. McKay
@ 2013-07-22  3:18 ` Kyle J. McKay
  2013-07-22  5:28   ` Junio C Hamano
  3 siblings, 1 reply; 11+ messages in thread
From: Kyle J. McKay @ 2013-07-22  3:18 UTC (permalink / raw)
  To: git
  Cc: David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Jeff King, Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

Previously the <url> had to specify an exactly matching user name
and password if those were present in the url being matched against.

Now the password portion is always ignored and omitting the user
name from <url> allows it to match against any user name.

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
---
 Documentation/config.txt |  23 ++--
 http.c                   | 281 +++++++++++++++++++++++++++++++++++++++--------
 test-url-normalize.c     |  11 +-
 3 files changed, 255 insertions(+), 60 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e461f32..8b32a15 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1517,15 +1517,20 @@ http.<url>.*::
 	Any of the http.* options above can be applied selectively to some urls.
 	For example "http.https://example.com.useragent" would set the user
 	agent only for https connections to example.com.  The <url> value
-	matches a url if it is an exact match or if it is a prefix of the url
-	matching at a "/" boundary.  Longer <url> matches take precedence over
-	shorter ones with the environment variable settings taking precedence
-	over all.  The urls are normalized before testing for a match.  Note,
-	however, that any user, password and/or port setting that appears in a
-	url must also match that part of <url> to have a successful match.  The
-	urls that are matched against are those given directly to git commands.
-	This means any urls visited as a result of a redirection do not
-	participate in matching.
+	matches a url if it refers to the same scheme, host and port and the
+	path portion is an exact match or a prefix that matches at a "/"
+	boundary.  If <url> does not include a user name, it will match a url
+	with any username otherwise the user name must match as well (the
+	password part, if present in the url, is always ignored).  Longer <url>
+	path matches take precedence over shorter matches no matter what order
+	they occur in.  For same length matches, the last one wins except that a
+	same-length <url> match that includes a user name will be preferred over
+	a same-length <url> match that does not.  The urls are normalized before
+	matching so that equivalent urls that are simply spelled differently
+	will match properly.  Environment variable settings always override any
+	matches.  The urls that are matched against are those given directly to
+	git commands.  This means any urls visited as a result of a redirection
+	do not participate in matching.
 
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
diff --git a/http.c b/http.c
index 29e119c..c636d3c 100644
--- a/http.c
+++ b/http.c
@@ -56,7 +56,35 @@ enum http_option_type {
 	OPT_MAX
 };
 
+struct url_info {
+	char *url;		/* normalized url on success, must be freed, otherwise NULL */
+	const char *err;	/* if !url, a brief reason for the failure, otherwise NULL */
+
+	/* the rest of the fields are only set if url != NULL */
+
+	size_t url_len;		/* total length of url (which is now normalized) */
+	size_t scheme_len;	/* length of scheme name (excluding final :) */
+	size_t user_off;	/* offset into url to start of user name (0 => none) */
+	size_t user_len;	/* length of user name; if user_off != 0 but
+				   user_len == 0, an empty user name was given */
+	size_t passwd_off;	/* offset into url to start of passwd (0 => none) */
+	size_t passwd_len;	/* length of passwd; if passwd_off != 0 but
+				   passwd_len == 0, an empty passwd was given */
+	size_t host_off;	/* offset into url to start of host name (0 => none) */
+	size_t host_len;	/* length of host name; this INCLUDES any ':portnum';
+				 * file urls may have host_len == 0 */
+	size_t port_len;	/* if a portnum is present (port_len != 0), it has
+				 * this length (excluding the leading ':') at the
+				 * end of the host name (always 0 for file urls) */
+	size_t path_off;	/* offset into url to the start of the url path;
+				 * this will always point to a '/' character
+				 * after the url has been normalized */
+	size_t path_len;	/* length of path portion excluding any trailing
+				 * '?...' and '#...' portion; will always be >= 1 */
+};
+
 static size_t http_option_max_matched_len[OPT_MAX];
+static int http_option_user_matched[OPT_MAX];
 
 static int curl_ssl_verify = -1;
 static int curl_ssl_try;
@@ -231,7 +259,7 @@ static int append_normalized_escapes(struct strbuf *buf,
 	return 1;
 }
 
-static char *http_options_url_normalize(const char *url)
+static char *http_options_url_normalize(const char *url, struct url_info *out_info)
 {
 	/*
 	 * Normalize NUL-terminated url using the following rules:
@@ -254,6 +282,14 @@ static char *http_options_url_normalize(const char *url)
 	 * The return value is a newly allocated string that must be freed
 	 * or NULL if the url is not valid.
 	 *
+	 * If out_info is non-NULL, the url and err fields therein will always
+	 * be set.  If a non-NULL value is returned, it will be stored in
+	 * out_info->url as well, out_info->err will be set to NULL and the
+	 * other fields of *out_info will also be filled in.  If a NULL value
+	 * is returned, NULL will be stored in out_info->url and out_info->err
+	 * will be set to a brief, translated, error message, but no other
+	 * fields will be filled in.
+	 *
 	 * This is NOT a URL validation function.  Full URL validation is NOT
 	 * performed.  Some invalid host names are passed through this function
 	 * undetected.  However, most all other problems that make a URL invalid
@@ -263,9 +299,10 @@ static char *http_options_url_normalize(const char *url)
 	size_t url_len = strlen(url);
 	struct strbuf norm;
 	size_t spanned;
+	size_t scheme_len, user_off=0, user_len=0, passwd_off=0, passwd_len=0;
+	size_t host_off=0, host_len=0, port_len=0, path_off, path_len, result_len;
 	const char *slash_ptr, *at_ptr, *colon_ptr, *path_start;
-	int found_host = 0;
-
+	char *result;
 
 	/*
 	 * Copy lowercased scheme and :// suffix, %-escapes are not allowed
@@ -273,9 +310,15 @@ static char *http_options_url_normalize(const char *url)
 	 */
 	spanned = strspn(url, URL_SCHEME_CHARS);
 	if (!spanned || !isalpha(url[0]) || spanned + 3 > url_len ||
-	    url[spanned] != ':' || url[spanned+1] != '/' || url[spanned+2] != '/')
+	    url[spanned] != ':' || url[spanned+1] != '/' || url[spanned+2] != '/') {
+		if (out_info) {
+			out_info->url = NULL;
+			out_info->err = _("invalid URL scheme name or missing '://' suffix");
+		}
 		return NULL; /* Bad scheme and/or missing "://" part */
+	}
 	strbuf_init(&norm, url_len);
+	scheme_len = spanned;
 	spanned += 3;
 	url_len -= spanned;
 	while (spanned--)
@@ -288,12 +331,25 @@ static char *http_options_url_normalize(const char *url)
 	at_ptr = strchr(url, '@');
 	slash_ptr = url + strcspn(url, "/?#");
 	if (at_ptr && at_ptr < slash_ptr) {
+		user_off = norm.len;
 		if (at_ptr > url) {
 			if (!append_normalized_escapes(&norm, url, at_ptr - url,
 						       "", URL_RESERVED)) {
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid %XX escape sequence");
+				}
 				strbuf_release(&norm);
 				return NULL;
 			}
+			colon_ptr = strchr(norm.buf + scheme_len + 3, ':');
+			if (colon_ptr) {
+				passwd_off = (colon_ptr + 1) - norm.buf;
+				passwd_len = norm.len - passwd_off;
+				user_len = (passwd_off - 1) - (scheme_len + 3);
+			} else {
+				user_len = norm.len - (scheme_len + 3);
+			}
 		}
 		strbuf_addch(&norm, '@');
 		url_len -= (++at_ptr - url);
@@ -307,25 +363,37 @@ static char *http_options_url_normalize(const char *url)
 	if (!url_len || strchr(":/?#", *url)) {
 		/* Missing host invalid for all URL schemes except file */
 		if (strncmp(norm.buf, "file:", 5)) {
+			if (out_info) {
+				out_info->url = NULL;
+				out_info->err = _("missing host and scheme is not 'file:'");
+			}
 			strbuf_release(&norm);
 			return NULL;
 		}
 	} else {
-		found_host = 1;
+		host_off = norm.len;
 	}
 	colon_ptr = slash_ptr - 1;
 	while (colon_ptr > url && *colon_ptr != ':' && *colon_ptr != ']')
 		colon_ptr--;
 	if (*colon_ptr != ':') {
 		colon_ptr = slash_ptr;
-	} else if (!found_host && colon_ptr < slash_ptr && colon_ptr + 1 != slash_ptr) {
+	} else if (!host_off && colon_ptr < slash_ptr && colon_ptr + 1 != slash_ptr) {
 		/* file: URLs may not have a port number */
+		if (out_info) {
+			out_info->url = NULL;
+			out_info->err = _("a 'file:' URL may not have a port number");
+		}
 		strbuf_release(&norm);
 		return NULL;
 	}
 	spanned = strspn(url, URL_HOST_CHARS);
 	if (spanned < colon_ptr - url) {
 		/* Host name has invalid characters */
+		if (out_info) {
+			out_info->url = NULL;
+			out_info->err = _("invalid characters in host name");
+		}
 		strbuf_release(&norm);
 		return NULL;
 	}
@@ -367,6 +435,10 @@ static char *http_options_url_normalize(const char *url)
 			spanned = strspn(url, URL_DIGIT);
 			if (spanned < slash_ptr - url) {
 				/* port number has invalid characters */
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid port number");
+				}
 				strbuf_release(&norm);
 				return NULL;
 			}
@@ -374,15 +446,22 @@ static char *http_options_url_normalize(const char *url)
 				pnum = strtoul(url, NULL, 10);
 			if (pnum == 0 || pnum > 65535) {
 				/* port number not in range 1..65535 */
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid port number");
+				}
 				strbuf_release(&norm);
 				return NULL;
 			}
 			strbuf_addch(&norm, ':');
 			strbuf_add(&norm, url, slash_ptr - url);
+			port_len = slash_ptr - url;
 		}
 		url_len -= slash_ptr - colon_ptr;
 		url = slash_ptr;
 	}
+	if (host_off)
+		host_len = norm.len - host_off;
 
 
 	/*
@@ -390,7 +469,8 @@ static char *http_options_url_normalize(const char *url)
 	 * to corrupt the URL by unescaping any delimiters, but do add an
 	 * initial '/' if it's missing and do normalize any %-escape sequences.
 	 */
-	path_start = norm.buf + norm.len;
+	path_off = norm.len;
+	path_start = norm.buf + path_off;
 	strbuf_addch(&norm, '/');
 	if (*url == '/') {
 		url++;
@@ -406,6 +486,10 @@ static char *http_options_url_normalize(const char *url)
 		 */
 		if (!append_normalized_escapes(&norm, url, next_slash - url, "",
 					       URL_RESERVED)) {
+			if (out_info) {
+				out_info->url = NULL;
+				out_info->err = _("invalid %XX escape sequence");
+			}
 			strbuf_release(&norm);
 			return NULL;
 		}
@@ -425,6 +509,10 @@ static char *http_options_url_normalize(const char *url)
 			const char *prev_slash = norm.buf + norm.len - 3;
 			if (prev_slash == path_start) {
 				/* invalid .. because no previous segment to remove */
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid '..' path segment");
+				}
 				strbuf_release(&norm);
 				return NULL;
 			}
@@ -446,6 +534,7 @@ static char *http_options_url_normalize(const char *url)
 		if (!skip_add_slash)
 			strbuf_addch(&norm, '/');
 	}
+	path_len = norm.len - path_off;
 
 
 	/*
@@ -454,13 +543,33 @@ static char *http_options_url_normalize(const char *url)
 	 */
 	if (*url) {
 		if (!append_normalized_escapes(&norm, url, url_len, "", URL_RESERVED)) {
+			if (out_info) {
+				out_info->url = NULL;
+				out_info->err = _("invalid %XX escape sequence");
+			}
 			strbuf_release(&norm);
 			return NULL;
 		}
 	}
 
 
-	return strbuf_detach(&norm, NULL);
+	result = strbuf_detach(&norm, &result_len);
+	if (out_info) {
+		out_info->url = result;
+		out_info->err = NULL;
+		out_info->url_len = result_len;
+		out_info->scheme_len = scheme_len;
+		out_info->user_off = user_off;
+		out_info->user_len = user_len;
+		out_info->passwd_off = passwd_off;
+		out_info->passwd_len = passwd_len;
+		out_info->host_off = host_off;
+		out_info->host_len = host_len;
+		out_info->port_len = port_len;
+		out_info->path_off = path_off;
+		out_info->path_len = path_len;
+	}
+	return result;
 }
 
 static size_t http_options_url_match_prefix(const char *url,
@@ -476,48 +585,120 @@ static size_t http_options_url_match_prefix(const char *url,
 	 * url must be NUL terminated.  url_prefix_len is the length of
 	 * url_prefix which need not be NUL terminated.
 	 *
-	 * The return value is the length of the match in characters (excluding
-	 * any final '/') or 0 for no match.  Passing "/" as url_prefix will
-	 * always cause 0 to be returned.
+	 * The return value is the length of the match in characters (including
+	 * the final '/' even if it's implicit) or 0 for no match.
 	 *
 	 * Passing NULL as url and/or url_prefix will always cause 0 to be
 	 * returned without causing any faults.
 	 */
-	size_t url_len;
 	if (!url || !url_prefix)
 		return 0;
-	if (url_prefix_len && url_prefix[url_prefix_len - 1] == '/')
+	if (!url_prefix_len || (url_prefix_len == 1 && *url_prefix == '/'))
+		return (!*url || *url == '/') ? 1 : 0;
+	if (url_prefix[url_prefix_len - 1] == '/')
 		url_prefix_len--;
-	if (!url_prefix_len || strncmp(url, url_prefix, url_prefix_len))
+	if (strncmp(url, url_prefix, url_prefix_len))
 		return 0;
-	url_len = strlen(url);
-	if ((url_len == url_prefix_len) || (url[url_prefix_len] == '/'))
-		return url_prefix_len;
+	if ((strlen(url) == url_prefix_len) || (url[url_prefix_len] == '/'))
+		return url_prefix_len + 1;
 	return 0;
 }
 
-static int new_match_is_shorter(size_t matchlen, enum http_option_type opt)
+static int http_options_match_urls(const struct url_info *url,
+				   const struct url_info *url_prefix,
+				   int *exactusermatch)
+{
+	/*
+	 * url_prefix matches url if the scheme, host and port of url_prefix
+	 * are the same as those of url and the path portion of url_prefix
+	 * is the same as the path portion of url or it is a prefix that
+	 * matches at a '/' boundary.  If url_prefix contains a user name,
+	 * that must also exactly match the user name in url.
+	 *
+	 * If the user, host, port and path match in this fashion, the returned
+	 * value is the length of the path match including any implicit
+	 * final '/'.  For example, "http://me@example.com/path" is matched by
+	 * "http://example.com" with a path length of 1.
+	 *
+	 * If there is a match and exactusermatch is not NULL, then
+	 * *exactusermatch will be set to true if both url and url_prefix
+	 * contained a user name or false if url_prefix did not have a
+	 * user name.  If there is no match *exactusermatch is left untouched.
+	 */
+	int usermatched = 0;
+	int pathmatchlen;
+
+	if (!url || !url_prefix || !url->url || !url_prefix->url)
+		return 0;
+
+	/* check the scheme */
+	if (url_prefix->scheme_len != url->scheme_len ||
+	    strncmp(url->url, url_prefix->url, url->scheme_len))
+		return 0; /* schemes do not match */
+
+	/* check the user name if url_prefix has one */
+	if (url_prefix->user_off) {
+		if (!url->user_off || url->user_len != url_prefix->user_len ||
+		    strncmp(url->url + url->user_off,
+			    url_prefix->url + url_prefix->user_off,
+			    url->user_len))
+			return 0; /* url_prefix has a user but it's not a match */
+		usermatched = 1;
+	}
+
+	/* check the host and port */
+	if (url_prefix->host_len != url->host_len ||
+	    strncmp(url->url + url->host_off,
+		    url_prefix->url + url_prefix->host_off, url->host_len))
+		return 0; /* host names and/or ports do not match */
+
+	/* check the path */
+	pathmatchlen = http_options_url_match_prefix(
+		url->url + url->path_off,
+		url_prefix->url + url_prefix->path_off,
+		url_prefix->url_len - url_prefix->path_off);
+
+	if (pathmatchlen && exactusermatch)
+		*exactusermatch = usermatched;
+	return pathmatchlen;
+}
+
+static int match_is_ignored(size_t matchlen, int usermatch, enum http_option_type opt)
 {
 	/*
-	 * Compare matchlen to the last matched length of option opt and
+	 * Compare matchlen to the last matched path length of option opt and
 	 * return true if matchlen is shorter than the last matched length
 	 * (meaning the config setting should be ignored).  Upon seeing the
 	 * _same_ key (i.e. new key has the same match length which is therefore
-	 * not shorter) the new setting will override the previous setting.
-	 * Otherwise return false and record matchlen as the current last
-	 * matched length of option opt.
+	 * not shorter) the new setting will override the previous setting
+	 * unless the new setting did not match the user and the previous match
+	 * did.  Otherwise return false and record matchlen as the current last
+	 * matched path length of option opt and usermatch as the last user
+	 * matching state for option opt.
 	 */
 	if (matchlen < http_option_max_matched_len[opt])
 		return 1;
-	http_option_max_matched_len[opt] = matchlen;
+	if (matchlen > http_option_max_matched_len[opt]) {
+		http_option_max_matched_len[opt] = matchlen;
+		http_option_user_matched[opt] = usermatch;
+		return 0;
+	}
+	/*
+	 * If a previous match of the same length explicitly matched the user
+	 * name, but the current match matched on any user, ignore it.
+	 */
+	if (!usermatch && http_option_user_matched[opt])
+		return 1;
+	http_option_user_matched[opt] = usermatch;
 	return 0;
 }
 
 static int http_options(const char *var, const char *value, void *cb)
 {
-	const char *url = cb;
+	const struct url_info *info = cb;
 	const char *key, *dot;
 	size_t matchlen = 0;
+	int usermatch = 0;
 
 	key = skip_prefix(var, "http.");
 	if (!key)
@@ -532,12 +713,18 @@ static int http_options(const char *var, const char *value, void *cb)
 	 */
 	dot = strrchr(key, '.');
 	if (dot) {
-		char *config_url = xmemdupz(key, dot - key);
-		char *norm_url = http_options_url_normalize(config_url);
+		char *config_url;
+		struct url_info norm_info;
+		char *norm_url;
+
+		if (!info || !info->url)
+			return 0;
+		config_url = xmemdupz(key, dot - key);
+		norm_url = http_options_url_normalize(config_url, &norm_info);
 		free(config_url);
 		if (!norm_url)
 			return 0;
-		matchlen = http_options_url_match_prefix(url, norm_url, strlen(norm_url));
+		matchlen = http_options_match_urls(info, &norm_info, &usermatch);
 		free(norm_url);
 		if (!matchlen)
 			return 0;
@@ -545,49 +732,49 @@ static int http_options(const char *var, const char *value, void *cb)
 	}
 
 	if (!strcmp("sslverify", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_VERIFY))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_VERIFY))
 			return 0;
 		curl_ssl_verify = git_config_bool(var, value);
 		return 0;
 	}
 	if (!strcmp("sslcert", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_CERT))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_CERT))
 			return 0;
 		return git_config_string(&ssl_cert, var, value);
 	}
 #if LIBCURL_VERSION_NUM >= 0x070903
 	if (!strcmp("sslkey", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_KEY))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_KEY))
 			return 0;
 		return git_config_string(&ssl_key, var, value);
 	}
 #endif
 #if LIBCURL_VERSION_NUM >= 0x070908
 	if (!strcmp("sslcapath", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_CAPATH))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_CAPATH))
 			return 0;
 		return git_config_string(&ssl_capath, var, value);
 	}
 #endif
 	if (!strcmp("sslcainfo", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_CAINFO))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_CAINFO))
 			return 0;
 		return git_config_string(&ssl_cainfo, var, value);
 	}
 	if (!strcmp("sslcertpasswordprotected", key)) {
-		if (new_match_is_shorter(matchlen, OPT_PASSWD_REQ))
+		if (match_is_ignored(matchlen, usermatch, OPT_PASSWD_REQ))
 			return 0;
 		ssl_cert_password_required = git_config_bool(var, value);
 		return 0;
 	}
 	if (!strcmp("ssltry", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_TRY))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_TRY))
 			return 0;
 		curl_ssl_try = git_config_bool(var, value);
 		return 0;
 	}
 	if (!strcmp("minsessions", key)) {
-		if (new_match_is_shorter(matchlen, OPT_MIN_SESSIONS))
+		if (match_is_ignored(matchlen, usermatch, OPT_MIN_SESSIONS))
 			return 0;
 		min_curl_sessions = git_config_int(var, value);
 #ifndef USE_CURL_MULTI
@@ -598,45 +785,45 @@ static int http_options(const char *var, const char *value, void *cb)
 	}
 #ifdef USE_CURL_MULTI
 	if (!strcmp("maxrequests", key)) {
-		if (new_match_is_shorter(matchlen, OPT_MAX_REQUESTS))
+		if (match_is_ignored(matchlen, usermatch, OPT_MAX_REQUESTS))
 			return 0;
 		max_requests = git_config_int(var, value);
 		return 0;
 	}
 #endif
 	if (!strcmp("lowspeedlimit", key)) {
-		if (new_match_is_shorter(matchlen, OPT_LOW_SPEED))
+		if (match_is_ignored(matchlen, usermatch, OPT_LOW_SPEED))
 			return 0;
 		curl_low_speed_limit = (long)git_config_int(var, value);
 		return 0;
 	}
 	if (!strcmp("lowspeedtime", key)) {
-		if (new_match_is_shorter(matchlen, OPT_LOW_TIME))
+		if (match_is_ignored(matchlen, usermatch, OPT_LOW_TIME))
 			return 0;
 		curl_low_speed_time = (long)git_config_int(var, value);
 		return 0;
 	}
 
 	if (!strcmp("noepsv", key)) {
-		if (new_match_is_shorter(matchlen, OPT_NO_EPSV))
+		if (match_is_ignored(matchlen, usermatch, OPT_NO_EPSV))
 			return 0;
 		curl_ftp_no_epsv = git_config_bool(var, value);
 		return 0;
 	}
 	if (!strcmp("proxy", key)) {
-		if (new_match_is_shorter(matchlen, OPT_HTTP_PROXY))
+		if (match_is_ignored(matchlen, usermatch, OPT_HTTP_PROXY))
 			return 0;
 		return git_config_string(&curl_http_proxy, var, value);
 	}
 
 	if (!strcmp("cookiefile", key)) {
-		if (new_match_is_shorter(matchlen, OPT_COOKIE_FILE))
+		if (match_is_ignored(matchlen, usermatch, OPT_COOKIE_FILE))
 			return 0;
 		return git_config_string(&curl_cookie_file, var, value);
 	}
 
 	if (!strcmp("postbuffer", key)) {
-		if (new_match_is_shorter(matchlen, OPT_POST_BUFFER))
+		if (match_is_ignored(matchlen, usermatch, OPT_POST_BUFFER))
 			return 0;
 		http_post_buffer = git_config_int(var, value);
 		if (http_post_buffer < LARGE_PACKET_MAX)
@@ -645,7 +832,7 @@ static int http_options(const char *var, const char *value, void *cb)
 	}
 
 	if (!strcmp("useragent", key)) {
-		if (new_match_is_shorter(matchlen, OPT_USER_AGENT))
+		if (match_is_ignored(matchlen, usermatch, OPT_USER_AGENT))
 			return 0;
 		return git_config_string(&user_agent, var, value);
 	}
@@ -774,13 +961,15 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
 	char *low_speed_time;
-	char *norm_url = http_options_url_normalize(url);
+	struct url_info info;
 
 	http_is_verbose = 0;
 
 	memset(http_option_max_matched_len, 0, sizeof(http_option_max_matched_len));
-	git_config(http_options, norm_url);
-	free(norm_url);
+	memset(http_option_user_matched, 0, sizeof(http_option_user_matched));
+	http_options_url_normalize(url, &info);
+	git_config(http_options, &info);
+	free(info.url);
 
 	curl_global_init(CURL_GLOBAL_ALL);
 
diff --git a/test-url-normalize.c b/test-url-normalize.c
index d68312d..f325571 100644
--- a/test-url-normalize.c
+++ b/test-url-normalize.c
@@ -9,7 +9,7 @@ int main()
 
 #include "http.c"
 
-#define url_normalize(u) http_options_url_normalize(u)
+#define url_normalize(u,i) http_options_url_normalize(u,i)
 
 int main(int argc, char **argv)
 {
@@ -40,21 +40,22 @@ int main(int argc, char **argv)
 		die(usage);
 
 	if (argc == 2) {
-		url1 = url_normalize(argv[1]);
+		struct url_info info;
+		url1 = url_normalize(argv[1], &info);
 		if (!url1)
 			return 1;
 		if (opt_p)
 			printf("%s\n", url1);
 		if (opt_l)
-			printf("%u\n", (unsigned)strlen(url1));
+			printf("%u\n", (unsigned)info.url_len);
 		return 0;
 	}
 
 	if (opt_p || opt_l)
 		die(usage);
 
-	url1 = url_normalize(argv[1]);
-	url2 = url_normalize(argv[2]);
+	url1 = url_normalize(argv[1], NULL);
+	url2 = url_normalize(argv[2], NULL);
 	return (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
 }
 
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v7 3/4] tests: add new test for the url_normalize function
  2013-07-22  3:18 ` [PATCH v7 3/4] tests: add new test for the url_normalize function Kyle J. McKay
@ 2013-07-22  5:15   ` Junio C Hamano
  2013-07-22 10:21     ` Kyle J. McKay
  2013-07-22  6:45   ` Junio C Hamano
  1 sibling, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2013-07-22  5:15 UTC (permalink / raw)
  To: Kyle J. McKay
  Cc: git, David Aguilar, Petr Baudis, Richard Hartmann, Jeff King,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

"Kyle J. McKay" <mackyle@gmail.com> writes:

> +test_expect_success 'url general escapes' '
> +	! test-url-normalize "http://x.y?%fg" &&
> +	test "$(test-url-normalize -p "X://W/%7e%41^%3a")" = "x://w/~A%5E%3A" &&
> +	test "$(test-url-normalize -p "X://W/:/?#[]@")" = "x://w/:/?#[]@" &&
> +	test "$(test-url-normalize -p "X://W/$&()*+,;=")" = "x://w/$&()*+,;=" &&
> +	test "$(test-url-normalize -p "X://W/'\''")" = "x://w/'\''" &&
> +	test "$(test-url-normalize -p "X://W?'\!'")" = "x://w/?'\!'"
> +';#'

Hmm,... what is the magic on the last line about?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v7 4/4] config: allow http.<url>.* any user matching
  2013-07-22  3:18 ` [PATCH v7 4/4] config: allow http.<url>.* any user matching Kyle J. McKay
@ 2013-07-22  5:28   ` Junio C Hamano
  2013-07-22 12:52     ` Kyle J. McKay
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2013-07-22  5:28 UTC (permalink / raw)
  To: Kyle J. McKay
  Cc: git, David Aguilar, Petr Baudis, Richard Hartmann, Jeff King,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

"Kyle J. McKay" <mackyle@gmail.com> writes:

> diff --git a/test-url-normalize.c b/test-url-normalize.c
> index d68312d..f325571 100644
> --- a/test-url-normalize.c
> +++ b/test-url-normalize.c
> @@ -9,7 +9,7 @@ int main()
>  
>  #include "http.c"
>  
> -#define url_normalize(u) http_options_url_normalize(u)
> +#define url_normalize(u,i) http_options_url_normalize(u,i)
>  
>  int main(int argc, char **argv)
>  {
> @@ -40,21 +40,22 @@ int main(int argc, char **argv)
>  		die(usage);
>  
>  	if (argc == 2) {
> -		url1 = url_normalize(argv[1]);
> +		struct url_info info;
> +		url1 = url_normalize(argv[1], &info);
>  		if (!url1)
>  			return 1;
>  		if (opt_p)
>  			printf("%s\n", url1);
>  		if (opt_l)
> -			printf("%u\n", (unsigned)strlen(url1));
> +			printf("%u\n", (unsigned)info.url_len);
>  		return 0;
>  	}
>  
>  	if (opt_p || opt_l)
>  		die(usage);
>  
> -	url1 = url_normalize(argv[1]);
> -	url2 = url_normalize(argv[2]);
> +	url1 = url_normalize(argv[1], NULL);
> +	url2 = url_normalize(argv[2], NULL);
>  	return (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
>  }

It looks like that this program could also drive http_options()
directly, or alternatively make a call

	git_config(http_options, &info);

in its main() to let the config parser do its work, to make sure
that the machinery picks up the right values from the desired entry
in the configuration file, with a bit more effort.

> +static int match_is_ignored(size_t matchlen, int usermatch, enum http_option_type opt)
>  {
>  	/*
> -	 * Compare matchlen to the last matched length of option opt and
> +	 * Compare matchlen to the last matched path length of option opt and
>  	 * return true if matchlen is shorter than the last matched length
>  	 * (meaning the config setting should be ignored).  Upon seeing the
>  	 * _same_ key (i.e. new key has the same match length which is therefore
> -	 * not shorter) the new setting will override the previous setting.
> -	 * Otherwise return false and record matchlen as the current last
> -	 * matched length of option opt.
> +	 * not shorter) the new setting will override the previous setting
> +	 * unless the new setting did not match the user and the previous match
> +	 * did.  Otherwise return false and record matchlen as the current last
> +	 * matched path length of option opt and usermatch as the last user
> +	 * matching state for option opt.
>  	 */
>  	if (matchlen < http_option_max_matched_len[opt])
>  		return 1;
> -	http_option_max_matched_len[opt] = matchlen;
> +	if (matchlen > http_option_max_matched_len[opt]) {
> +		http_option_max_matched_len[opt] = matchlen;
> +		http_option_user_matched[opt] = usermatch;
> +		return 0;
> +	}
> +	/*
> +	 * If a previous match of the same length explicitly matched the user
> +	 * name, but the current match matched on any user, ignore it.
> +	 */
> +	if (!usermatch && http_option_user_matched[opt])
> +		return 1;
> +	http_option_user_matched[opt] = usermatch;
>  	return 0;
>  }

OK, so if there is a configuration for a generic URL that is longer,
it defeats a shorter URL with a specific username, e.g. when talking
to host.xz/name as kyle, with configuration for these two keys
exist:

	scheme://kyle@host.xz/path
        scheme://host.xz/path/name

the value for the latter one is used.

I am not complaining; just making sure that is what we want to give
users, as it was not quite clear in the Documentation/config.txt
part of the patch.

Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v7 3/4] tests: add new test for the url_normalize function
  2013-07-22  3:18 ` [PATCH v7 3/4] tests: add new test for the url_normalize function Kyle J. McKay
  2013-07-22  5:15   ` Junio C Hamano
@ 2013-07-22  6:45   ` Junio C Hamano
  2013-07-22 10:29     ` Kyle J. McKay
  1 sibling, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2013-07-22  6:45 UTC (permalink / raw)
  To: Kyle J. McKay
  Cc: git, David Aguilar, Petr Baudis, Richard Hartmann, Jeff King,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

I suspect that files under t/t5200/ were not suiable for e-mail
transmission.  Can you regenerate this after running:

    $ echo '/t/t5200/* binary' >>.git/info/attributes

or better yet with:

diff --git a/t/.gitattributes b/t/.gitattributes
index 1b97c54..6240ed2 100644
--- a/t/.gitattributes
+++ b/t/.gitattributes
@@ -1 +1,2 @@
 t[0-9][0-9][0-9][0-9]/* -whitespace
+t5200/* binary

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v7 3/4] tests: add new test for the url_normalize function
  2013-07-22  5:15   ` Junio C Hamano
@ 2013-07-22 10:21     ` Kyle J. McKay
  0 siblings, 0 replies; 11+ messages in thread
From: Kyle J. McKay @ 2013-07-22 10:21 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, David Aguilar, Petr Baudis, Richard Hartmann, Jeff King,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

On Jul 21, 2013, at 22:15, Junio C Hamano wrote:
> "Kyle J. McKay" <mackyle@gmail.com> writes:
>
>> +test_expect_success 'url general escapes' '
>> +	! test-url-normalize "http://x.y?%fg" &&
>> +	test "$(test-url-normalize -p "X://W/%7e%41^%3a")" = "x://w/~A%5E 
>> %3A" &&
>> +	test "$(test-url-normalize -p "X://W/:/?#[]@")" = "x://w/:/?#[]@"  
>> &&
>> +	test "$(test-url-normalize -p "X://W/$&()*+,;=")" = "x://w/$&()* 
>> +,;=" &&
>> +	test "$(test-url-normalize -p "X://W/'\''")" = "x://w/'\''" &&
>> +	test "$(test-url-normalize -p "X://W?'\!'")" = "x://w/?'\!'"
>> +';#'
>
> Hmm,... what is the magic on the last line about?

A left-over concession for my stupid editor. ;)  I'll remove it from  
the new version.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v7 3/4] tests: add new test for the url_normalize function
  2013-07-22  6:45   ` Junio C Hamano
@ 2013-07-22 10:29     ` Kyle J. McKay
  0 siblings, 0 replies; 11+ messages in thread
From: Kyle J. McKay @ 2013-07-22 10:29 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, David Aguilar, Petr Baudis, Richard Hartmann, Jeff King,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

I was able to apply the patch I got back.  It was marked as  
charset=ISO-8859-1 so it should be fine.

I wasn't sure how best to handle those though, so I will send a new  
version with the suggested .gitattributes patch included as well.

On Jul 21, 2013, at 23:45, Junio C Hamano wrote:
> I suspect that files under t/t5200/ were not suiable for e-mail
> transmission.  Can you regenerate this after running:
>
>    $ echo '/t/t5200/* binary' >>.git/info/attributes
>
> or better yet with:
>
> diff --git a/t/.gitattributes b/t/.gitattributes
> index 1b97c54..6240ed2 100644
> --- a/t/.gitattributes
> +++ b/t/.gitattributes
> @@ -1 +1,2 @@
> t[0-9][0-9][0-9][0-9]/* -whitespace
> +t5200/* binary

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v7 4/4] config: allow http.<url>.* any user matching
  2013-07-22  5:28   ` Junio C Hamano
@ 2013-07-22 12:52     ` Kyle J. McKay
  0 siblings, 0 replies; 11+ messages in thread
From: Kyle J. McKay @ 2013-07-22 12:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, David Aguilar, Petr Baudis, Richard Hartmann, Jeff King,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

On Jul 21, 2013, at 22:28, Junio C Hamano wrote:
> "Kyle J. McKay" <mackyle@gmail.com> writes:
>
>> diff --git a/test-url-normalize.c b/test-url-normalize.c
>> index d68312d..f325571 100644
>> --- a/test-url-normalize.c
>> +++ b/test-url-normalize.c
>> @@ -9,7 +9,7 @@ int main()
>
> It looks like that this program could also drive http_options()
> directly, or alternatively make a call
>
> 	git_config(http_options, &info);
>
> in its main() to let the config parser do its work, to make sure
> that the machinery picks up the right values from the desired entry
> in the configuration file, with a bit more effort.

OK.

> OK, so if there is a configuration for a generic URL that is longer,
> it defeats a shorter URL with a specific username, e.g. when talking
> to host.xz/name as kyle, with configuration for these two keys
> exist:
>
> 	scheme://kyle@host.xz/path
>        scheme://host.xz/path/name
>
> the value for the latter one is used.
>
> I am not complaining; just making sure that is what we want to give
> users, as it was not quite clear in the Documentation/config.txt
> part of the patch.

Will add example to Documentation/config.txt to make it clearer.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2013-07-22 12:52 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-22  3:18 [PATCH v7 0/4] config: add support for http.<url>.* settings Kyle J. McKay
2013-07-22  3:18 ` [PATCH v7 1/4] " Kyle J. McKay
2013-07-22  3:18 ` [PATCH v7 2/4] config: improve " Kyle J. McKay
2013-07-22  3:18 ` [PATCH v7 3/4] tests: add new test for the url_normalize function Kyle J. McKay
2013-07-22  5:15   ` Junio C Hamano
2013-07-22 10:21     ` Kyle J. McKay
2013-07-22  6:45   ` Junio C Hamano
2013-07-22 10:29     ` Kyle J. McKay
2013-07-22  3:18 ` [PATCH v7 4/4] config: allow http.<url>.* any user matching Kyle J. McKay
2013-07-22  5:28   ` Junio C Hamano
2013-07-22 12:52     ` Kyle J. McKay

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.