All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/1] http: Add Accept-Language header if possible
@ 2014-07-19 17:58 Yi EungJun
  2014-07-19 17:58 ` [PATCH v4 1/1] " Yi EungJun
                   ` (2 more replies)
  0 siblings, 3 replies; 58+ messages in thread
From: Yi EungJun @ 2014-07-19 17:58 UTC (permalink / raw)
  To: Git List
  Cc: Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting, Junio C Hamano

Changes since v3:

* Fix styles and syntax. (Thanks to Jeff King and Eric Sunshine)
* Cache Accept-Language header. (Thanks to Jeff King)
* Remove floating point numbers. (Thanks to Junio C Hamano)
* Make the for-loop to get the value of the header simpler.
* Add more comments.

Yi EungJun (1):
  http: Add Accept-Language header if possible

 http.c                     | 134 +++++++++++++++++++++++++++++++++++++++++++++
 remote-curl.c              |   2 +
 t/t5550-http-fetch-dumb.sh |  31 +++++++++++
 3 files changed, 167 insertions(+)

-- 
2.0.1.473.g731ddce.dirty

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH v4 1/1] http: Add Accept-Language header if possible
  2014-07-19 17:58 [PATCH v4 0/1] http: Add Accept-Language header if possible Yi EungJun
@ 2014-07-19 17:58 ` Yi EungJun
  2014-07-21 19:01   ` Junio C Hamano
  2014-12-02 12:12 ` [PATCH v5 0/1] " Yi EungJun
  2015-03-06 16:13 ` [PATCH] http: Include locale.h when using setlocale() Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 58+ messages in thread
From: Yi EungJun @ 2014-07-19 17:58 UTC (permalink / raw)
  To: Git List
  Cc: Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting, Junio C Hamano

From: Yi EungJun <eungjun.yi@navercorp.com>

Add an Accept-Language header which indicates the user's preferred
languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.

Examples:
  LANGUAGE= -> ""
  LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
  LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
  LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"

This gives git servers a chance to display remote error messages in
the user's preferred language.

Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
---
 http.c                     | 134 +++++++++++++++++++++++++++++++++++++++++++++
 remote-curl.c              |   2 +
 t/t5550-http-fetch-dumb.sh |  31 +++++++++++
 3 files changed, 167 insertions(+)

diff --git a/http.c b/http.c
index 3a28b21..ed4e8e1 100644
--- a/http.c
+++ b/http.c
@@ -67,6 +67,8 @@ static struct curl_slist *no_pragma_header;
 
 static struct active_request_slot *active_queue_head;
 
+static struct strbuf *cached_accept_language = NULL;
+
 size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
 {
 	size_t size = eltsize * nmemb;
@@ -512,6 +514,9 @@ void http_cleanup(void)
 		cert_auth.password = NULL;
 	}
 	ssl_cert_password_required = 0;
+
+	if (cached_accept_language)
+		strbuf_release(cached_accept_language);
 }
 
 struct active_request_slot *get_active_slot(void)
@@ -983,6 +988,129 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
 		strbuf_addstr(charset, "ISO-8859-1");
 }
 
+/*
+ * Guess the user's preferred languages from the value in LANGUAGE environment
+ * variable and LC_MESSAGES locale category.
+ *
+ * The result can be a colon-separated list like "ko:ja:en".
+ */
+static const char *get_preferred_languages(void)
+{
+	const char *retval;
+
+	retval = getenv("LANGUAGE");
+	if (retval && *retval)
+		return retval;
+
+	retval = setlocale(LC_MESSAGES, NULL);
+	if (retval && *retval &&
+		strcmp(retval, "C") &&
+		strcmp(retval, "POSIX"))
+		return retval;
+
+	return NULL;
+}
+
+/*
+ * Get an Accept-Language header which indicates user's preferred languages.
+ *
+ * Examples:
+ *   LANGUAGE= -> ""
+ *   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
+ *   LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1"
+ *   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
+ *   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
+ *   LANGUAGE= LANG=C -> ""
+ */
+static struct strbuf *get_accept_language(void)
+{
+	const char *lang_begin, *pos;
+	int q, max_q;
+	int num_langs;
+	int decimal_places;
+	int is_codeset_or_modifier = 0;
+	static struct strbuf buf = STRBUF_INIT;
+	struct strbuf q_format_buf = STRBUF_INIT;
+	char *q_format;
+
+	if (cached_accept_language)
+		return cached_accept_language;
+
+	lang_begin = get_preferred_languages();
+
+	/* Don't add Accept-Language header if no language is preferred. */
+	if (!(lang_begin && *lang_begin)) {
+		cached_accept_language = &buf;
+		return cached_accept_language;
+	}
+
+	/* Count number of preferred lang_begin to decide precision of q-factor */
+	for (num_langs = 1, pos = lang_begin; *pos; pos++)
+		if (*pos == ':')
+			num_langs++;
+
+	/* Decide the precision for q-factor on number of preferred lang_begin. */
+	num_langs += 1; /* for '*' */
+	decimal_places = 1 + (num_langs > 10) + (num_langs > 100);
+	strbuf_addf(&q_format_buf, "; q=0.%%0%dd", decimal_places);
+	q_format = strbuf_detach(&q_format_buf, NULL);
+	for (max_q = 1; decimal_places-- > 0;) max_q *= 10;
+	q = max_q;
+
+	strbuf_addstr(&buf, "Accept-Language: ");
+
+	/*
+	 * Convert a list of colon-separated locale values [1][2] to a list of
+	 * comma-separated language tags [3] which can be used as a value of
+	 * Accept-Language header.
+	 *
+	 * [1]: http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html
+	 * [2]: http://www.gnu.org/software/libc/manual/html_node/Using-gettextized-software.html
+	 * [3]: http://tools.ietf.org/html/rfc7231#section-5.3.5
+	 */
+	for (pos = lang_begin; ; pos++) {
+		if (*pos == ':' || !*pos) {
+			/* Ignore if this character is the first one. */
+			if (pos == lang_begin)
+				continue;
+
+			is_codeset_or_modifier = 0;
+
+			/* Put a q-factor only if it is less than 1.0. */
+			if (q < max_q)
+				strbuf_addf(&buf, q_format, q);
+
+			if (q > 1)
+				q--;
+
+			/* NULL pos means this is the last language. */
+			if (*pos)
+				strbuf_addstr(&buf, ", ");
+			else
+				break;
+
+		} else if (is_codeset_or_modifier)
+			continue;
+		else if (*pos == '.' || *pos == '@') /* Remove .codeset and @modifier. */
+			is_codeset_or_modifier = 1;
+		else
+			strbuf_addch(&buf, *pos == '_' ? '-' : *pos);
+	}
+
+	/* Don't add Accept-Language header if no language is preferred. */
+	if (q >= max_q) {
+		cached_accept_language = &buf;
+		return cached_accept_language;
+	}
+
+	/* Add '*' with minimum q-factor greater than 0.0. */
+	strbuf_addstr(&buf, ", *");
+	strbuf_addf(&buf, q_format, 1);
+
+	cached_accept_language = &buf;
+	return cached_accept_language;
+}
+
 /* http_request() targets */
 #define HTTP_REQUEST_STRBUF	0
 #define HTTP_REQUEST_FILE	1
@@ -995,6 +1123,7 @@ static int http_request(const char *url,
 	struct slot_results results;
 	struct curl_slist *headers = NULL;
 	struct strbuf buf = STRBUF_INIT;
+	struct strbuf* accept_language;
 	int ret;
 
 	slot = get_active_slot();
@@ -1020,6 +1149,11 @@ static int http_request(const char *url,
 					 fwrite_buffer);
 	}
 
+	accept_language = get_accept_language();
+
+	if (accept_language && accept_language->len > 0)
+		headers = curl_slist_append(headers, accept_language->buf);
+
 	strbuf_addstr(&buf, "Pragma:");
 	if (options && options->no_cache)
 		strbuf_addstr(&buf, " no-cache");
diff --git a/remote-curl.c b/remote-curl.c
index 4493b38..07f2a5d 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -946,6 +946,8 @@ int main(int argc, const char **argv)
 	struct strbuf buf = STRBUF_INIT;
 	int nongit;
 
+	git_setup_gettext();
+
 	git_extract_argv0_path(argv[0]);
 	setup_git_directory_gently(&nongit);
 	if (argc < 2) {
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index ac71418..d2dac44 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -196,5 +196,36 @@ test_expect_success 'reencoding is robust to whitespace oddities' '
 	grep "this is the error message" stderr
 '
 
+check_language () {
+	echo "Accept-Language: $1\r" >expect &&
+	test_must_fail env \
+		GIT_CURL_VERBOSE=1 \
+		LANGUAGE=$2 \
+		LC_ALL=$3 \
+		LC_MESSAGES=$4 \
+		LANG=$5 \
+		git clone "$HTTPD_URL/accept/language" 2>stderr &&
+	grep -i ^Accept-Language: stderr >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'git client sends Accept-Language based on LANGUAGE, LC_ALL, LC_MESSAGES and LANG' '
+	check_language "ko-KR, *; q=0.1" ko_KR.UTF-8 de_DE.UTF-8 ja_JP.UTF-8 en_US.UTF-8 &&
+	check_language "de-DE, *; q=0.1" ""          de_DE.UTF-8 ja_JP.UTF-8 en_US.UTF-8 &&
+	check_language "ja-JP, *; q=0.1" ""          ""          ja_JP.UTF-8 en_US.UTF-8 &&
+	check_language "en-US, *; q=0.1" ""          ""          ""          en_US.UTF-8
+'
+
+test_expect_success 'git client sends Accept-Language with many preferred languages' '
+	check_language "ko-KR, en-US; q=0.99, fr-CA; q=0.98, de; q=0.97, sr; q=0.96, \
+ja; q=0.95, zh; q=0.94, sv; q=0.93, pt; q=0.92, nb; q=0.91, *; q=0.01" \
+		ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb
+'
+
+test_expect_success 'git client does not send Accept-Language' '
+	test_must_fail env GIT_CURL_VERBOSE=1 LANGUAGE= git clone "$HTTPD_URL/accept/language" 2>stderr &&
+	! grep "^Accept-Language:" stderr
+'
+
 stop_httpd
 test_done
-- 
2.0.1.473.g731ddce.dirty

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 1/1] http: Add Accept-Language header if possible
  2014-07-19 17:58 ` [PATCH v4 1/1] " Yi EungJun
@ 2014-07-21 19:01   ` Junio C Hamano
  2014-08-03  7:35     ` Yi, EungJun
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2014-07-21 19:01 UTC (permalink / raw)
  To: Yi EungJun; +Cc: Git List, Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting

Yi EungJun <semtlenori@gmail.com> writes:

> From: Yi EungJun <eungjun.yi@navercorp.com>
>
> Add an Accept-Language header which indicates the user's preferred
> languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.
>
> Examples:
>   LANGUAGE= -> ""
>   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
>   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
>   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
>
> This gives git servers a chance to display remote error messages in
> the user's preferred language.
>
> Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
> ---
>  http.c                     | 134 +++++++++++++++++++++++++++++++++++++++++++++
>  remote-curl.c              |   2 +
>  t/t5550-http-fetch-dumb.sh |  31 +++++++++++
>  3 files changed, 167 insertions(+)
>
> diff --git a/http.c b/http.c
> index 3a28b21..ed4e8e1 100644
> --- a/http.c
> +++ b/http.c
> @@ -67,6 +67,8 @@ static struct curl_slist *no_pragma_header;
>  
>  static struct active_request_slot *active_queue_head;
>  
> +static struct strbuf *cached_accept_language = NULL;

Please drop " = NULL" that is unnecessary for BSS.

> @@ -512,6 +514,9 @@ void http_cleanup(void)
>  		cert_auth.password = NULL;
>  	}
>  	ssl_cert_password_required = 0;
> +
> +	if (cached_accept_language)
> +		strbuf_release(cached_accept_language);
>  }



> @@ -983,6 +988,129 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
>  		strbuf_addstr(charset, "ISO-8859-1");
>  }
>  
> +/*
> + * Guess the user's preferred languages from the value in LANGUAGE environment
> + * variable and LC_MESSAGES locale category.
> + *
> + * The result can be a colon-separated list like "ko:ja:en".
> + */
> +static const char *get_preferred_languages(void)
> +{
> +	const char *retval;
> +
> +	retval = getenv("LANGUAGE");
> +	if (retval && *retval)
> +		return retval;
> +
> +	retval = setlocale(LC_MESSAGES, NULL);
> +	if (retval && *retval &&
> +		strcmp(retval, "C") &&
> +		strcmp(retval, "POSIX"))
> +		return retval;
> +
> +	return NULL;
> +}
> +
> +/*
> + * Get an Accept-Language header which indicates user's preferred languages.
> + *
> + * Examples:
> + *   LANGUAGE= -> ""
> + *   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
> + *   LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1"
> + *   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
> + *   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
> + *   LANGUAGE= LANG=C -> ""
> + */
> +static struct strbuf *get_accept_language(void)
> +{
> +	const char *lang_begin, *pos;
> +	int q, max_q;
> +	int num_langs;
> +	int decimal_places;
> +	int is_codeset_or_modifier = 0;
> +	static struct strbuf buf = STRBUF_INIT;
> +	struct strbuf q_format_buf = STRBUF_INIT;
> +	char *q_format;
> +
> +	if (cached_accept_language)
> +		return cached_accept_language;
> +
> +	lang_begin = get_preferred_languages();
> +
> +	/* Don't add Accept-Language header if no language is preferred. */
> +	if (!(lang_begin && *lang_begin)) {

It is not wrong per-se, but given how hard get_preferred_languages()
tries not to return a pointer to an empty string, this seems a bit
overly defensive to me.

> +		cached_accept_language = &buf;
> +		return cached_accept_language;

It is somewhat unconventional to have a static pointer outside to
point at a singleton and then have a singleton actually as a static
structure.  I would have done without "buf" in this function and
instead started this function like so:

	if (cached_accept_language)
        	return cached_accept_language;

	cached_accept_language = xmalloc(sizeof(struct strbuf));
        strbuf_init(cached_accept_language, 0);
        lang_begin =  get_preferred_languages();
	if (!lang_begin)
		return cached_accept_language;

> +	}
> +
> +	/* Count number of preferred lang_begin to decide precision of q-factor */
> +	for (num_langs = 1, pos = lang_begin; *pos; pos++)
> +		if (*pos == ':')
> +			num_langs++;
> +
> +	/* Decide the precision for q-factor on number of preferred lang_begin. */
> +	num_langs += 1; /* for '*' */


> +	decimal_places = 1 + (num_langs > 10) + (num_langs > 100);

What if you got 60000 languages ;-)?  I do not think we want to bend
backwards and make the code list all 60000 of them, assigning a
unique and decreasing q value to each of them, forming an overlong
Accept-Language header, but at the same time, I do not think we want
to show nonsense output because we compute the precision incorrectly
here.

> +	strbuf_addf(&q_format_buf, "; q=0.%%0%dd", decimal_places);
> +	q_format = strbuf_detach(&q_format_buf, NULL);

q_format_buf is an overkill use of strbuf, isn't it?  Just

	char q_format_buf[32];
	sprintf(q_format_buf, ";q=0.%%0%d", decimal_places);

or something should be more than sufficient, no?

> +	for (max_q = 1; decimal_places-- > 0;) max_q *= 10;

As you have to do one loop like this that amounts to computing log10
of num_langs, why not compute decimal_places the same way while at
it?  It may also make sense to cap the number of languages to avoid
spitting out overly long Accept-Language header with practicaly
useless list of many languages.  That is, something along the lines
of ... (note that I may very well have off-by-one or off-by-ten
errors here you may need to tweak to get right):

        if (MAX_LANGS < num_langs)
        	num_langs = MAX_LANGS;
        for (max_q = 1, decimal_places = 1;
             max_q < num_langs;
             decimal_places++, max_q *= 10)
             ;

If you are to use the MAX_LANGS cap, the main loop would also need
to pay attention to it by breaking out of the loop early before you
reach the end of the string, of course.

> +	q = max_q;
> +
> +	strbuf_addstr(&buf, "Accept-Language: ");
> +
> +	/*
> +	 * Convert a list of colon-separated locale values [1][2] to a list of
> +	 * comma-separated language tags [3] which can be used as a value of
> +	 * Accept-Language header.
> +	 *
> +	 * [1]: http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html
> +	 * [2]: http://www.gnu.org/software/libc/manual/html_node/Using-gettextized-software.html
> +	 * [3]: http://tools.ietf.org/html/rfc7231#section-5.3.5
> +	 */
> +	for (pos = lang_begin; ; pos++) {
> +		if (*pos == ':' || !*pos) {
> +			/* Ignore if this character is the first one. */
> +			if (pos == lang_begin)
> +				continue;

By doing this "ignore empty" here, but not doing the same when you
count num_langs, are you potentially miscounting num_langs?

> +			is_codeset_or_modifier = 0;
> +
> +			/* Put a q-factor only if it is less than 1.0. */
> +			if (q < max_q)

... is it the same thing as "do not do this for the first round, but
do so for all the other round"?

> +				strbuf_addf(&buf, q_format, q);
> +
> +			if (q > 1)

Hmm, I am puzzled.  C this ever be an issue (unless you have
off-by-one error or you add "cap num_langs to MAX_LANGS", that is)?

> +				q--;

> +			/* NULL pos means this is the last language. */
> +			if (*pos)
> +				strbuf_addstr(&buf, ", ");
> +			else
> +				break;
> +
> +		} else if (is_codeset_or_modifier)
> +			continue;
> +		else if (*pos == '.' || *pos == '@') /* Remove .codeset and @modifier. */
> +			is_codeset_or_modifier = 1;
> +		else
> +			strbuf_addch(&buf, *pos == '_' ? '-' : *pos);
> +	}
> +
> +	/* Don't add Accept-Language header if no language is preferred. */
> +	if (q >= max_q) {

Can q go over max_q, or is it "q may be max_q"?  In other words, is
this essentially saying "if we did not find any language in the
preferred languages list"?

> +		cached_accept_language = &buf;
> +		return cached_accept_language;
> +	}
> +
> +	/* Add '*' with minimum q-factor greater than 0.0. */
> +	strbuf_addstr(&buf, ", *");
> +	strbuf_addf(&buf, q_format, 1);
> +
> +	cached_accept_language = &buf;
> +	return cached_accept_language;
> +}
> +
>  /* http_request() targets */
>  #define HTTP_REQUEST_STRBUF	0
>  #define HTTP_REQUEST_FILE	1
> @@ -995,6 +1123,7 @@ static int http_request(const char *url,
>  	struct slot_results results;
>  	struct curl_slist *headers = NULL;
>  	struct strbuf buf = STRBUF_INIT;
> +	struct strbuf* accept_language;

As we write in C, not C++, our asterisks stick to the variable, not
the type.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v4 1/1] http: Add Accept-Language header if possible
  2014-07-21 19:01   ` Junio C Hamano
@ 2014-08-03  7:35     ` Yi, EungJun
  0 siblings, 0 replies; 58+ messages in thread
From: Yi, EungJun @ 2014-08-03  7:35 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git List, Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting

Thanks very much for your detailed review and sorry for late reply.

2014-07-22 4:01 GMT+09:00 Junio C Hamano <gitster@pobox.com>:
> Yi EungJun <semtlenori@gmail.com> writes:
>
>> From: Yi EungJun <eungjun.yi@navercorp.com>
>>
>> Add an Accept-Language header which indicates the user's preferred
>> languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.
>>
>> Examples:
>>   LANGUAGE= -> ""
>>   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
>>   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
>>   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
>>
>> This gives git servers a chance to display remote error messages in
>> the user's preferred language.
>>
>> Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
>> ---
>>  http.c                     | 134 +++++++++++++++++++++++++++++++++++++++++++++
>>  remote-curl.c              |   2 +
>>  t/t5550-http-fetch-dumb.sh |  31 +++++++++++
>>  3 files changed, 167 insertions(+)
>>
>> diff --git a/http.c b/http.c
>> index 3a28b21..ed4e8e1 100644
>> --- a/http.c
>> +++ b/http.c
>> @@ -67,6 +67,8 @@ static struct curl_slist *no_pragma_header;
>>
>>  static struct active_request_slot *active_queue_head;
>>
>> +static struct strbuf *cached_accept_language = NULL;
>
> Please drop " = NULL" that is unnecessary for BSS.

Thanks, I'll fix it.

>
>> @@ -512,6 +514,9 @@ void http_cleanup(void)
>>               cert_auth.password = NULL;
>>       }
>>       ssl_cert_password_required = 0;
>> +
>> +     if (cached_accept_language)
>> +             strbuf_release(cached_accept_language);
>>  }
>
>
>
>> @@ -983,6 +988,129 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
>>               strbuf_addstr(charset, "ISO-8859-1");
>>  }
>>
>> +/*
>> + * Guess the user's preferred languages from the value in LANGUAGE environment
>> + * variable and LC_MESSAGES locale category.
>> + *
>> + * The result can be a colon-separated list like "ko:ja:en".
>> + */
>> +static const char *get_preferred_languages(void)
>> +{
>> +     const char *retval;
>> +
>> +     retval = getenv("LANGUAGE");
>> +     if (retval && *retval)
>> +             return retval;
>> +
>> +     retval = setlocale(LC_MESSAGES, NULL);
>> +     if (retval && *retval &&
>> +             strcmp(retval, "C") &&
>> +             strcmp(retval, "POSIX"))
>> +             return retval;
>> +
>> +     return NULL;
>> +}
>> +
>> +/*
>> + * Get an Accept-Language header which indicates user's preferred languages.
>> + *
>> + * Examples:
>> + *   LANGUAGE= -> ""
>> + *   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
>> + *   LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1"
>> + *   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
>> + *   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
>> + *   LANGUAGE= LANG=C -> ""
>> + */
>> +static struct strbuf *get_accept_language(void)
>> +{
>> +     const char *lang_begin, *pos;
>> +     int q, max_q;
>> +     int num_langs;
>> +     int decimal_places;
>> +     int is_codeset_or_modifier = 0;
>> +     static struct strbuf buf = STRBUF_INIT;
>> +     struct strbuf q_format_buf = STRBUF_INIT;
>> +     char *q_format;
>> +
>> +     if (cached_accept_language)
>> +             return cached_accept_language;
>> +
>> +     lang_begin = get_preferred_languages();
>> +
>> +     /* Don't add Accept-Language header if no language is preferred. */
>> +     if (!(lang_begin && *lang_begin)) {
>
> It is not wrong per-se, but given how hard get_preferred_languages()
> tries not to return a pointer to an empty string, this seems a bit
> overly defensive to me.

Thanks, I'll fix it.

>
>> +             cached_accept_language = &buf;
>> +             return cached_accept_language;
>
> It is somewhat unconventional to have a static pointer outside to
> point at a singleton and then have a singleton actually as a static
> structure.  I would have done without "buf" in this function and
> instead started this function like so:
>
>         if (cached_accept_language)
>                 return cached_accept_language;
>
>         cached_accept_language = xmalloc(sizeof(struct strbuf));
>         strbuf_init(cached_accept_language, 0);
>         lang_begin =  get_preferred_languages();
>         if (!lang_begin)
>                 return cached_accept_language;
>

Thanks, I'll fix it as you suggest.

>> +     }
>> +
>> +     /* Count number of preferred lang_begin to decide precision of q-factor */
>> +     for (num_langs = 1, pos = lang_begin; *pos; pos++)
>> +             if (*pos == ':')
>> +                     num_langs++;
>> +
>> +     /* Decide the precision for q-factor on number of preferred lang_begin. */
>> +     num_langs += 1; /* for '*' */
>
>
>> +     decimal_places = 1 + (num_langs > 10) + (num_langs > 100);
>
> What if you got 60000 languages ;-)?  I do not think we want to bend
> backwards and make the code list all 60000 of them, assigning a
> unique and decreasing q value to each of them, forming an overlong
> Accept-Language header, but at the same time, I do not think we want
> to show nonsense output because we compute the precision incorrectly
> here.

In that case, less-preferred 59,001 languages will have same q-value
of 0.001. Unfortunately, there is no way to represent user's
preferences of over 1,000 languages since the HTTP specification
restricts the range of q-value from 0.001 to 1.000 [1]. I think this
code reflects user's preferences better than Google chrome whose
minimum q-value is 0.2 and Mozilla firefox whose minimum q-value is
0.01 [2].

>
>> +     strbuf_addf(&q_format_buf, "; q=0.%%0%dd", decimal_places);
>> +     q_format = strbuf_detach(&q_format_buf, NULL);
>
> q_format_buf is an overkill use of strbuf, isn't it?  Just
>
>         char q_format_buf[32];
>         sprintf(q_format_buf, ";q=0.%%0%d", decimal_places);
>
> or something should be more than sufficient, no?

Thanks, I'll fix it as you suggest.

>
>> +     for (max_q = 1; decimal_places-- > 0;) max_q *= 10;
>
> As you have to do one loop like this that amounts to computing log10
> of num_langs, why not compute decimal_places the same way while at
> it?  It may also make sense to cap the number of languages to avoid
> spitting out overly long Accept-Language header with practicaly
> useless list of many languages.  That is, something along the lines
> of ... (note that I may very well have off-by-one or off-by-ten
> errors here you may need to tweak to get right):
>
>         if (MAX_LANGS < num_langs)
>                 num_langs = MAX_LANGS;
>         for (max_q = 1, decimal_places = 1;
>              max_q < num_langs;
>              decimal_places++, max_q *= 10)
>              ;
>
> If you are to use the MAX_LANGS cap, the main loop would also need
> to pay attention to it by breaking out of the loop early before you
> reach the end of the string, of course.

Thanks for good point. As you said, we should limit the length of the
value of Accept-Language header because some HTTP servers respond 4xx
Client Error if any header's value is very long (4KB or more) [3].

But MAX_LANGS may not be enough because the header can be too long
even if the number of languages does not exceed MAX_LANGS if language
tag is too long. Many of language tags are 2-3 characters but some are
11 characters; Even it is possible that a user has a very long custom
language tag.

I think this problem can be solved by one of these solutions:

A. Set MAX_LANGS conservatively (100 or less).
B. Limit the length of Accept-Language header directly (4KB or less).
C. Negotiate with server; Resend a request with shorter
Accept-Language header if the server responds 4xx error.

>
>> +     q = max_q;
>> +
>> +     strbuf_addstr(&buf, "Accept-Language: ");
>> +
>> +     /*
>> +      * Convert a list of colon-separated locale values [1][2] to a list of
>> +      * comma-separated language tags [3] which can be used as a value of
>> +      * Accept-Language header.
>> +      *
>> +      * [1]: http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html
>> +      * [2]: http://www.gnu.org/software/libc/manual/html_node/Using-gettextized-software.html
>> +      * [3]: http://tools.ietf.org/html/rfc7231#section-5.3.5
>> +      */
>> +     for (pos = lang_begin; ; pos++) {
>> +             if (*pos == ':' || !*pos) {
>> +                     /* Ignore if this character is the first one. */
>> +                     if (pos == lang_begin)
>> +                             continue;
>
> By doing this "ignore empty" here, but not doing the same when you
> count num_langs, are you potentially miscounting num_langs?

Yes, num_langs can be larger than actual number of languages. For
example, if LANGUAGE="en:" num_langs is 2. I wanted to make the logic
to compute num_langs as simple as possible and thought num_langs does
not need to be very accurate because it is used only to compute the
precision of q-factor (max_q and decimal_places).

Do we need to compute num_langs accurately?

>
>> +                     is_codeset_or_modifier = 0;
>> +
>> +                     /* Put a q-factor only if it is less than 1.0. */
>> +                     if (q < max_q)
>
> ... is it the same thing as "do not do this for the first round, but
> do so for all the other round"?

Yes, q-factor is q/max_q and q-factor of 1.0 is not necessary because:
> if no "q" parameter is present, the default weight is 1.
> -- http://tools.ietf.org/html/rfc7231#section-5.3.1

>
>> +                             strbuf_addf(&buf, q_format, q);
>> +
>> +                     if (q > 1)
>
> Hmm, I am puzzled.  C this ever be an issue (unless you have
> off-by-one error or you add "cap num_langs to MAX_LANGS", that is)?

Yes, it may be an issue if the number of languages is larger than the
max_q. max_q must not be larger than 1000 but the number of languages
may be.

But this if-statement will be not necessary if we limit the number of
languages to 1,000 or less (by MAX_LANGS you suggest).

>
>> +                             q--;
>
>> +                     /* NULL pos means this is the last language. */
>> +                     if (*pos)
>> +                             strbuf_addstr(&buf, ", ");
>> +                     else
>> +                             break;
>> +
>> +             } else if (is_codeset_or_modifier)
>> +                     continue;
>> +             else if (*pos == '.' || *pos == '@') /* Remove .codeset and @modifier. */
>> +                     is_codeset_or_modifier = 1;
>> +             else
>> +                     strbuf_addch(&buf, *pos == '_' ? '-' : *pos);
>> +     }
>> +
>> +     /* Don't add Accept-Language header if no language is preferred. */
>> +     if (q >= max_q) {
>
> Can q go over max_q, or is it "q may be max_q"?  In other words, is
> this essentially saying "if we did not find any language in the
> preferred languages list"?

Yes. q may be max_q if we did not find any language in the preferred
languages list. But there is no chance that q goes over max_q.

But it will be not necessary if num_langs is computed accurately.

>
>> +             cached_accept_language = &buf;
>> +             return cached_accept_language;
>> +     }
>> +
>> +     /* Add '*' with minimum q-factor greater than 0.0. */
>> +     strbuf_addstr(&buf, ", *");
>> +     strbuf_addf(&buf, q_format, 1);
>> +
>> +     cached_accept_language = &buf;
>> +     return cached_accept_language;
>> +}
>> +
>>  /* http_request() targets */
>>  #define HTTP_REQUEST_STRBUF  0
>>  #define HTTP_REQUEST_FILE    1
>> @@ -995,6 +1123,7 @@ static int http_request(const char *url,
>>       struct slot_results results;
>>       struct curl_slist *headers = NULL;
>>       struct strbuf buf = STRBUF_INIT;
>> +     struct strbuf* accept_language;
>
> As we write in C, not C++, our asterisks stick to the variable, not
> the type.

Thanks, I'll fix it.

[1]: http://tools.ietf.org/html/rfc7231#section-5.3.1
[2]: https://hg.mozilla.org/integration/mozilla-inbound/rev/1418f9ce6f8b
[3]: http://stackoverflow.com/questions/686217/maximum-on-http-header-values/8623061#8623061

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH v5 0/1] http: Add Accept-Language header if possible
  2014-07-19 17:58 [PATCH v4 0/1] http: Add Accept-Language header if possible Yi EungJun
  2014-07-19 17:58 ` [PATCH v4 1/1] " Yi EungJun
@ 2014-12-02 12:12 ` Yi EungJun
  2014-12-02 12:12   ` [PATCH v5 1/1] " Yi EungJun
  2015-03-06 16:13 ` [PATCH] http: Include locale.h when using setlocale() Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 58+ messages in thread
From: Yi EungJun @ 2014-12-02 12:12 UTC (permalink / raw)
  To: git; +Cc: Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting, Junio C Hamano

Changes since v4

* Fix styles as Junio C Hamano suggested.
* Limit number of languages and length of Accept-Language header.

Yi EungJun (1):
  http: Add Accept-Language header if possible

 http.c                     | 154 +++++++++++++++++++++++++++++++++++++++++++++
 remote-curl.c              |   2 +
 t/t5550-http-fetch-dumb.sh |  31 +++++++++
 3 files changed, 187 insertions(+)

-- 
2.2.0

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH v5 1/1] http: Add Accept-Language header if possible
  2014-12-02 12:12 ` [PATCH v5 0/1] " Yi EungJun
@ 2014-12-02 12:12   ` Yi EungJun
  2014-12-03 18:22     ` Junio C Hamano
  2014-12-03 19:31     ` Eric Sunshine
  0 siblings, 2 replies; 58+ messages in thread
From: Yi EungJun @ 2014-12-02 12:12 UTC (permalink / raw)
  To: git; +Cc: Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting, Junio C Hamano

From: Yi EungJun <eungjun.yi@navercorp.com>

Add an Accept-Language header which indicates the user's preferred
languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.

Examples:
  LANGUAGE= -> ""
  LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1"
  LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1"
  LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1"

This gives git servers a chance to display remote error messages in
the user's preferred language.

Limit the number of languages to 1,000 because q-value must not be
smaller than 0.001, and limit the length of Accept-Language header to
4,000 bytes for some HTTP servers which cannot accept such long header.

Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
---
 http.c                     | 154 +++++++++++++++++++++++++++++++++++++++++++++
 remote-curl.c              |   2 +
 t/t5550-http-fetch-dumb.sh |  31 +++++++++
 3 files changed, 187 insertions(+)

diff --git a/http.c b/http.c
index 040f362..69624af 100644
--- a/http.c
+++ b/http.c
@@ -68,6 +68,8 @@ static struct curl_slist *no_pragma_header;
 
 static struct active_request_slot *active_queue_head;
 
+static struct strbuf *cached_accept_language;
+
 size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
 {
 	size_t size = eltsize * nmemb;
@@ -515,6 +517,9 @@ void http_cleanup(void)
 		cert_auth.password = NULL;
 	}
 	ssl_cert_password_required = 0;
+
+	if (cached_accept_language)
+		strbuf_release(cached_accept_language);
 }
 
 struct active_request_slot *get_active_slot(void)
@@ -986,6 +991,149 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
 		strbuf_addstr(charset, "ISO-8859-1");
 }
 
+/*
+ * Guess the user's preferred languages from the value in LANGUAGE environment
+ * variable and LC_MESSAGES locale category.
+ *
+ * The result can be a colon-separated list like "ko:ja:en".
+ */
+static const char *get_preferred_languages(void)
+{
+	const char *retval;
+
+	retval = getenv("LANGUAGE");
+	if (retval && *retval)
+		return retval;
+
+	retval = setlocale(LC_MESSAGES, NULL);
+	if (retval && *retval &&
+		strcmp(retval, "C") &&
+		strcmp(retval, "POSIX"))
+		return retval;
+
+	return NULL;
+}
+
+/*
+ * Get an Accept-Language header which indicates user's preferred languages.
+ *
+ * Examples:
+ *   LANGUAGE= -> ""
+ *   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
+ *   LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1"
+ *   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
+ *   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
+ *   LANGUAGE= LANG=C -> ""
+ */
+static struct strbuf *get_accept_language(void)
+{
+	const char *lang_begin, *pos;
+	int q, max_q;
+	int num_langs;
+	int decimal_places;
+	int is_codeset_or_modifier = 0;
+	char q_format[32];
+	/*
+	 * MAX_LANGS must not be larger than 1,000. If it is larger than that,
+	 * q-value will be smaller than 0.001, the minimum q-value the HTTP
+	 * specification allows [1].
+	 *
+	 * [1]: http://tools.ietf.org/html/rfc7231#section-5.3.1
+	 */
+	const int MAX_LANGS = 1000;
+	const int MAX_SIZE_OF_HEADER = 4000;
+	int last_size = 0;
+
+	if (cached_accept_language)
+		return cached_accept_language;
+
+	cached_accept_language = xmalloc(sizeof(struct strbuf));
+	strbuf_init(cached_accept_language, 0);
+	lang_begin = get_preferred_languages();
+
+	/* Don't add Accept-Language header if no language is preferred. */
+	if (!lang_begin)
+		return cached_accept_language;
+
+	/* Count number of preferred lang_begin to decide precision of q-factor. */
+	for (num_langs = 1, pos = lang_begin; *pos; pos++)
+		if (*pos == ':')
+			num_langs++;
+
+	/* Decide the precision for q-factor on number of preferred lang_begin. */
+	num_langs += 1; /* for '*' */
+
+	if (MAX_LANGS < num_langs)
+		num_langs = MAX_LANGS;
+
+	for (max_q = 1, decimal_places = 0;
+		max_q < num_langs;
+		decimal_places++, max_q *= 10);
+
+	sprintf(q_format, ";q=0.%%0%dd", decimal_places);
+
+	q = max_q;
+
+	strbuf_addstr(cached_accept_language, "Accept-Language: ");
+
+	/*
+	 * Convert a list of colon-separated locale values [1][2] to a list of
+	 * comma-separated language tags [3] which can be used as a value of
+	 * Accept-Language header.
+	 *
+	 * [1]: http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html
+	 * [2]: http://www.gnu.org/software/libc/manual/html_node/Using-gettextized-software.html
+	 * [3]: http://tools.ietf.org/html/rfc7231#section-5.3.5
+	 */
+	for (pos = lang_begin; ; pos++) {
+		if (*pos == ':' || !*pos) {
+			/* Ignore if this character is the first one. */
+			if (pos == lang_begin)
+				continue;
+
+			is_codeset_or_modifier = 0;
+
+			/* Put a q-factor only if it is less than 1.0. */
+			if (q < max_q)
+				strbuf_addf(cached_accept_language, q_format, q);
+
+			if (q > 1)
+				q--;
+
+			last_size = cached_accept_language->len;
+
+			/* NULL pos means this is the last language. */
+			if (*pos)
+				strbuf_addstr(cached_accept_language, ", ");
+			else
+				break;
+
+		} else if (is_codeset_or_modifier)
+			continue;
+		else if (*pos == '.' || *pos == '@') /* Remove .codeset and @modifier. */
+			is_codeset_or_modifier = 1;
+		else
+			strbuf_addch(cached_accept_language, *pos == '_' ? '-' : *pos);
+
+		if (cached_accept_language->len > MAX_SIZE_OF_HEADER) {
+			strbuf_remove(cached_accept_language, last_size,
+					cached_accept_language->len - last_size);
+			break;
+		}
+	}
+
+	/* Don't add Accept-Language header if no language is preferred. */
+	if (q >= max_q) {
+		return cached_accept_language;
+	}
+
+	/* Add '*' with minimum q-factor greater than 0.0. */
+	strbuf_addstr(cached_accept_language, ", *");
+	strbuf_addf(cached_accept_language, q_format, 1);
+
+	return cached_accept_language;
+}
+
 /* http_request() targets */
 #define HTTP_REQUEST_STRBUF	0
 #define HTTP_REQUEST_FILE	1
@@ -998,6 +1146,7 @@ static int http_request(const char *url,
 	struct slot_results results;
 	struct curl_slist *headers = NULL;
 	struct strbuf buf = STRBUF_INIT;
+	struct strbuf *accept_language;
 	int ret;
 
 	slot = get_active_slot();
@@ -1023,6 +1172,11 @@ static int http_request(const char *url,
 					 fwrite_buffer);
 	}
 
+	accept_language = get_accept_language();
+
+	if (accept_language && accept_language->len > 0)
+		headers = curl_slist_append(headers, accept_language->buf);
+
 	strbuf_addstr(&buf, "Pragma:");
 	if (options && options->no_cache)
 		strbuf_addstr(&buf, " no-cache");
diff --git a/remote-curl.c b/remote-curl.c
index dd63bc2..04989e5 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -962,6 +962,8 @@ int main(int argc, const char **argv)
 	struct strbuf buf = STRBUF_INIT;
 	int nongit;
 
+	git_setup_gettext();
+
 	git_extract_argv0_path(argv[0]);
 	setup_git_directory_gently(&nongit);
 	if (argc < 2) {
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index ac71418..197c361 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -196,5 +196,36 @@ test_expect_success 'reencoding is robust to whitespace oddities' '
 	grep "this is the error message" stderr
 '
 
+check_language () {
+	echo "Accept-Language: $1\r" >expect &&
+	test_must_fail env \
+		GIT_CURL_VERBOSE=1 \
+		LANGUAGE=$2 \
+		LC_ALL=$3 \
+		LC_MESSAGES=$4 \
+		LANG=$5 \
+		git clone "$HTTPD_URL/accept/language" 2>stderr &&
+	grep -i ^Accept-Language: stderr >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'git client sends Accept-Language based on LANGUAGE, LC_ALL, LC_MESSAGES and LANG' '
+	check_language "ko-KR, *;q=0.1" ko_KR.UTF-8 de_DE.UTF-8 ja_JP.UTF-8 en_US.UTF-8 &&
+	check_language "de-DE, *;q=0.1" ""          de_DE.UTF-8 ja_JP.UTF-8 en_US.UTF-8 &&
+	check_language "ja-JP, *;q=0.1" ""          ""          ja_JP.UTF-8 en_US.UTF-8 &&
+	check_language "en-US, *;q=0.1" ""          ""          ""          en_US.UTF-8
+'
+
+test_expect_success 'git client sends Accept-Language with many preferred languages' '
+	check_language "ko-KR, en-US;q=0.99, fr-CA;q=0.98, de;q=0.97, sr;q=0.96, \
+ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, *;q=0.01" \
+		ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb
+'
+
+test_expect_success 'git client does not send Accept-Language' '
+	test_must_fail env GIT_CURL_VERBOSE=1 LANGUAGE= git clone "$HTTPD_URL/accept/language" 2>stderr &&
+	! grep "^Accept-Language:" stderr
+'
+
 stop_httpd
 test_done
-- 
2.2.0

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 1/1] http: Add Accept-Language header if possible
  2014-12-02 12:12   ` [PATCH v5 1/1] " Yi EungJun
@ 2014-12-03 18:22     ` Junio C Hamano
  2014-12-03 19:31     ` Eric Sunshine
  1 sibling, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2014-12-03 18:22 UTC (permalink / raw)
  To: Yi EungJun; +Cc: git, Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting

Yi EungJun <semtlenori@gmail.com> writes:

> From: Yi EungJun <eungjun.yi@navercorp.com>
>
> Add an Accept-Language header which indicates the user's preferred
> languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.
>
> Examples:
>   LANGUAGE= -> ""
>   LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1"
>   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1"
>   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1"
>
> This gives git servers a chance to display remote error messages in
> the user's preferred language.
>
> Limit the number of languages to 1,000 because q-value must not be
> smaller than 0.001, and limit the length of Accept-Language header to
> 4,000 bytes for some HTTP servers which cannot accept such long header.
>
> Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
> ---
>  http.c                     | 154 +++++++++++++++++++++++++++++++++++++++++++++
>  remote-curl.c              |   2 +
>  t/t5550-http-fetch-dumb.sh |  31 +++++++++
>  3 files changed, 187 insertions(+)
>
> diff --git a/http.c b/http.c
> index 040f362..69624af 100644
> --- a/http.c
> +++ b/http.c
> @@ -68,6 +68,8 @@ static struct curl_slist *no_pragma_header;
>  
>  static struct active_request_slot *active_queue_head;
>  
> +static struct strbuf *cached_accept_language;
> +
>  size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
>  {
>  	size_t size = eltsize * nmemb;
> @@ -515,6 +517,9 @@ void http_cleanup(void)
>  		cert_auth.password = NULL;
>  	}
>  	ssl_cert_password_required = 0;
> +
> +	if (cached_accept_language)
> +		strbuf_release(cached_accept_language);

Is this correct?

You still keep cached_accept_language pointer itself, so the next
call to get_accept_language() would say "Ah, cached_accept_language
is there, so its contents (which is empty because we released it
here) must be valid and to be reused".  Perhaps you want to free it,
too, i.e.

	if (cached_accept_language) {
		strbuf_release(cached_accept_language);
                free(cached_accept_language);
                cached_accept_language = NULL;
	}

or something?

> +static struct strbuf *get_accept_language(void)
> +{
> + ...
> +	if (cached_accept_language)
> +		return cached_accept_language;
> +
> +	cached_accept_language = xmalloc(sizeof(struct strbuf));
> + ...
> +	for (max_q = 1, decimal_places = 0;
> +		max_q < num_langs;
> +		decimal_places++, max_q *= 10);

Have that "empty statement" on its own separate line, i.e.

	for (a, counter = 0;
             b;
             c, counter++)
             ; /* just counting */

Alternatively, you can make it more obvious that the purpose of loop
is to count, i.e.

	for (a, counter = 0;
             b;
             c)
             counter++;

> +test_expect_success 'git client does not send Accept-Language' '
> +	test_must_fail env GIT_CURL_VERBOSE=1 LANGUAGE= git clone "$HTTPD_URL/accept/language" 2>stderr &&
> +	! grep "^Accept-Language:" stderr
> +'

Hmph, this test smells a bit brittle.  What is the reason you expect
"git clone" to fail?  Is it because there is no repository at the
named URL at "$HTTPD_URL/accept/language"?  Is that the only plausible
reason for a failure?

It might be better to use the URL to a repository that is expected
to be served by the server started in this test and expect success.
If it bothers you that "clone" creates a new copy that is otherwise
unused by this test, you can use something like "ls-remote" instead,
I would think.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 1/1] http: Add Accept-Language header if possible
  2014-12-02 12:12   ` [PATCH v5 1/1] " Yi EungJun
  2014-12-03 18:22     ` Junio C Hamano
@ 2014-12-03 19:31     ` Eric Sunshine
  2014-12-03 21:37       ` Junio C Hamano
  1 sibling, 1 reply; 58+ messages in thread
From: Eric Sunshine @ 2014-12-03 19:31 UTC (permalink / raw)
  To: Yi EungJun
  Cc: Git List, Yi EungJun, Jeff King, Peter Krefting, Junio C Hamano

On Tue, Dec 2, 2014 at 7:12 AM, Yi EungJun <semtlenori@gmail.com> wrote:
> From: Yi EungJun <eungjun.yi@navercorp.com>
>
> Add an Accept-Language header which indicates the user's preferred
> languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.
>
> Examples:
>   LANGUAGE= -> ""
>   LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1"
>   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1"
>   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1"
>
> This gives git servers a chance to display remote error messages in
> the user's preferred language.
>
> Limit the number of languages to 1,000 because q-value must not be
> smaller than 0.001, and limit the length of Accept-Language header to
> 4,000 bytes for some HTTP servers which cannot accept such long header.
>
> Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
> ---
> diff --git a/http.c b/http.c
> index 040f362..69624af 100644
> --- a/http.c
> +++ b/http.c
> @@ -68,6 +68,8 @@ static struct curl_slist *no_pragma_header;
>
>  static struct active_request_slot *active_queue_head;
>
> +static struct strbuf *cached_accept_language;

Is there a reason this needs to be a pointer to a strbuf rather than
just a strbuf? That is wouldn't this work?

    static struct strbuf cached_accept_language = STRBUF_INIT;

>  size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
>  {
>         size_t size = eltsize * nmemb;
> @@ -515,6 +517,9 @@ void http_cleanup(void)
>                 cert_auth.password = NULL;
>         }
>         ssl_cert_password_required = 0;
> +
> +       if (cached_accept_language)
> +               strbuf_release(cached_accept_language);

Junio already mentioned that this is leaking the memory of the strbuf
struct itself which was xmalloc()'d by get_accept_language().

>  }
>
>  struct active_request_slot *get_active_slot(void)
> @@ -986,6 +991,149 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
>                 strbuf_addstr(charset, "ISO-8859-1");
>  }
>
> +/*
> + * Guess the user's preferred languages from the value in LANGUAGE environment
> + * variable and LC_MESSAGES locale category.
> + *
> + * The result can be a colon-separated list like "ko:ja:en".
> + */
> +static const char *get_preferred_languages(void)
> +{
> +       const char *retval;
> +
> +       retval = getenv("LANGUAGE");
> +       if (retval && *retval)
> +               return retval;
> +
> +       retval = setlocale(LC_MESSAGES, NULL);
> +       if (retval && *retval &&
> +               strcmp(retval, "C") &&
> +               strcmp(retval, "POSIX"))
> +               return retval;
> +
> +       return NULL;

Mental note: This function will never return an empty string "", even
if LANGUAGE or LC_MESSAGES is empty.

> +}
> +
> +/*
> + * Get an Accept-Language header which indicates user's preferred languages.
> + *
> + * Examples:
> + *   LANGUAGE= -> ""
> + *   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
> + *   LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1"
> + *   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
> + *   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
> + *   LANGUAGE= LANG=C -> ""
> + */
> +static struct strbuf *get_accept_language(void)

I find this API a bit strange. Use of strbuf to construct the returned
string is an implementation detail of this function. From the caller's
point of view, it should just be receiving a constant string: one
which it needs neither to modify nor free. Also, if the caller were to
modify the returned strbuf for some reason, then that modification
would impact all future calls to get_accept_language() since the
strbuf is 'static' and not recomputed. Instead, I would expect the
declaration to be:

    static const char *get_accept_language(void)

> +{
> +       const char *lang_begin, *pos;
> +       int q, max_q;
> +       int num_langs;
> +       int decimal_places;
> +       int is_codeset_or_modifier = 0;
> +       char q_format[32];
> +       const int MAX_LANGS = 1000;
> +       const int MAX_SIZE_OF_HEADER = 4000;
> +       int last_size = 0;
> +
> +       if (cached_accept_language)
> +               return cached_accept_language;
> +
> +       cached_accept_language = xmalloc(sizeof(struct strbuf));
> +       strbuf_init(cached_accept_language, 0);
> +       lang_begin = get_preferred_languages();

Mental note: lang_begin will never be the empty string "".

> +       /* Don't add Accept-Language header if no language is preferred. */
> +       if (!lang_begin)
> +               return cached_accept_language;
> +
> +       /* Count number of preferred lang_begin to decide precision of q-factor. */
> +       for (num_langs = 1, pos = lang_begin; *pos; pos++)
> +               if (*pos == ':')
> +                       num_langs++;
> +
> +       /* Decide the precision for q-factor on number of preferred lang_begin. */
> +       num_langs += 1; /* for '*' */
> +
> +       if (MAX_LANGS < num_langs)
> +               num_langs = MAX_LANGS;
> +
> +       for (max_q = 1, decimal_places = 0;
> +               max_q < num_langs;
> +               decimal_places++, max_q *= 10);
> +
> +       sprintf(q_format, ";q=0.%%0%dd", decimal_places);
> +
> +       q = max_q;
> +
> +       strbuf_addstr(cached_accept_language, "Accept-Language: ");
> +
> +       for (pos = lang_begin; ; pos++) {
> +               if (*pos == ':' || !*pos) {
> +                       /* Ignore if this character is the first one. */
> +                       if (pos == lang_begin)
> +                               continue;

If lang_begin were ever to point at an empty string "", then this
logic would access memory beyond the end-of-string. Since
get_preferred_languages() won't return an empty string, this case is
safe, but it's not obvious to the casual reader and some person
modifying the code in the future might not notice this restriction. I
personally would feel more comfortable if the no-empty-string
assumption was documented formally with an assert(*lang_begin) or
die() before entering the loop; or if the code was rewritten to be
less fragile.

> +                       is_codeset_or_modifier = 0;
> +
> +                       /* Put a q-factor only if it is less than 1.0. */
> +                       if (q < max_q)
> +                               strbuf_addf(cached_accept_language, q_format, q);
> +
> +                       if (q > 1)
> +                               q--;
> +
> +                       last_size = cached_accept_language->len;
> +
> +                       /* NULL pos means this is the last language. */

Nit: This feels somewhat backward. It sounds like the comment is
explaining the 'then' part of the 'if' rather than the 'else' part.
Perhaps rephrase like this:

    /* non-NULL pos means more languages */

or, better yet, just drop the comment since the code is self-explanatory.

> +                       if (*pos)
> +                               strbuf_addstr(cached_accept_language, ", ");
> +                       else
> +                               break;
> +
> +               } else if (is_codeset_or_modifier)
> +                       continue;
> +               else if (*pos == '.' || *pos == '@') /* Remove .codeset and @modifier. */
> +                       is_codeset_or_modifier = 1;
> +               else
> +                       strbuf_addch(cached_accept_language, *pos == '_' ? '-' : *pos);
> +
> +               if (cached_accept_language->len > MAX_SIZE_OF_HEADER) {

Mental note: Here you respect MAX_SIZE_OF_HEADER.

> +                       strbuf_remove(cached_accept_language, last_size,
> +                                       cached_accept_language->len - last_size);
> +                       break;
> +               }
> +       }
> +
> +       /* Don't add Accept-Language header if no language is preferred. */

Is this comment correct? The "Accept-Language:" header was already
added to cached_accept_language much earlier in the function.

> +       if (q >= max_q) {
> +               return cached_accept_language;
> +       }

Style: Unnecessary braces.

> +       /* Add '*' with minimum q-factor greater than 0.0. */
> +       strbuf_addstr(cached_accept_language, ", *");
> +       strbuf_addf(cached_accept_language, q_format, 1);

Here you don't respect MAX_SIZE_OF_HEADER.

> +
> +       return cached_accept_language;
> +}
> +
>  /* http_request() targets */
>  #define HTTP_REQUEST_STRBUF    0
>  #define HTTP_REQUEST_FILE      1
> @@ -998,6 +1146,7 @@ static int http_request(const char *url,
>         struct slot_results results;
>         struct curl_slist *headers = NULL;
>         struct strbuf buf = STRBUF_INIT;
> +       struct strbuf *accept_language;
>         int ret;
>
>         slot = get_active_slot();
> @@ -1023,6 +1172,11 @@ static int http_request(const char *url,
>                                          fwrite_buffer);
>         }
>
> +       accept_language = get_accept_language();
> +
> +       if (accept_language && accept_language->len > 0)
> +               headers = curl_slist_append(headers, accept_language->buf);
> +
>         strbuf_addstr(&buf, "Pragma:");
>         if (options && options->no_cache)
>                 strbuf_addstr(&buf, " no-cache");

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 1/1] http: Add Accept-Language header if possible
  2014-12-03 19:31     ` Eric Sunshine
@ 2014-12-03 21:37       ` Junio C Hamano
  2014-12-03 22:00         ` Michael Blume
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2014-12-03 21:37 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Yi EungJun, Git List, Yi EungJun, Jeff King, Peter Krefting

Eric Sunshine <sunshine@sunshineco.com> writes:

>> @@ -515,6 +517,9 @@ void http_cleanup(void)
>>                 cert_auth.password = NULL;
>>         }
>>         ssl_cert_password_required = 0;
>> +
>> +       if (cached_accept_language)
>> +               strbuf_release(cached_accept_language);
>
> Junio already mentioned that this is leaking the memory of the strbuf
> struct itself which was xmalloc()'d by get_accept_language().

I actually didn't ;-)  A singleton cached_accept_language strbuf
itself being kept around, with its reuse by get_accept_language(),
is fine and is not a leak.  But clearing the strbuf alone will
introduce correctness problem---the second HTTP connection will see
an empty strbuf, get_accept_language() will say "we've already
computed and the header we must issue is an empty string", which is
not correct.

In the fix-up "SQUASH???" commit I queued on top of this patch on
'pu', I had to run "sort -u" on the output to the standard error
stream, as there seemed to be two HTTP connections and the actual
output had two headers, even though the test expected only one in
the output.  I suspect that it is a fallout from this bug that the
original code passed that test that expects only one.

>> +static struct strbuf *get_accept_language(void)
>
> I find this API a bit strange. Use of strbuf to construct the returned
> string is an implementation detail of this function. From the caller's
> point of view, it should just be receiving a constant string: one
> which it needs neither to modify nor free. Also, if the caller were to
> modify the returned strbuf for some reason, then that modification
> would impact all future calls to get_accept_language() since the
> strbuf is 'static' and not recomputed. Instead, I would expect the
> declaration to be:
>
>     static const char *get_accept_language(void)

Makes sense to me.

>> +                       /* Put a q-factor only if it is less than 1.0. */
>> +                       if (q < max_q)
>> +                               strbuf_addf(cached_accept_language, q_format, q);
>> +
>> +                       if (q > 1)
>> +                               q--;

I didn't mention this but if q ever goes below 1, wouldn't it mean
that there is no point continuing this loop?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 1/1] http: Add Accept-Language header if possible
  2014-12-03 21:37       ` Junio C Hamano
@ 2014-12-03 22:00         ` Michael Blume
  2014-12-03 22:06           ` Michael Blume
  0 siblings, 1 reply; 58+ messages in thread
From: Michael Blume @ 2014-12-03 22:00 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Eric Sunshine, Yi EungJun, Git List, Yi EungJun, Jeff King,
	Peter Krefting

On Wed, Dec 3, 2014 at 1:37 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Eric Sunshine <sunshine@sunshineco.com> writes:
>
>>> @@ -515,6 +517,9 @@ void http_cleanup(void)
>>>                 cert_auth.password = NULL;
>>>         }
>>>         ssl_cert_password_required = 0;
>>> +
>>> +       if (cached_accept_language)
>>> +               strbuf_release(cached_accept_language);
>>
>> Junio already mentioned that this is leaking the memory of the strbuf
>> struct itself which was xmalloc()'d by get_accept_language().
>
> I actually didn't ;-)  A singleton cached_accept_language strbuf
> itself being kept around, with its reuse by get_accept_language(),
> is fine and is not a leak.  But clearing the strbuf alone will
> introduce correctness problem---the second HTTP connection will see
> an empty strbuf, get_accept_language() will say "we've already
> computed and the header we must issue is an empty string", which is
> not correct.
>
> In the fix-up "SQUASH???" commit I queued on top of this patch on
> 'pu', I had to run "sort -u" on the output to the standard error
> stream, as there seemed to be two HTTP connections and the actual
> output had two headers, even though the test expected only one in
> the output.  I suspect that it is a fallout from this bug that the
> original code passed that test that expects only one.
>
>>> +static struct strbuf *get_accept_language(void)
>>
>> I find this API a bit strange. Use of strbuf to construct the returned
>> string is an implementation detail of this function. From the caller's
>> point of view, it should just be receiving a constant string: one
>> which it needs neither to modify nor free. Also, if the caller were to
>> modify the returned strbuf for some reason, then that modification
>> would impact all future calls to get_accept_language() since the
>> strbuf is 'static' and not recomputed. Instead, I would expect the
>> declaration to be:
>>
>>     static const char *get_accept_language(void)
>
> Makes sense to me.
>
>>> +                       /* Put a q-factor only if it is less than 1.0. */
>>> +                       if (q < max_q)
>>> +                               strbuf_addf(cached_accept_language, q_format, q);
>>> +
>>> +                       if (q > 1)
>>> +                               q--;
>
> I didn't mention this but if q ever goes below 1, wouldn't it mean
> that there is no point continuing this loop?
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
This seems to be failing under Mac OS for me

not ok 25 - git client sends Accept-Language based on LANGUAGE,
LC_ALL, LC_MESSAGES and LANG
#
# check_language "ko-KR, *;q=0.1" ko_KR.UTF-8 de_DE.UTF-8 ja_JP.UTF-8
en_US.UTF-8 &&
# check_language "de-DE, *;q=0.1" ""          de_DE.UTF-8 ja_JP.UTF-8
en_US.UTF-8 &&
# check_language "ja-JP, *;q=0.1" ""          ""          ja_JP.UTF-8
en_US.UTF-8 &&
# check_language "en-US, *;q=0.1" ""          ""          ""
en_US.UTF-8
#

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 1/1] http: Add Accept-Language header if possible
  2014-12-03 22:00         ` Michael Blume
@ 2014-12-03 22:06           ` Michael Blume
  2014-12-22 16:44             ` [PATCH v6 0/1] " Yi EungJun
  0 siblings, 1 reply; 58+ messages in thread
From: Michael Blume @ 2014-12-03 22:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Eric Sunshine, Yi EungJun, Git List, Yi EungJun, Jeff King,
	Peter Krefting

On Wed, Dec 3, 2014 at 2:00 PM, Michael Blume <blume.mike@gmail.com> wrote:
> On Wed, Dec 3, 2014 at 1:37 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Eric Sunshine <sunshine@sunshineco.com> writes:
>>
>>>> @@ -515,6 +517,9 @@ void http_cleanup(void)
>>>>                 cert_auth.password = NULL;
>>>>         }
>>>>         ssl_cert_password_required = 0;
>>>> +
>>>> +       if (cached_accept_language)
>>>> +               strbuf_release(cached_accept_language);
>>>
>>> Junio already mentioned that this is leaking the memory of the strbuf
>>> struct itself which was xmalloc()'d by get_accept_language().
>>
>> I actually didn't ;-)  A singleton cached_accept_language strbuf
>> itself being kept around, with its reuse by get_accept_language(),
>> is fine and is not a leak.  But clearing the strbuf alone will
>> introduce correctness problem---the second HTTP connection will see
>> an empty strbuf, get_accept_language() will say "we've already
>> computed and the header we must issue is an empty string", which is
>> not correct.
>>
>> In the fix-up "SQUASH???" commit I queued on top of this patch on
>> 'pu', I had to run "sort -u" on the output to the standard error
>> stream, as there seemed to be two HTTP connections and the actual
>> output had two headers, even though the test expected only one in
>> the output.  I suspect that it is a fallout from this bug that the
>> original code passed that test that expects only one.
>>
>>>> +static struct strbuf *get_accept_language(void)
>>>
>>> I find this API a bit strange. Use of strbuf to construct the returned
>>> string is an implementation detail of this function. From the caller's
>>> point of view, it should just be receiving a constant string: one
>>> which it needs neither to modify nor free. Also, if the caller were to
>>> modify the returned strbuf for some reason, then that modification
>>> would impact all future calls to get_accept_language() since the
>>> strbuf is 'static' and not recomputed. Instead, I would expect the
>>> declaration to be:
>>>
>>>     static const char *get_accept_language(void)
>>
>> Makes sense to me.
>>
>>>> +                       /* Put a q-factor only if it is less than 1.0. */
>>>> +                       if (q < max_q)
>>>> +                               strbuf_addf(cached_accept_language, q_format, q);
>>>> +
>>>> +                       if (q > 1)
>>>> +                               q--;
>>
>> I didn't mention this but if q ever goes below 1, wouldn't it mean
>> that there is no point continuing this loop?
>> --
>> To unsubscribe from this list: send the line "unsubscribe git" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> This seems to be failing under Mac OS for me
>
> not ok 25 - git client sends Accept-Language based on LANGUAGE,
> LC_ALL, LC_MESSAGES and LANG
> #
> # check_language "ko-KR, *;q=0.1" ko_KR.UTF-8 de_DE.UTF-8 ja_JP.UTF-8
> en_US.UTF-8 &&
> # check_language "de-DE, *;q=0.1" ""          de_DE.UTF-8 ja_JP.UTF-8
> en_US.UTF-8 &&
> # check_language "ja-JP, *;q=0.1" ""          ""          ja_JP.UTF-8
> en_US.UTF-8 &&
> # check_language "en-US, *;q=0.1" ""          ""          ""
> en_US.UTF-8
> #


verbose results

Initialized empty Git repository in
/Users/Shared/Jenkins/Home/jobs/git/workspace/t/trash
directory.t5550-http-fetch-dumb/.git/
expecting success:
git config push.default matching &&
echo content1 >file &&
git add file &&
git commit -m one
echo content2 >file &&
git add file &&
git commit -m two

[master (root-commit) f5983e9] one
 Author: A U Thor <author@example.com>
 1 file changed, 1 insertion(+)
 create mode 100644 file
[master 2ff8a06] two
 Author: A U Thor <author@example.com>
 1 file changed, 1 insertion(+), 1 deletion(-)
ok 1 - setup repository

expecting success:
cp -R .git "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
(cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
git config core.bare true &&
mkdir -p hooks &&
echo "exec git update-server-info" >hooks/post-update &&
chmod +x hooks/post-update &&
hooks/post-update
) &&
git remote add public "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
git push public master:master

Everything up-to-date
ok 2 - create http-accessible bare repository with loose objects

expecting success:
git clone $HTTPD_URL/dumb/repo.git clone-tmpl &&
cp -R clone-tmpl clone &&
test_cmp file clone/file

Cloning into 'clone-tmpl'...
ok 3 - clone http repository

expecting success:
mkdir -p "$HTTPD_DOCUMENT_ROOT_PATH/auth/dumb/" &&
cp -Rf "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
      "$HTTPD_DOCUMENT_ROOT_PATH/auth/dumb/repo.git"

ok 4 - create password-protected repository

expecting success:
write_script "$TRASH_DIRECTORY/askpass" <<-\EOF &&
echo >>"$TRASH_DIRECTORY/askpass-query" "askpass: $*" &&
case "$*" in
*Username*)
what=user
;;
*Password*)
what=pass
;;
esac &&
cat "$TRASH_DIRECTORY/askpass-$what"
EOF
GIT_ASKPASS="$TRASH_DIRECTORY/askpass" &&
export GIT_ASKPASS &&
export TRASH_DIRECTORY

ok 5 - setup askpass helper

expecting success:
set_askpass wrong &&
test_must_fail git clone "$HTTPD_URL/auth/dumb/repo.git" clone-auth-fail &&
expect_askpass both wrong

Cloning into 'clone-auth-fail'...
fatal: Authentication failed for 'http://127.0.0.1:5550/auth/dumb/repo.git/'
ok 6 - cloning password-protected repository can fail

expecting success:
set_askpass wrong &&
git clone "$HTTPD_URL_USER_PASS/auth/dumb/repo.git" clone-auth-none &&
expect_askpass none

Cloning into 'clone-auth-none'...
ok 7 - http auth can use user/pass in URL

expecting success:
set_askpass wrong pass@host &&
git clone "$HTTPD_URL_USER/auth/dumb/repo.git" clone-auth-pass &&
expect_askpass pass user@host

Cloning into 'clone-auth-pass'...
ok 8 - http auth can use just user in URL

expecting success:
set_askpass user@host pass@host &&
git clone "$HTTPD_URL/auth/dumb/repo.git" clone-auth-both &&
expect_askpass both user@host

Cloning into 'clone-auth-both'...
ok 9 - http auth can request both user and pass

expecting success:
test_config_global credential.helper "!f() {
cat >/dev/null
echo username=user@host
echo password=pass@host
}; f" &&
set_askpass wrong &&
git clone "$HTTPD_URL/auth/dumb/repo.git" clone-auth-helper &&
expect_askpass none

Cloning into 'clone-auth-helper'...
ok 10 - http auth respects credential helper config

expecting success:
test_config_global "credential.$HTTPD_URL.username" user@host &&
set_askpass wrong pass@host &&
git clone "$HTTPD_URL/auth/dumb/repo.git" clone-auth-user &&
expect_askpass pass user@host

Cloning into 'clone-auth-user'...
ok 11 - http auth can get username from config

expecting success:
test_config_global "credential.$HTTPD_URL.username" wrong &&
set_askpass wrong pass@host &&
git clone "$HTTPD_URL_USER/auth/dumb/repo.git" clone-auth-user2 &&
expect_askpass pass user@host

Cloning into 'clone-auth-user2'...
ok 12 - configured username does not override URL

expecting success:
echo content >>file &&
git commit -a -m two &&
git push public &&
(cd clone && git pull) &&
test_cmp file clone/file

[master d4af499] two
 Author: A U Thor <author@example.com>
 1 file changed, 1 insertion(+)
To /Users/Shared/Jenkins/Home/jobs/git/workspace/t/trash
directory.t5550-http-fetch-dumb/httpd/www/repo.git
   2ff8a06..d4af499  master -> master
From http://127.0.0.1:5550/dumb/repo
   2ff8a06..d4af499  master     -> origin/master
Updating 2ff8a06..d4af499
Fast-forward
 file | 1 +
 1 file changed, 1 insertion(+)
ok 13 - fetch changes via http

expecting success:
cp -R clone-tmpl clone2 &&

HEAD=$(git rev-parse --verify HEAD) &&
(cd clone2 &&
git http-fetch -a -w heads/master-new $HEAD $(git config remote.origin.url) &&
git checkout master-new &&
test $HEAD = $(git rev-parse --verify HEAD)) &&
test_cmp file clone2/file

Switched to branch 'master-new'
ok 14 - fetch changes via manual http-fetch

expecting success:
git push public master:other &&
(cd clone &&
git remote set-head origin -d &&
git remote set-head origin -a &&
git symbolic-ref refs/remotes/origin/HEAD > output &&
echo refs/remotes/origin/master > expect &&
test_cmp expect output
)

To /Users/Shared/Jenkins/Home/jobs/git/workspace/t/trash
directory.t5550-http-fetch-dumb/httpd/www/repo.git
 * [new branch]      master -> other
origin/HEAD set to master
ok 15 - http remote detects correct HEAD

expecting success:
cp -R "$HTTPD_DOCUMENT_ROOT_PATH"/repo.git
"$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git &&
(cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git &&
git --bare repack -a -d
) &&
git clone $HTTPD_URL/dumb/repo_pack.git

Cloning into 'repo_pack'...
ok 16 - fetch packed objects

expecting success:
cp -R "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git
"$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad1.git &&
(cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad1.git &&
p=`ls objects/pack/pack-*.pack` &&
chmod u+w $p &&
printf %0256d 0 | dd of=$p bs=256 count=1 seek=1 conv=notrunc
) &&
mkdir repo_bad1.git &&
(cd repo_bad1.git &&
git --bare init &&
test_must_fail git --bare fetch $HTTPD_URL/dumb/repo_bad1.git &&
test 0 = `ls objects/pack/pack-*.pack | wc -l`
)

1+0 records in
1+0 records out
256 bytes transferred in 0.000014 secs (18512790 bytes/sec)
Initialized empty Git repository in
/Users/Shared/Jenkins/Home/jobs/git/workspace/t/trash
directory.t5550-http-fetch-dumb/repo_bad1.git/
fatal: pack has bad object at offset 168: inflate returned -5
error: Unable to find d4af499a00b28bc0ab78fa94cc6a449fae19b08d under
http://127.0.0.1:5550/dumb/repo_bad1.git
Cannot obtain needed object d4af499a00b28bc0ab78fa94cc6a449fae19b08d
error: fetch failed.
ls: objects/pack/pack-*.pack: No such file or directory
ok 17 - fetch notices corrupt pack

expecting success:
cp -R "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git
"$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad2.git &&
(cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad2.git &&
p=`ls objects/pack/pack-*.idx` &&
chmod u+w $p &&
printf %0256d 0 | dd of=$p bs=256 count=1 seek=1 conv=notrunc
) &&
mkdir repo_bad2.git &&
(cd repo_bad2.git &&
git --bare init &&
test_must_fail git --bare fetch $HTTPD_URL/dumb/repo_bad2.git &&
test 0 = `ls objects/pack | wc -l`
)

1+0 records in
1+0 records out
256 bytes transferred in 0.000013 secs (19522579 bytes/sec)
Initialized empty Git repository in
/Users/Shared/Jenkins/Home/jobs/git/workspace/t/trash
directory.t5550-http-fetch-dumb/repo_bad2.git/
error: non-monotonic index
/Users/Shared/Jenkins/Home/jobs/git/workspace/t/trash
directory.t5550-http-fetch-dumb/repo_bad2.git/objects/pack/pack-7244949d8d6e59a30923b3fff7801f159ef4ba5d.idx.temp
error: Unable to find d4af499a00b28bc0ab78fa94cc6a449fae19b08d under
http://127.0.0.1:5550/dumb/repo_bad2.git
Cannot obtain needed object d4af499a00b28bc0ab78fa94cc6a449fae19b08d
error: fetch failed.
ok 18 - fetch notices corrupt idx

expecting success:
grep /git-upload-pack <"$HTTPD_ROOT_PATH"/access.log >act
: >exp
test_cmp exp act

ok 19 - did not use upload-pack service

expecting success:
test_must_fail git clone "$HTTPD_URL/error/text" 2>stderr &&
grep "this is the error message" stderr

remote: this is the error message
ok 20 - git client shows text/plain errors

expecting success:
test_must_fail git clone "$HTTPD_URL/error/html" 2>stderr &&
! grep "this is the error message" stderr

ok 21 - git client does not show html errors

expecting success:
test_must_fail git clone "$HTTPD_URL/error/charset" 2>stderr &&
grep "this is the error message" stderr

remote: this is the error message
ok 22 - git client shows text/plain with a charset

expecting success:
test_must_fail git clone "$HTTPD_URL/error/utf16" 2>stderr &&
grep "this is the error message" stderr

remote: this is the error message
ok 23 - http error messages are reencoded

expecting success:
test_must_fail git clone "$HTTPD_URL/error/odd-spacing" 2>stderr &&
grep "this is the error message" stderr

remote: this is the error message
ok 24 - reencoding is robust to whitespace oddities

expecting success:
check_language "ko-KR, *;q=0.1" ko_KR.UTF-8 de_DE.UTF-8 ja_JP.UTF-8
en_US.UTF-8 &&
check_language "de-DE, *;q=0.1" ""          de_DE.UTF-8 ja_JP.UTF-8
en_US.UTF-8 &&
check_language "ja-JP, *;q=0.1" ""          ""          ja_JP.UTF-8
en_US.UTF-8 &&
check_language "en-US, *;q=0.1" ""          ""          ""          en_US.UTF-8

not ok 25 - git client sends Accept-Language based on LANGUAGE,
LC_ALL, LC_MESSAGES and LANG
#
# check_language "ko-KR, *;q=0.1" ko_KR.UTF-8 de_DE.UTF-8 ja_JP.UTF-8
en_US.UTF-8 &&
# check_language "de-DE, *;q=0.1" ""          de_DE.UTF-8 ja_JP.UTF-8
en_US.UTF-8 &&
# check_language "ja-JP, *;q=0.1" ""          ""          ja_JP.UTF-8
en_US.UTF-8 &&
# check_language "en-US, *;q=0.1" ""          ""          ""
en_US.UTF-8
#

expecting success:
check_language "ko-KR, en-US;q=0.99, fr-CA;q=0.98, de;q=0.97, sr;q=0.96, \
ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, *;q=0.01" \
ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb

ok 26 - git client sends Accept-Language with many preferred languages

expecting success:
GIT_CURL_VERBOSE=1 LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git"
2>stderr &&
! grep "^Accept-Language:" stderr

d4af499a00b28bc0ab78fa94cc6a449fae19b08d HEAD
d4af499a00b28bc0ab78fa94cc6a449fae19b08d refs/heads/master
d4af499a00b28bc0ab78fa94cc6a449fae19b08d refs/heads/other
ok 27 - git client does not send an empty Accept-Language

# failed 1 among 27 test(s)
1..27

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH v6 0/1] http: Add Accept-Language header if possible
  2014-12-03 22:06           ` Michael Blume
@ 2014-12-22 16:44             ` Yi EungJun
  2014-12-22 16:44               ` [PATCH v6 1/1] " Yi EungJun
  0 siblings, 1 reply; 58+ messages in thread
From: Yi EungJun @ 2014-12-22 16:44 UTC (permalink / raw)
  To: git
  Cc: Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting,
	Junio C Hamano, Michael Blume

Changes since v5

>From Junio C Hamano's review:

* The tests use `ls-remote` instead of `clone` for tests; I copied the test
  code from ba8e63dc30a80656fddc616f714fb217ad220c04.

* Set cached_accept_langauge to NULL after free it.

>From Eric Sunshine's review:

* get_accept_language() returns a pointer to const char instead of strbuf; the
  type of cached_accept_language also has been changed to char* from strbuf*

* write_accept_language(), which is extracted from get_accept_language(),
  respects MAX_SIZE_OF_HEADER.

* The for-loop in write_accept_language() works correctly if lang_begin points
  an empty string.

>From Jeff King's advice:

* get_preferred_languages() considers LC_MESSAGES only if NO_GETTEXT is not
  defined.

* Remove the tests for LC_MESSAGES, LANG and LC_ALL.

Yi EungJun (1):
  http: Add Accept-Language header if possible

 http.c                     | 173 +++++++++++++++++++++++++++++++++++++++++++++
 remote-curl.c              |   2 +
 t/t5550-http-fetch-dumb.sh |  32 +++++++++
 3 files changed, 207 insertions(+)

-- 
2.2.0.375.gcd18ce6.dirty

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH v6 1/1] http: Add Accept-Language header if possible
  2014-12-22 16:44             ` [PATCH v6 0/1] " Yi EungJun
@ 2014-12-22 16:44               ` Yi EungJun
  2014-12-22 19:34                 ` Junio C Hamano
  2014-12-24 20:35                 ` Eric Sunshine
  0 siblings, 2 replies; 58+ messages in thread
From: Yi EungJun @ 2014-12-22 16:44 UTC (permalink / raw)
  To: git
  Cc: Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting,
	Junio C Hamano, Michael Blume

From: Yi EungJun <eungjun.yi@navercorp.com>

Add an Accept-Language header which indicates the user's preferred
languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.

Examples:
  LANGUAGE= -> ""
  LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1"
  LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1"
  LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1"

This gives git servers a chance to display remote error messages in
the user's preferred language.

Limit the number of languages to 1,000 because q-value must not be
smaller than 0.001, and limit the length of Accept-Language header to
4,000 bytes for some HTTP servers which cannot accept such long header.

Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
---
 http.c                     | 173 +++++++++++++++++++++++++++++++++++++++++++++
 remote-curl.c              |   2 +
 t/t5550-http-fetch-dumb.sh |  32 +++++++++
 3 files changed, 207 insertions(+)

diff --git a/http.c b/http.c
index 040f362..7a77708 100644
--- a/http.c
+++ b/http.c
@@ -68,6 +68,8 @@ static struct curl_slist *no_pragma_header;
 
 static struct active_request_slot *active_queue_head;
 
+static char *cached_accept_language;
+
 size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
 {
 	size_t size = eltsize * nmemb;
@@ -515,6 +517,11 @@ void http_cleanup(void)
 		cert_auth.password = NULL;
 	}
 	ssl_cert_password_required = 0;
+
+	if (cached_accept_language) {
+		free(cached_accept_language);
+		cached_accept_language = NULL;
+	}
 }
 
 struct active_request_slot *get_active_slot(void)
@@ -986,6 +993,166 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
 		strbuf_addstr(charset, "ISO-8859-1");
 }
 
+/*
+ * Guess the user's preferred languages from the value in LANGUAGE environment
+ * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined.
+ *
+ * The result can be a colon-separated list like "ko:ja:en".
+ */
+static const char *get_preferred_languages(void)
+{
+	const char *retval;
+
+	retval = getenv("LANGUAGE");
+	if (retval && *retval)
+		return retval;
+
+#ifndef NO_GETTEXT
+	retval = setlocale(LC_MESSAGES, NULL);
+	if (retval && *retval &&
+		strcmp(retval, "C") &&
+		strcmp(retval, "POSIX"))
+		return retval;
+#endif
+
+	return NULL;
+}
+
+static void write_accept_language(struct strbuf *buf)
+{
+	const char *lang_begin, *pos;
+	int q, max_q;
+	int num_langs;
+	int decimal_places;
+	const int CODESET_OR_MODIFIER = 1;
+	const int LANGUAGE_TAG = 2;
+	const int SEPARATOR = 3;
+	int is_q_factor_required = 0;
+	int parse_state = 0;
+	char q_format[32];
+	/*
+	 * MAX_LANGS must not be larger than 1,000. If it is larger than that,
+	 * q-value will be smaller than 0.001, the minimum q-value the HTTP
+	 * specification allows [1].
+	 *
+	 * [1]: http://tools.ietf.org/html/rfc7231#section-5.3.1
+	 */
+	const int MAX_LANGS = 1000;
+	const int MAX_SIZE_OF_HEADER = 4000;
+	const int MAX_SIZE_OF_ASTERISK_ELEMENT = 11; /* for ", *;q=0.001" */
+	int last_size = 0;
+
+	lang_begin = get_preferred_languages();
+
+	/* Don't add Accept-Language header if no language is preferred. */
+	if (!lang_begin)
+		return;
+
+	/* Count number of preferred lang_begin to decide precision of q-factor. */
+	for (num_langs = 1, pos = lang_begin; *pos; pos++)
+		if (*pos == ':')
+			num_langs++;
+
+	/* Decide the precision for q-factor on number of preferred lang_begin. */
+	num_langs += 1; /* for '*' */
+
+	if (MAX_LANGS < num_langs)
+		num_langs = MAX_LANGS;
+
+	for (max_q = 1, decimal_places = 0;
+		max_q < num_langs;
+		decimal_places++, max_q *= 10)
+		;
+
+	sprintf(q_format, ";q=0.%%0%dd", decimal_places);
+
+	q = max_q;
+
+	strbuf_addstr(buf, "Accept-Language: ");
+
+	/*
+	 * Convert a list of colon-separated locale values [1][2] to a list of
+	 * comma-separated language tags [3] which can be used as a value of
+	 * Accept-Language header.
+	 *
+	 * [1]: http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html
+	 * [2]: http://www.gnu.org/software/libc/manual/html_node/Using-gettextized-software.html
+	 * [3]: http://tools.ietf.org/html/rfc7231#section-5.3.5
+	 */
+	for (pos = lang_begin; ; pos++) {
+		if (!*pos || *pos == ':') {
+			if (is_q_factor_required) {
+				/* Put a q-factor only if it is less than 1.0. */
+				if (q < max_q)
+					strbuf_addf(buf, q_format, q);
+
+				if (q > 1)
+					q--;
+
+				last_size = buf->len;
+
+				is_q_factor_required = 0;
+			}
+			parse_state = SEPARATOR;
+		} else if (parse_state == CODESET_OR_MODIFIER)
+			continue;
+		else if (*pos == ' ') /* Ignore whitespace character */
+			continue;
+		else if (*pos == '.' || *pos == '@') /* Remove .codeset and @modifier. */
+			parse_state = CODESET_OR_MODIFIER;
+		else {
+			if (parse_state != LANGUAGE_TAG && q < max_q)
+				strbuf_addstr(buf, ", ");
+			strbuf_addch(buf, *pos == '_' ? '-' : *pos);
+			is_q_factor_required = 1;
+			parse_state = LANGUAGE_TAG;
+		}
+
+		if (buf->len > MAX_SIZE_OF_HEADER - MAX_SIZE_OF_ASTERISK_ELEMENT) {
+			strbuf_remove(buf, last_size, buf->len - last_size);
+			break;
+		}
+
+		if (!*pos)
+			break;
+	}
+
+	/* Don't add Accept-Language header if no language is preferred. */
+	if (q >= max_q) {
+		strbuf_reset(buf);
+		return;
+	}
+
+	/* Add '*' with minimum q-factor greater than 0.0. */
+	strbuf_addstr(buf, ", *");
+	strbuf_addf(buf, q_format, 1);
+}
+
+/*
+ * Get an Accept-Language header which indicates user's preferred languages.
+ *
+ * This function always return non-NULL string as strbuf_detach() does.
+ *
+ * Examples:
+ *   LANGUAGE= -> ""
+ *   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
+ *   LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1"
+ *   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
+ *   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
+ *   LANGUAGE= LANG=C -> ""
+ */
+static const char *get_accept_language(void)
+{
+	if (!cached_accept_language) {
+		struct strbuf buf = STRBUF_INIT;
+		write_accept_language(&buf);
+		cached_accept_language = strbuf_detach(&buf, NULL);
+		strbuf_release(&buf);
+	}
+
+	return cached_accept_language;
+}
+
 /* http_request() targets */
 #define HTTP_REQUEST_STRBUF	0
 #define HTTP_REQUEST_FILE	1
@@ -998,6 +1165,7 @@ static int http_request(const char *url,
 	struct slot_results results;
 	struct curl_slist *headers = NULL;
 	struct strbuf buf = STRBUF_INIT;
+	const char *accept_language;
 	int ret;
 
 	slot = get_active_slot();
@@ -1023,6 +1191,11 @@ static int http_request(const char *url,
 					 fwrite_buffer);
 	}
 
+	accept_language = get_accept_language();
+
+	if (strlen(accept_language) > 0)
+		headers = curl_slist_append(headers, accept_language);
+
 	strbuf_addstr(&buf, "Pragma:");
 	if (options && options->no_cache)
 		strbuf_addstr(&buf, " no-cache");
diff --git a/remote-curl.c b/remote-curl.c
index dd63bc2..04989e5 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -962,6 +962,8 @@ int main(int argc, const char **argv)
 	struct strbuf buf = STRBUF_INIT;
 	int nongit;
 
+	git_setup_gettext();
+
 	git_extract_argv0_path(argv[0]);
 	setup_git_directory_gently(&nongit);
 	if (argc < 2) {
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index ac71418..1a58b97 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -196,5 +196,37 @@ test_expect_success 'reencoding is robust to whitespace oddities' '
 	grep "this is the error message" stderr
 '
 
+check_language () {
+	echo "Accept-Language: $1" >expect &&
+	GIT_CURL_VERBOSE=1 \
+	LANGUAGE=$2 \
+	git ls-remote "$HTTPD_URL/dumb/repo.git" 2>&1 |
+	tr -d '\015' |
+	sort -u >stderr &&
+	grep -i ^Accept-Language: stderr >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'git client sends Accept-Language based on LANGUAGE' '
+	check_language "ko-KR, *;q=0.1" ko_KR.UTF-8'
+
+test_expect_success 'git client sends Accept-Language correctly with unordinary LANGUAGE' '
+	check_language "ko-KR, *;q=0.1" "ko_KR:" &&
+	check_language "ko-KR, *;q=0.1" " ko_KR" &&
+	check_language "ko-KR, en-US;q=0.9, *;q=0.1" "ko_KR: en_US" &&
+	check_language "ko-KR, en-US;q=0.9, *;q=0.1" "ko_KR::en_US" &&
+	check_language "ko-KR, *;q=0.1" ":::ko_KR"'
+
+test_expect_success 'git client sends Accept-Language with many preferred languages' '
+	check_language "ko-KR, en-US;q=0.99, fr-CA;q=0.98, de;q=0.97, sr;q=0.96, \
+ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, *;q=0.01" \
+		ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb
+'
+
+test_expect_success 'git client does not send an empty Accept-Language' '
+	GIT_CURL_VERBOSE=1 LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git" 2>stderr &&
+	! grep "^Accept-Language:" stderr
+'
+
 stop_httpd
 test_done
-- 
2.2.0.375.gcd18ce6.dirty

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH v6 1/1] http: Add Accept-Language header if possible
  2014-12-22 16:44               ` [PATCH v6 1/1] " Yi EungJun
@ 2014-12-22 19:34                 ` Junio C Hamano
  2014-12-24 20:35                 ` Eric Sunshine
  1 sibling, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2014-12-22 19:34 UTC (permalink / raw)
  To: Yi EungJun
  Cc: git, Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting, Michael Blume

Yi EungJun <semtlenori@gmail.com> writes:

> From: Yi EungJun <eungjun.yi@navercorp.com>
>
> Add an Accept-Language header which indicates the user's preferred
> languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.
>
> Examples:
>   LANGUAGE= -> ""
>   LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1"
>   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1"
>   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1"
>
> This gives git servers a chance to display remote error messages in
> the user's preferred language.
>
> Limit the number of languages to 1,000 because q-value must not be
> smaller than 0.001, and limit the length of Accept-Language header to
> 4,000 bytes for some HTTP servers which cannot accept such long header.
>
> Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>

Overall, this one is a much more pleasant read than the previous
rounds.

> @@ -986,6 +993,166 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
>  		strbuf_addstr(charset, "ISO-8859-1");
>  }
>  
> +/*
> + * Guess the user's preferred languages from the value in LANGUAGE environment
> + * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined.
> + *
> + * The result can be a colon-separated list like "ko:ja:en".
> + */
> +static const char *get_preferred_languages(void)
> +{
> +	const char *retval;
> +
> +	retval = getenv("LANGUAGE");
> +	if (retval && *retval)
> +		return retval;
> +
> +#ifndef NO_GETTEXT
> +	retval = setlocale(LC_MESSAGES, NULL);
> +	if (retval && *retval &&
> +		strcmp(retval, "C") &&
> +		strcmp(retval, "POSIX"))
> +		return retval;
> +#endif

A tangent.

I wonder if we want to have something silly like this:

	#ifndef NO_GETTEXT
        #define setlocale(x, y) NULL /* or "C"??? */
        #endif

in a common header (e.g. gettext.h) to avoid sprinkling #ifdefs in
our code.  While I do not think we call setlocale() that often to
warrant such a trick, we already do something very similar to make
git_setup_gettext() a no-op in NO_GETTEXT builds in that header
file, and the change in this patch to remote-curl.c does take
advantage of it already, so...

> +static void write_accept_language(struct strbuf *buf)
> +{
> +	const char *lang_begin, *pos;
> +	int q, max_q;
> +	int num_langs;
> +	int decimal_places;
> +	const int CODESET_OR_MODIFIER = 1;
> +	const int LANGUAGE_TAG = 2;
> +	const int SEPARATOR = 3;

Another tangent, but I think we tend to use either #define or enum
for constants, not "const int", in our codebase, for symbolic
constants.  In order to define a set of symbolic constants limited
to a function scope, the "const int" way may be nicer than the other
two methods we have traditionally used.  Perhaps we should promote
such use more widely, write our new code following this example, and
migrate existing ones over time?  I dunno.

> ...
> +	/* Decide the precision for q-factor on number of preferred lang_begin. */
> +	num_langs += 1; /* for '*' */
> +
> +	if (MAX_LANGS < num_langs)
> +		num_langs = MAX_LANGS;
> +
> +	for (max_q = 1, decimal_places = 0;
> +		max_q < num_langs;
> +		decimal_places++, max_q *= 10)
> +		;

So, if we have 10 languages (num_langs == 10), decimal_places
becomes 1, max_q becomes 10 ...

> +	sprintf(q_format, ";q=0.%%0%dd", decimal_places);

... and we will use "q=0.%01d" as the format.  This is OK because
the first one is given without q= so we use the format only nine
times, counting down from 0.9 to 0.1 in 0.1 increments.

Sounds OK to me (I always miscount this kind of loop and need to
think aloud while doing a sanity check).

> +	for (pos = lang_begin; ; pos++) {
> +		if (!*pos || *pos == ':') {
> +			if (is_q_factor_required) {
> +				/* Put a q-factor only if it is less than 1.0. */
> +				if (q < max_q)
> +					strbuf_addf(buf, q_format, q);
> +
> +				if (q > 1)
> +					q--;

When does this "if" statement not trigger?  It seems to me that it
will stop decrementing only if you have very many languages (e.g.
num_langs was clipped to MAX_LANGS), and at that point you would not
want to scan and add more languages---is there a reason why you keep
going in such a case and not break out of the loop, i.e.

	if (q-- < 1)
		break;

or something like that?

> + ...
> +		if (buf->len > MAX_SIZE_OF_HEADER - MAX_SIZE_OF_ASTERISK_ELEMENT) {
> +			strbuf_remove(buf, last_size, buf->len - last_size);
> +			break;
> +		}
> +
> +		if (!*pos)
> +			break;

Alternatively use one of these breaks when q goes below 1, perhaps?

> +/*
> + * Get an Accept-Language header which indicates user's preferred languages.
> + *
> + * This function always return non-NULL string as strbuf_detach() does.
> + *
> + * Examples:
> + *   LANGUAGE= -> ""
> + *   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
> + *   LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1"
> + *   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
> + *   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
> + *   LANGUAGE= LANG=C -> ""
> + */
> +static const char *get_accept_language(void)
> +{
> +	if (!cached_accept_language) {
> +		struct strbuf buf = STRBUF_INIT;
> +		write_accept_language(&buf);
> +		cached_accept_language = strbuf_detach(&buf, NULL);
> +		strbuf_release(&buf);

If you detached the associated string from the strbuf, you have
already released the resource from it; no need to release it, I
would think.

> diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
> index ac71418..1a58b97 100755
> --- a/t/t5550-http-fetch-dumb.sh
> +++ b/t/t5550-http-fetch-dumb.sh
> @@ -196,5 +196,37 @@ test_expect_success 'reencoding is robust to whitespace oddities' '
>  	grep "this is the error message" stderr
>  '
>  
> +check_language () {
> +	echo "Accept-Language: $1" >expect &&
> +	GIT_CURL_VERBOSE=1 \
> +	LANGUAGE=$2 \
> +	git ls-remote "$HTTPD_URL/dumb/repo.git" 2>&1 |
> +	tr -d '\015' |
> +	sort -u >stderr &&
> +	grep -i ^Accept-Language: stderr >actual &&
> +	test_cmp expect actual
> +}

This makes it hard to test a case where no Accept-Language: header
should be issued in the request, because at that point we would be
expecting no matching string in the output.

	case "$2" in
        '')
        	>expect
                ;;
	?*)
        	echo "Accept-Language: $1" >expect
                ;;
	esac &&
	git ls-remote "$HTTPD_URL/dumb/repo.git" >output 2>&1 &&
	tr -d '\015' <output |
        sort -u |
        sed -ne '/^Accept-Language:/' >actual &&
        test_cmp expect actual

or something like that, perhaps?

And I can see below that we are not testing that "negative" case.

After writing a new shiny feature, it always is tempting to show off
that it triggers when it is expected to and gives an expected
result, but it is equally important to have tests that make sure
that the feature does not trigger when it should not.


> +test_expect_success 'git client sends Accept-Language based on LANGUAGE' '
> +	check_language "ko-KR, *;q=0.1" ko_KR.UTF-8'
> +
> +test_expect_success 'git client sends Accept-Language correctly with unordinary LANGUAGE' '
> +	check_language "ko-KR, *;q=0.1" "ko_KR:" &&
> +	check_language "ko-KR, *;q=0.1" " ko_KR" &&
> +	check_language "ko-KR, en-US;q=0.9, *;q=0.1" "ko_KR: en_US" &&
> +	check_language "ko-KR, en-US;q=0.9, *;q=0.1" "ko_KR::en_US" &&
> +	check_language "ko-KR, *;q=0.1" ":::ko_KR"'
> +
> +test_expect_success 'git client sends Accept-Language with many preferred languages' '
> +	check_language "ko-KR, en-US;q=0.99, fr-CA;q=0.98, de;q=0.97, sr;q=0.96, \
> +ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, *;q=0.01" \
> +		ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb
> +'

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v6 1/1] http: Add Accept-Language header if possible
  2014-12-22 16:44               ` [PATCH v6 1/1] " Yi EungJun
  2014-12-22 19:34                 ` Junio C Hamano
@ 2014-12-24 20:35                 ` Eric Sunshine
  2014-12-29 16:18                   ` Junio C Hamano
  1 sibling, 1 reply; 58+ messages in thread
From: Eric Sunshine @ 2014-12-24 20:35 UTC (permalink / raw)
  To: Yi EungJun
  Cc: Git List, Yi EungJun, Jeff King, Peter Krefting, Junio C Hamano,
	Michael Blume

On Mon, Dec 22, 2014 at 11:44 AM, Yi EungJun <semtlenori@gmail.com> wrote:
> From: Yi EungJun <eungjun.yi@navercorp.com>
>
> Add an Accept-Language header which indicates the user's preferred
> languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.
>
> Examples:
>   LANGUAGE= -> ""
>   LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1"
>   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1"
>   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1"
>
> This gives git servers a chance to display remote error messages in
> the user's preferred language.
>
> Limit the number of languages to 1,000 because q-value must not be
> smaller than 0.001, and limit the length of Accept-Language header to
> 4,000 bytes for some HTTP servers which cannot accept such long header.

Just a few comments and observations below. Alone, they are not
necessarily worth a re-roll, but if you happen to re-roll for some
other reason, perhaps take them into consideration.

> Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
> ---
> diff --git a/http.c b/http.c
> index 040f362..7a77708 100644
> --- a/http.c
> +++ b/http.c
> @@ -986,6 +993,166 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
>                 strbuf_addstr(charset, "ISO-8859-1");
>  }
>
> +static void write_accept_language(struct strbuf *buf)
> +{
> + [...]
> +       /*
> +        * MAX_LANGS must not be larger than 1,000. If it is larger than that,
> +        * q-value will be smaller than 0.001, the minimum q-value the HTTP
> +        * specification allows [1].
> +        *
> +        * [1]: http://tools.ietf.org/html/rfc7231#section-5.3.1
> +        */
> +       const int MAX_LANGS = 1000;
> +       const int MAX_SIZE_OF_HEADER = 4000;
> +       const int MAX_SIZE_OF_ASTERISK_ELEMENT = 11; /* for ", *;q=0.001" */

These two MAX_SIZE_* constants are never used individually, but rather
only as (MAX_SIZE_OF_HEADER - MAX_SIZE_OF_ASTERISK_ELEMENT). It might
be a bit more readable to compute the final value here, with a
suitable comment, rather than at point-of-use. Perhaps something like:

    /* limit of some HTTP servers is 4000 - strlen(", *;q=0.001") */
    const int MAX_HEADER_SIZE = 4000 - 11;

More below.

> + [...]
> +       /*
> +        * Convert a list of colon-separated locale values [1][2] to a list of
> +        * comma-separated language tags [3] which can be used as a value of
> +        * Accept-Language header.
> + [...]
> +        */
> +       for (pos = lang_begin; ; pos++) {
> +               if (!*pos || *pos == ':') {
> +                       if (is_q_factor_required) {
> +                               /* Put a q-factor only if it is less than 1.0. */
> +                               if (q < max_q)
> +                                       strbuf_addf(buf, q_format, q);
> +
> +                               if (q > 1)
> +                                       q--;
> +
> +                               last_size = buf->len;
> +
> +                               is_q_factor_required = 0;
> +                       }
> +                       parse_state = SEPARATOR;
> +               } else if (parse_state == CODESET_OR_MODIFIER)
> +                       continue;
> +               else if (*pos == ' ') /* Ignore whitespace character */
> +                       continue;
> +               else if (*pos == '.' || *pos == '@') /* Remove .codeset and @modifier. */
> +                       parse_state = CODESET_OR_MODIFIER;
> +               else {
> +                       if (parse_state != LANGUAGE_TAG && q < max_q)
> +                               strbuf_addstr(buf, ", ");
> +                       strbuf_addch(buf, *pos == '_' ? '-' : *pos);
> +                       is_q_factor_required = 1;
> +                       parse_state = LANGUAGE_TAG;
> +               }
> +
> +               if (buf->len > MAX_SIZE_OF_HEADER - MAX_SIZE_OF_ASTERISK_ELEMENT) {
> +                       strbuf_remove(buf, last_size, buf->len - last_size);
> +                       break;
> +               }
> +
> +               if (!*pos)
> +                       break;
> +       }

Although often suitable when parsing complex inputs, state machines
demand high cognitive load. The input you're parsing, on the other
hand, is straightforward and can easily be processed with a simple
sequential parser, which is easier to reason about and review for
correctness. For instance, something like this:

    while (*s) {
        /* collect language tag */
        for (; *s && *s != '.' && *s != '@' && *s != ':'; s++)
            strbuf_addch(buf, *s == '_' ? '-' : *s);

        /* skip .codeset and @modifier */
        while (*s && *s != ':')
            s++;

        strbuf_addf(buf, q_format, q);
        ... other bookkeeping ...

        if (*s == ':')
            s++;
    }

This example is intentionally simplified but illustrates the general
idea. It lacks comma insertion (left as an exercise for the reader)
and empty language tag handling (":en", "en::ko"); and doesn't take
whitespace into consideration since it wasn't clear why your v6 parser
pays attention to embedded spaces, whereas your earlier versions did
not.

> +       /* Don't add Accept-Language header if no language is preferred. */
> +       if (q >= max_q) {
> +               strbuf_reset(buf);
> +               return;
> +       }
> +
> +       /* Add '*' with minimum q-factor greater than 0.0. */
> +       strbuf_addstr(buf, ", *");
> +       strbuf_addf(buf, q_format, 1);
> +}
> +
> +/*
> + * Get an Accept-Language header which indicates user's preferred languages.
> + *
> + * This function always return non-NULL string as strbuf_detach() does.

A couple comments:

It's not necessary to explain the public API in terms of an
implementation detail. Callers of this function don't care and don't
need to know that the value was constructed via strbuf, nor that it is
somehow dependent upon the behavior of the underlying implementation
of strbuf_detach().

This is a somewhat unusual contract. It's much more common and
idiomatic in C to return NULL as an indication of "no preference" (or
"failure") than to return an empty string.

> + *
> + * Examples:
> + *   LANGUAGE= -> ""
> + *   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
> + *   LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1"
> + *   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
> + *   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
> + *   LANGUAGE= LANG=C -> ""
> + */
> +static const char *get_accept_language(void)
> +{
> +       if (!cached_accept_language) {
> +               struct strbuf buf = STRBUF_INIT;
> +               write_accept_language(&buf);
> +               cached_accept_language = strbuf_detach(&buf, NULL);
> +               strbuf_release(&buf);

Junio already mentioned that strbuf_release() is unnecessary following
strbuf_detach().

> +       }
> +
> +       return cached_accept_language;
> +}
> +
>  /* http_request() targets */
>  #define HTTP_REQUEST_STRBUF    0
>  #define HTTP_REQUEST_FILE      1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v6 1/1] http: Add Accept-Language header if possible
  2014-12-24 20:35                 ` Eric Sunshine
@ 2014-12-29 16:18                   ` Junio C Hamano
  2015-01-18 12:23                     ` [PATCH v7 0/1] " Yi EungJun
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2014-12-29 16:18 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Yi EungJun, Git List, Yi EungJun, Jeff King, Peter Krefting,
	Michael Blume

Eric Sunshine <sunshine@sunshineco.com> writes:

> Just a few comments and observations below. Alone, they are not
> necessarily worth a re-roll, but if you happen to re-roll for some
> other reason, perhaps take them into consideration.

I actually think everything you said in this review makes sense and
will make the resulting code a lot better (especially the part on
the parsing loop).

Thanks, as usual, for a careful reading.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH v7 0/1] http: Add Accept-Language header if possible
  2014-12-29 16:18                   ` Junio C Hamano
@ 2015-01-18 12:23                     ` Yi EungJun
  2015-01-18 12:26                       ` [PATCH v7 1/1] " Yi EungJun
  0 siblings, 1 reply; 58+ messages in thread
From: Yi EungJun @ 2015-01-18 12:23 UTC (permalink / raw)
  To: Git List
  Cc: Yi EungJun, Junio C Hamano, Eric Sunshine, Jeff King,
	Peter Krefting, Michael Blume

From: Yi EungJun <eungjun.yi@navercorp.com>

Changes since v6

>From Junio C Hamano's review:

* Fix check_language() in t5550-http-fetch-dumb.sh as his suggestion.

>From Eric Sunshine's review:

* Rewrite the parser without state.

Yi EungJun (1):
  http: Add Accept-Language header if possible

 http.c                     | 152 +++++++++++++++++++++++++++++++++++++++++++++
 remote-curl.c              |   2 +
 t/t5550-http-fetch-dumb.sh |  42 +++++++++++++
 3 files changed, 196 insertions(+)

-- 
2.2.0.44.g37b3e56.dirty

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH v7 1/1] http: Add Accept-Language header if possible
  2015-01-18 12:23                     ` [PATCH v7 0/1] " Yi EungJun
@ 2015-01-18 12:26                       ` Yi EungJun
  2015-01-18 15:14                         ` Torsten Bögershausen
                                           ` (2 more replies)
  0 siblings, 3 replies; 58+ messages in thread
From: Yi EungJun @ 2015-01-18 12:26 UTC (permalink / raw)
  To: Git List
  Cc: Yi EungJun, Junio C Hamano, Eric Sunshine, Jeff King,
	Peter Krefting, Michael Blume

From: Yi EungJun <eungjun.yi@navercorp.com>

Add an Accept-Language header which indicates the user's preferred
languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.

Examples:
  LANGUAGE= -> ""
  LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1"
  LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1"
  LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1"

This gives git servers a chance to display remote error messages in
the user's preferred language.

Limit the number of languages to 1,000 because q-value must not be
smaller than 0.001, and limit the length of Accept-Language header to
4,000 bytes for some HTTP servers which cannot accept such long header.

Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
---
 http.c                     | 152 +++++++++++++++++++++++++++++++++++++++++++++
 remote-curl.c              |   2 +
 t/t5550-http-fetch-dumb.sh |  42 +++++++++++++
 3 files changed, 196 insertions(+)

diff --git a/http.c b/http.c
index 040f362..349b033 100644
--- a/http.c
+++ b/http.c
@@ -68,6 +68,8 @@ static struct curl_slist *no_pragma_header;
 
 static struct active_request_slot *active_queue_head;
 
+static char *cached_accept_language;
+
 size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
 {
 	size_t size = eltsize * nmemb;
@@ -515,6 +517,11 @@ void http_cleanup(void)
 		cert_auth.password = NULL;
 	}
 	ssl_cert_password_required = 0;
+
+	if (cached_accept_language) {
+		free(cached_accept_language);
+		cached_accept_language = NULL;
+	}
 }
 
 struct active_request_slot *get_active_slot(void)
@@ -986,6 +993,145 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
 		strbuf_addstr(charset, "ISO-8859-1");
 }
 
+/*
+ * Guess the user's preferred languages from the value in LANGUAGE environment
+ * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined.
+ *
+ * The result can be a colon-separated list like "ko:ja:en".
+ */
+static const char *get_preferred_languages(void)
+{
+	const char *retval;
+
+	retval = getenv("LANGUAGE");
+	if (retval && *retval)
+		return retval;
+
+#ifndef NO_GETTEXT
+	retval = setlocale(LC_MESSAGES, NULL);
+	if (retval && *retval &&
+		strcmp(retval, "C") &&
+		strcmp(retval, "POSIX"))
+		return retval;
+#endif
+
+	return NULL;
+}
+
+static void write_accept_language(struct strbuf *buf)
+{
+	/*
+	 * MAX_DECIMAL_PLACES must not be larger than 3. If it is larger than
+	 * that, q-value will be smaller than 0.001, the minimum q-value the
+	 * HTTP specification allows. See
+	 * http://tools.ietf.org/html/rfc7231#section-5.3.1 for q-value.
+	 */
+	const int MAX_DECIMAL_PLACES = 3;
+	const int MAX_LANGUAGE_TAGS = 1000;
+	const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000;
+	struct strbuf *language_tags = NULL;
+	int num_langs;
+	const char *s = get_preferred_languages();
+
+	/* Don't add Accept-Language header if no language is preferred. */
+	if (!s)
+		return;
+
+	/*
+	 * Split the colon-separated string of preferred languages into
+	 * language_tags array.
+	 */
+	do {
+		/* increase language_tags array to add new language tag */
+		REALLOC_ARRAY(language_tags, num_langs + 1);
+		strbuf_init(&language_tags[num_langs], 0);
+
+		/* collect language tag */
+		for (; *s && (isalnum(*s) || *s == '_'); s++)
+			strbuf_addch(&language_tags[num_langs], *s == '_' ? '-' : *s);
+
+		/* skip .codeset, @modifier and any other unnecessary parts */
+		while (*s && *s != ':')
+			s++;
+
+		if (language_tags[num_langs].len > 0) {
+			num_langs++;
+			if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */
+				break;
+		}
+	} while (*s++);
+
+	/* write Accept-Language header into buf */
+	if (num_langs >= 1) {
+		int i;
+		int last_buf_len;
+		int max_q;
+		int decimal_places;
+		char q_format[32];
+
+		/* add '*' */
+		REALLOC_ARRAY(language_tags, num_langs + 1);
+		strbuf_init(&language_tags[num_langs], 0);
+		strbuf_addstr(&language_tags[num_langs++], "*");
+
+		/* compute decimal_places */
+		for (max_q = 1, decimal_places = 0;
+				max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES;
+				decimal_places++, max_q *= 10)
+			;
+
+		sprintf(q_format, ";q=0.%%0%dd", decimal_places);
+
+		strbuf_addstr(buf, "Accept-Language: ");
+
+		for(i = 0; i < num_langs; i++) {
+			if (language_tags[i].len == 0)
+				continue;
+
+			if (i > 0)
+				strbuf_addstr(buf, ", ");
+
+			strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL));
+
+			if (i > 0)
+				strbuf_addf(buf, q_format, max_q - i);
+
+			if (buf->len > MAX_ACCEPT_LANGUAGE_HEADER_SIZE) {
+				strbuf_remove(buf, last_buf_len, buf->len - last_buf_len);
+				break;
+			}
+
+			last_buf_len = buf->len;
+		}
+	}
+
+	free(language_tags);
+}
+
+/*
+ * Get an Accept-Language header which indicates user's preferred languages.
+ *
+ * This function always return non-NULL string as strbuf_detach() does.
+ *
+ * Examples:
+ *   LANGUAGE= -> ""
+ *   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
+ *   LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1"
+ *   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
+ *   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
+ *   LANGUAGE= LANG=C -> ""
+ */
+static const char *get_accept_language(void)
+{
+	if (!cached_accept_language) {
+		struct strbuf buf = STRBUF_INIT;
+		write_accept_language(&buf);
+		cached_accept_language = strbuf_detach(&buf, NULL);
+	}
+
+	return cached_accept_language;
+}
+
 /* http_request() targets */
 #define HTTP_REQUEST_STRBUF	0
 #define HTTP_REQUEST_FILE	1
@@ -998,6 +1144,7 @@ static int http_request(const char *url,
 	struct slot_results results;
 	struct curl_slist *headers = NULL;
 	struct strbuf buf = STRBUF_INIT;
+	const char *accept_language;
 	int ret;
 
 	slot = get_active_slot();
@@ -1023,6 +1170,11 @@ static int http_request(const char *url,
 					 fwrite_buffer);
 	}
 
+	accept_language = get_accept_language();
+
+	if (strlen(accept_language) > 0)
+		headers = curl_slist_append(headers, accept_language);
+
 	strbuf_addstr(&buf, "Pragma:");
 	if (options && options->no_cache)
 		strbuf_addstr(&buf, " no-cache");
diff --git a/remote-curl.c b/remote-curl.c
index dd63bc2..04989e5 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -962,6 +962,8 @@ int main(int argc, const char **argv)
 	struct strbuf buf = STRBUF_INIT;
 	int nongit;
 
+	git_setup_gettext();
+
 	git_extract_argv0_path(argv[0]);
 	setup_git_directory_gently(&nongit);
 	if (argc < 2) {
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index ac71418..e1e2938 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -196,5 +196,47 @@ test_expect_success 'reencoding is robust to whitespace oddities' '
 	grep "this is the error message" stderr
 '
 
+check_language () {
+	case "$2" in
+	'')
+		>expect
+		;;
+	?*)
+		echo "Accept-Language: $1" >expect
+		;;
+	esac &&
+	GIT_CURL_VERBOSE=1 \
+	LANGUAGE=$2 \
+	git ls-remote "$HTTPD_URL/dumb/repo.git" >output 2>&1 &&
+	tr -d '\015' <output |
+	sort -u |
+	sed -ne '/^Accept-Language:/ p' >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'git client sends Accept-Language based on LANGUAGE' '
+	check_language "ko-KR, *;q=0.9" ko_KR.UTF-8'
+
+test_expect_success 'git client sends Accept-Language correctly with unordinary LANGUAGE' '
+	check_language "ko-KR, *;q=0.9" "ko_KR:" &&
+	check_language "ko-KR, en-US;q=0.9, *;q=0.8" "ko_KR::en_US" &&
+	check_language "ko-KR, *;q=0.9" ":::ko_KR" &&
+	check_language "ko-KR, en-US;q=0.9, *;q=0.8" "ko_KR!!:en_US" &&
+	check_language "ko-KR, ja-JP;q=0.9, *;q=0.8" "ko_KR en_US:ja_JP"'
+
+test_expect_success 'git client sends Accept-Language with many preferred languages' '
+	check_language "ko-KR, en-US;q=0.9, fr-CA;q=0.8, de;q=0.7, sr;q=0.6, \
+ja;q=0.5, zh;q=0.4, sv;q=0.3, pt;q=0.2, *;q=0.1" \
+		ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt &&
+	check_language "ko-KR, en-US;q=0.99, fr-CA;q=0.98, de;q=0.97, sr;q=0.96, \
+ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, *;q=0.90" \
+		ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb
+'
+
+test_expect_success 'git client does not send an empty Accept-Language' '
+	GIT_CURL_VERBOSE=1 LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git" 2>stderr &&
+	! grep "^Accept-Language:" stderr
+'
+
 stop_httpd
 test_done
-- 
2.2.0.44.g37b3e56.dirty

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH v7 1/1] http: Add Accept-Language header if possible
  2015-01-18 12:26                       ` [PATCH v7 1/1] " Yi EungJun
@ 2015-01-18 15:14                         ` Torsten Bögershausen
  2015-01-19 20:21                         ` [PATCH v6 0/1] " Eric Sunshine
  2015-01-22  7:54                         ` [PATCH v7 1/1] " Junio C Hamano
  2 siblings, 0 replies; 58+ messages in thread
From: Torsten Bögershausen @ 2015-01-18 15:14 UTC (permalink / raw)
  To: Yi EungJun, Git List
  Cc: Yi EungJun, Junio C Hamano, Eric Sunshine, Jeff King,
	Peter Krefting, Michael Blume

On 18.01.15 13:26, Yi EungJun wrote:
> From: Yi EungJun <eungjun.yi@navercorp.com>

> diff --git a/http.c b/http.c
> index 040f362..349b033 100644
> --- a/http.c
> +++ b/http.c
> @@ -68,6 +68,8 @@ static struct curl_slist *no_pragma_header;
>  
>  static struct active_request_slot *active_queue_head;
>  
> +static char *cached_accept_language;
> +
>  size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
>  {
>  	size_t size = eltsize * nmemb;
> @@ -515,6 +517,11 @@ void http_cleanup(void)
>  		cert_auth.password = NULL;
>  	}
>  	ssl_cert_password_required = 0;
> +
> +	if (cached_accept_language) {
> +		free(cached_accept_language);
> +		cached_accept_language = NULL;
> +	}

Minor remark:
free(NULL) is legal and does nothing.
We can simplify the code somewhat:

  	ssl_cert_password_required = 0;
 
 	free(cached_accept_language);
 	cached_accept_language = NULL;




>  }
>  
>  struct active_request_slot *get_active_slot(void)
> @@ -986,6 +993,145 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
>  		strbuf_addstr(charset, "ISO-8859-1");
>  }
>  
> +/*
> + * Guess the user's preferred languages from the value in LANGUAGE environment
> + * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined.
> + *
> + * The result can be a colon-separated list like "ko:ja:en".
> + */
> +static const char *get_preferred_languages(void)
> +{
> +	const char *retval;
> +
> +	retval = getenv("LANGUAGE");
> +	if (retval && *retval)
> +		return retval;
> +
> +#ifndef NO_GETTEXT
> +	retval = setlocale(LC_MESSAGES, NULL);
> +	if (retval && *retval &&
> +		strcmp(retval, "C") &&
> +		strcmp(retval, "POSIX"))
> +		return retval;
> +#endif
> +
> +	return NULL;
> +}
> +
> +static void write_accept_language(struct strbuf *buf)
> +{
> +	/*
> +	 * MAX_DECIMAL_PLACES must not be larger than 3. If it is larger than
> +	 * that, q-value will be smaller than 0.001, the minimum q-value the
> +	 * HTTP specification allows. See
> +	 * http://tools.ietf.org/html/rfc7231#section-5.3.1 for q-value.
> +	 */
> +	const int MAX_DECIMAL_PLACES = 3;
> +	const int MAX_LANGUAGE_TAGS = 1000;
> +	const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000;
> +	struct strbuf *language_tags = NULL;
> +	int num_langs;
> +	const char *s = get_preferred_languages();
> +
> +	/* Don't add Accept-Language header if no language is preferred. */
> +	if (!s)
> +		return;
> +
> +	/*
> +	 * Split the colon-separated string of preferred languages into
> +	 * language_tags array.
> +	 */
> +	do {
> +		/* increase language_tags array to add new language tag */
> +		REALLOC_ARRAY(language_tags, num_langs + 1);
> +		strbuf_init(&language_tags[num_langs], 0);
> +
> +		/* collect language tag */
> +		for (; *s && (isalnum(*s) || *s == '_'); s++)
> +			strbuf_addch(&language_tags[num_langs], *s == '_' ? '-' : *s);
> +
> +		/* skip .codeset, @modifier and any other unnecessary parts */
> +		while (*s && *s != ':')
> +			s++;
> +
> +		if (language_tags[num_langs].len > 0) {
> +			num_langs++;
> +			if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */
> +				break;
> +		}
> +	} while (*s++);
> +
> +	/* write Accept-Language header into buf */
> +	if (num_langs >= 1) {
> +		int i;
> +		int last_buf_len;
> +		int max_q;
> +		int decimal_places;
> +		char q_format[32];
> +
> +		/* add '*' */
> +		REALLOC_ARRAY(language_tags, num_langs + 1);
> +		strbuf_init(&language_tags[num_langs], 0);
> +		strbuf_addstr(&language_tags[num_langs++], "*");
> +
> +		/* compute decimal_places */
> +		for (max_q = 1, decimal_places = 0;
> +				max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES;
> +				decimal_places++, max_q *= 10)
> +			;
> +
> +		sprintf(q_format, ";q=0.%%0%dd", decimal_places);
> +
> +		strbuf_addstr(buf, "Accept-Language: ");
> +
> +		for(i = 0; i < num_langs; i++) {
> +			if (language_tags[i].len == 0)
> +				continue;
> +
> +			if (i > 0)
> +				strbuf_addstr(buf, ", ");
> +
> +			strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL));
> +
> +			if (i > 0)
> +				strbuf_addf(buf, q_format, max_q - i);
> +
> +			if (buf->len > MAX_ACCEPT_LANGUAGE_HEADER_SIZE) {
> +				strbuf_remove(buf, last_buf_len, buf->len - last_buf_len);
> +				break;
> +			}
> +
> +			last_buf_len = buf->len;
> +		}
> +	}
> +
> +	free(language_tags);
> +}
> +
> +/*
> + * Get an Accept-Language header which indicates user's preferred languages.
> + *
> + * This function always return non-NULL string as strbuf_detach() does.
> + *
> + * Examples:
> + *   LANGUAGE= -> ""
> + *   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
> + *   LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1"
> + *   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
> + *   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
> + *   LANGUAGE= LANG=C -> ""
> + */
> +static const char *get_accept_language(void)
> +{
> +	if (!cached_accept_language) {
> +		struct strbuf buf = STRBUF_INIT;
> +		write_accept_language(&buf);
> +		cached_accept_language = strbuf_detach(&buf, NULL);
> +	}
> +
> +	return cached_accept_language;
> +}
> +
>  /* http_request() targets */
>  #define HTTP_REQUEST_STRBUF	0
>  #define HTTP_REQUEST_FILE	1
> @@ -998,6 +1144,7 @@ static int http_request(const char *url,
>  	struct slot_results results;
>  	struct curl_slist *headers = NULL;
>  	struct strbuf buf = STRBUF_INIT;
> +	const char *accept_language;
>  	int ret;
>  
>  	slot = get_active_slot();
> @@ -1023,6 +1170,11 @@ static int http_request(const char *url,
>  					 fwrite_buffer);
>  	}
>  
> +	accept_language = get_accept_language();
> +
> +	if (strlen(accept_language) > 0)
> +		headers = curl_slist_append(headers, accept_language);
> +
>  	strbuf_addstr(&buf, "Pragma:");
>  	if (options && options->no_cache)
>  		strbuf_addstr(&buf, " no-cache");
> diff --git a/remote-curl.c b/remote-curl.c
> index dd63bc2..04989e5 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -962,6 +962,8 @@ int main(int argc, const char **argv)
>  	struct strbuf buf = STRBUF_INIT;
>  	int nongit;
>  
> +	git_setup_gettext();
> +
>  	git_extract_argv0_path(argv[0]);
>  	setup_git_directory_gently(&nongit);
>  	if (argc < 2) {
> diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
> index ac71418..e1e2938 100755
> --- a/t/t5550-http-fetch-dumb.sh
> +++ b/t/t5550-http-fetch-dumb.sh
> @@ -196,5 +196,47 @@ test_expect_success 'reencoding is robust to whitespace oddities' '
>  	grep "this is the error message" stderr
>  '
>  
> +check_language () {
> +	case "$2" in
> +	'')
> +		>expect
> +		;;
> +	?*)
> +		echo "Accept-Language: $1" >expect
> +		;;
> +	esac &&
> +	GIT_CURL_VERBOSE=1 \
> +	LANGUAGE=$2 \
> +	git ls-remote "$HTTPD_URL/dumb/repo.git" >output 2>&1 &&
> +	tr -d '\015' <output |
> +	sort -u |
> +	sed -ne '/^Accept-Language:/ p' >actual &&
> +	test_cmp expect actual
> +}
> +
> +test_expect_success 'git client sends Accept-Language based on LANGUAGE' '
> +	check_language "ko-KR, *;q=0.9" ko_KR.UTF-8'
> +
> +test_expect_success 'git client sends Accept-Language correctly with unordinary LANGUAGE' '
> +	check_language "ko-KR, *;q=0.9" "ko_KR:" &&
> +	check_language "ko-KR, en-US;q=0.9, *;q=0.8" "ko_KR::en_US" &&
> +	check_language "ko-KR, *;q=0.9" ":::ko_KR" &&
> +	check_language "ko-KR, en-US;q=0.9, *;q=0.8" "ko_KR!!:en_US" &&
> +	check_language "ko-KR, ja-JP;q=0.9, *;q=0.8" "ko_KR en_US:ja_JP"'
> +
> +test_expect_success 'git client sends Accept-Language with many preferred languages' '
> +	check_language "ko-KR, en-US;q=0.9, fr-CA;q=0.8, de;q=0.7, sr;q=0.6, \
> +ja;q=0.5, zh;q=0.4, sv;q=0.3, pt;q=0.2, *;q=0.1" \
> +		ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt &&
> +	check_language "ko-KR, en-US;q=0.99, fr-CA;q=0.98, de;q=0.97, sr;q=0.96, \
> +ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, *;q=0.90" \
> +		ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb
> +'
> +
> +test_expect_success 'git client does not send an empty Accept-Language' '
> +	GIT_CURL_VERBOSE=1 LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git" 2>stderr &&
> +	! grep "^Accept-Language:" stderr
> +'
> +
>  stop_httpd
>  test_done
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v6 0/1] http: Add Accept-Language header if possible
  2015-01-18 12:26                       ` [PATCH v7 1/1] " Yi EungJun
  2015-01-18 15:14                         ` Torsten Bögershausen
@ 2015-01-19 20:21                         ` Eric Sunshine
  2015-01-22  7:54                         ` [PATCH v7 1/1] " Junio C Hamano
  2 siblings, 0 replies; 58+ messages in thread
From: Eric Sunshine @ 2015-01-19 20:21 UTC (permalink / raw)
  To: Yi EungJun
  Cc: Git List, Yi EungJun, Junio C Hamano, Jeff King, Peter Krefting,
	Michael Blume

On Sunday, January 18, 2015, Yi EungJun <semtlenori@gmail.com> wrote:
> Add an Accept-Language header which indicates the user's preferred
> languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.
>
> Examples:
>   LANGUAGE= -> ""
>   LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1"
>   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1"
>   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1"
>
> This gives git servers a chance to display remote error messages in
> the user's preferred language.
>
> Limit the number of languages to 1,000 because q-value must not be
> smaller than 0.001, and limit the length of Accept-Language header to
> 4,000 bytes for some HTTP servers which cannot accept such long header.
>
> Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
> ---
> diff --git a/http.c b/http.c
> index 040f362..349b033 100644
> --- a/http.c
> +++ b/http.c
> @@ -986,6 +993,145 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
>                 strbuf_addstr(charset, "ISO-8859-1");
>  }
>
> +static void write_accept_language(struct strbuf *buf)
> +{
> +       [...]
> +       const int MAX_DECIMAL_PLACES = 3;
> +       const int MAX_LANGUAGE_TAGS = 1000;
> +       const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000;
> +       struct strbuf *language_tags = NULL;
> +       int num_langs;

Mental note: 'num_langs' is not initialized.

> +       const char *s = get_preferred_languages();
> +
> +       /* Don't add Accept-Language header if no language is preferred. */
> +       if (!s)
> +               return;
> +
> +       /*
> +        * Split the colon-separated string of preferred languages into
> +        * language_tags array.
> +        */
> +       do {
> +               /* increase language_tags array to add new language tag */
> +               REALLOC_ARRAY(language_tags, num_langs + 1);

'num_langs' is used here uninitialized.

> +               strbuf_init(&language_tags[num_langs], 0);
> +
> +               /* collect language tag */
> +               for (; *s && (isalnum(*s) || *s == '_'); s++)
> +                       strbuf_addch(&language_tags[num_langs], *s == '_' ? '-' : *s);
> +
> +               /* skip .codeset, @modifier and any other unnecessary parts */
> +               while (*s && *s != ':')
> +                       s++;
> +
> +               if (language_tags[num_langs].len > 0) {

Mental note: An empty ("") language tag is never allowed in language_tags[].

 > +                       num_langs++;

This is a little bit ugly. At the top of the loop, you allocate space
in the array for a strbuf and initialize it. However, if the language
tag is empty (""), then 'num_langs' is never incremented, so the next
time through the loop, strbuf_init() is invoked on the same block of
memory (assuming the realloc was a no-op since the allocation size did
not change), overwriting whatever was there and possibly leaking
memory. In this particular case, by examining the parser code closely,
we can see that nothing was added to the strbuf, so nothing is being
leaked the next time around, given the current implementation of
strbuf.

However, this is potentially fragile. A change to the implementation
of strbuf in the future (for instance, if strbuf_init() allocates
memory immediately) could result in a leak here. Moreover, this
no-leak situation only holds true if no text at all has been added to
the strbuf after strbuf_init(). If someone changes the parser in the
future to operate a bit differently so that some text is added and
then removed from the strbuf, even though the end result still has
length is 0, then it will start leaking.

One way to make this more robust would be to have a separate strbuf
for collecting the language tag. When you encounter a non-empty tag,
only then grow the array and initialize the new strbuf in the array.
Finally, use strbuf_swap() to swap the collected language tag into the
new array position. Something like this:

    struct strbuf tag = STRBUF_INIT;
    do {
         for (; *s && (isalnum(*s) || *s == '_'); s++)
             strbuf_addch(&tag, *s == '_' ? '-' : *s);

        [...]

        if (tag.len) {
            num_langs++;
            REALLOC_ARRAY(language_tags, num_langs);
            strbuf_init(&language_tags[num_langs], 0);
            strbuf_swap(&tag, &language_tags[num_langs]);

            if (num_langs >= ...)
                break;
        }
    while (...);
    strbuf_release(&tag);

> +                       if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */
> +                               break;
> +               }
> +       } while (*s++);
> +
> +       /* write Accept-Language header into buf */
> +       if (num_langs >= 1) {
> +               int i;
> +               int last_buf_len;

Mental note: 'last_buf_len' is not initialized.

> +               int max_q;
> +               int decimal_places;
> +               char q_format[32];
> +
> +               /* add '*' */
> +               REALLOC_ARRAY(language_tags, num_langs + 1);
> +               strbuf_init(&language_tags[num_langs], 0);
> +               strbuf_addstr(&language_tags[num_langs++], "*");
> +
> +               /* compute decimal_places */
> +               for (max_q = 1, decimal_places = 0;
> +                               max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES;
> +                               decimal_places++, max_q *= 10)
> +                       ;
> +
> +               sprintf(q_format, ";q=0.%%0%dd", decimal_places);
> +
> +               strbuf_addstr(buf, "Accept-Language: ");
> +
> +               for(i = 0; i < num_langs; i++) {
> +                       if (language_tags[i].len == 0)
> +                               continue;

The parsing code does not allow empty tags ("") in language_tags[], so
this conditional is useless, isn't it?

> +
> +                       if (i > 0)
> +                               strbuf_addstr(buf, ", ");
> +
> +                       strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL));

This leaks the string detached from 'language_tag[i]' since
strbuf_addstr() does not take ownership of it.

> +                       if (i > 0)
> +                               strbuf_addf(buf, q_format, max_q - i);
> +
> +                       if (buf->len > MAX_ACCEPT_LANGUAGE_HEADER_SIZE) {
> +                               strbuf_remove(buf, last_buf_len, buf->len - last_buf_len);

'last_buf_len' is (potentially) used here uninitialized the first time
through loop.

> +                               break;
> +                       }
> +
> +                       last_buf_len = buf->len;
> +               }
> +       }
> +
> +       free(language_tags);

This _seems_ to be okay since strbuf_detach() was invoked for each
strbuf in language_tags[], however, those strings were in fact leaked
(as noted above), so it's not actually correct.

> +}
> +
> +/*
> + * Get an Accept-Language header which indicates user's preferred languages.
> + *
> + * This function always return non-NULL string as strbuf_detach() does.

Repeating from [1]: It's not good form to describe the published API
in terms of an implementation detail (strbuf_detach). Also, it would
be more idiomatic in C to return NULL rather than empty string.

[1]: http://article.gmane.org/gmane.comp.version-control.git/261810/

> + *
> + * Examples:
> + *   LANGUAGE= -> ""
> + *   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
> + *   LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1"
> + *   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
> + *   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
> + *   LANGUAGE= LANG=C -> ""
> + */
> +static const char *get_accept_language(void)
> +{
> +       if (!cached_accept_language) {
> +               struct strbuf buf = STRBUF_INIT;
> +               write_accept_language(&buf);
> +               cached_accept_language = strbuf_detach(&buf, NULL);
> +       }
> +
> +       return cached_accept_language;
> +}
> +

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v7 1/1] http: Add Accept-Language header if possible
  2015-01-18 12:26                       ` [PATCH v7 1/1] " Yi EungJun
  2015-01-18 15:14                         ` Torsten Bögershausen
  2015-01-19 20:21                         ` [PATCH v6 0/1] " Eric Sunshine
@ 2015-01-22  7:54                         ` Junio C Hamano
  2015-01-27 15:51                           ` [PATCH v8 0/1] " Yi EungJun
  2 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2015-01-22  7:54 UTC (permalink / raw)
  To: Yi EungJun
  Cc: Git List, Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting,
	Michael Blume

Yi EungJun <semtlenori@gmail.com> writes:

> +static void write_accept_language(struct strbuf *buf)
> +{
> +	/*
> +	 * MAX_DECIMAL_PLACES must not be larger than 3. If it is larger than
> +	 * that, q-value will be smaller than 0.001, the minimum q-value the
> +	 * HTTP specification allows. See
> +	 * http://tools.ietf.org/html/rfc7231#section-5.3.1 for q-value.
> +	 */
> +	const int MAX_DECIMAL_PLACES = 3;
> +	const int MAX_LANGUAGE_TAGS = 1000;
> +	const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000;
> +	struct strbuf *language_tags = NULL;
> +	int num_langs;

No initial value given to this variable, but...

> +	const char *s = get_preferred_languages();
> +
> +	/* Don't add Accept-Language header if no language is preferred. */
> +	if (!s)
> +		return;
> +
> +	/*
> +	 * Split the colon-separated string of preferred languages into
> +	 * language_tags array.
> +	 */
> +	do {
> +		/* increase language_tags array to add new language tag */
> +		REALLOC_ARRAY(language_tags, num_langs + 1);

... it is nevertheless used.  I think it was meant to start at 0?

> +	/* write Accept-Language header into buf */
> +	if (num_langs >= 1) {
> +		int i;
> +		int last_buf_len;

This is uninitialized...

> +		int max_q;
> +		int decimal_places;
> +		char q_format[32];
> +
> +...
> +			if (buf->len > MAX_ACCEPT_LANGUAGE_HEADER_SIZE) {
> +				strbuf_remove(buf, last_buf_len, buf->len - last_buf_len);

... and then it is used here.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH v8 0/1] http: Add Accept-Language header if possible
  2015-01-22  7:54                         ` [PATCH v7 1/1] " Junio C Hamano
@ 2015-01-27 15:51                           ` Yi EungJun
  2015-01-27 15:51                             ` [PATCH] " Yi EungJun
  0 siblings, 1 reply; 58+ messages in thread
From: Yi EungJun @ 2015-01-27 15:51 UTC (permalink / raw)
  To: Git List
  Cc: Yi EungJun, Junio C Hamano, Jeff King, Peter Krefting,
	Michael Blume, Torsten Bögershausen

From: Yi EungJun <eungjun.yi@navercorp.com>

Change since v7

From Torsten Bögershausen's review:

	* remove unnecessary if-statement

From Eric Sunshine's review:

	* fix memory leaks and uninitialized variables
	* remove unnecessary if-statement

From Junio C Hamano's review:

	* fix uninitialized variables

Yi EungJun (1):
  http: Add Accept-Language header if possible

 http.c                     | 151 +++++++++++++++++++++++++++++++++++++++++++++
 remote-curl.c              |   2 +
 t/t5550-http-fetch-dumb.sh |  42 +++++++++++++
 3 files changed, 195 insertions(+)

-- 
2.3.0.rc1.32.ga3df1c7

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH] http: Add Accept-Language header if possible
  2015-01-27 15:51                           ` [PATCH v8 0/1] " Yi EungJun
@ 2015-01-27 15:51                             ` Yi EungJun
  2015-01-27 23:34                               ` Junio C Hamano
  0 siblings, 1 reply; 58+ messages in thread
From: Yi EungJun @ 2015-01-27 15:51 UTC (permalink / raw)
  To: Git List
  Cc: Yi EungJun, Junio C Hamano, Jeff King, Peter Krefting,
	Michael Blume, Torsten Bögershausen

From: Yi EungJun <eungjun.yi@navercorp.com>

Add an Accept-Language header which indicates the user's preferred
languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.

Examples:
  LANGUAGE= -> ""
  LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1"
  LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1"
  LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1"

This gives git servers a chance to display remote error messages in
the user's preferred language.

Limit the number of languages to 1,000 because q-value must not be
smaller than 0.001, and limit the length of Accept-Language header to
4,000 bytes for some HTTP servers which cannot accept such long header.

Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
---
 http.c                     | 151 +++++++++++++++++++++++++++++++++++++++++++++
 remote-curl.c              |   2 +
 t/t5550-http-fetch-dumb.sh |  42 +++++++++++++
 3 files changed, 195 insertions(+)

diff --git a/http.c b/http.c
index 040f362..6111c6a 100644
--- a/http.c
+++ b/http.c
@@ -68,6 +68,8 @@ static struct curl_slist *no_pragma_header;
 
 static struct active_request_slot *active_queue_head;
 
+static char *cached_accept_language;
+
 size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
 {
 	size_t size = eltsize * nmemb;
@@ -515,6 +517,9 @@ void http_cleanup(void)
 		cert_auth.password = NULL;
 	}
 	ssl_cert_password_required = 0;
+
+	free(cached_accept_language);
+	cached_accept_language = NULL;
 }
 
 struct active_request_slot *get_active_slot(void)
@@ -986,6 +991,146 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
 		strbuf_addstr(charset, "ISO-8859-1");
 }
 
+/*
+ * Guess the user's preferred languages from the value in LANGUAGE environment
+ * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined.
+ *
+ * The result can be a colon-separated list like "ko:ja:en".
+ */
+static const char *get_preferred_languages(void)
+{
+	const char *retval;
+
+	retval = getenv("LANGUAGE");
+	if (retval && *retval)
+		return retval;
+
+#ifndef NO_GETTEXT
+	retval = setlocale(LC_MESSAGES, NULL);
+	if (retval && *retval &&
+		strcmp(retval, "C") &&
+		strcmp(retval, "POSIX"))
+		return retval;
+#endif
+
+	return NULL;
+}
+
+static void write_accept_language(struct strbuf *buf)
+{
+	/*
+	 * MAX_DECIMAL_PLACES must not be larger than 3. If it is larger than
+	 * that, q-value will be smaller than 0.001, the minimum q-value the
+	 * HTTP specification allows. See
+	 * http://tools.ietf.org/html/rfc7231#section-5.3.1 for q-value.
+	 */
+	const int MAX_DECIMAL_PLACES = 3;
+	const int MAX_LANGUAGE_TAGS = 1000;
+	const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000;
+	struct strbuf *language_tags = NULL;
+	int num_langs = 0;
+	const char *s = get_preferred_languages();
+	int i;
+	struct strbuf tag = STRBUF_INIT;
+
+	/* Don't add Accept-Language header if no language is preferred. */
+	if (!s)
+		return;
+
+	/*
+	 * Split the colon-separated string of preferred languages into
+	 * language_tags array.
+	 */
+	do {
+		/* collect language tag */
+		for (; *s && (isalnum(*s) || *s == '_'); s++)
+			strbuf_addch(&tag, *s == '_' ? '-' : *s);
+
+		/* skip .codeset, @modifier and any other unnecessary parts */
+		while (*s && *s != ':')
+			s++;
+
+		if (tag.len) {
+			num_langs++;
+			REALLOC_ARRAY(language_tags, num_langs);
+			strbuf_init(&language_tags[num_langs - 1], 0);
+			strbuf_swap(&tag, &language_tags[num_langs - 1]);
+
+			if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */
+				break;
+		}
+	} while (*s++);
+
+	/* write Accept-Language header into buf */
+	if (num_langs >= 1) {
+		int last_buf_len = 0;
+		int max_q;
+		int decimal_places;
+		char q_format[32];
+
+		/* add '*' */
+		REALLOC_ARRAY(language_tags, num_langs + 1);
+		strbuf_init(&language_tags[num_langs], 0);
+		strbuf_addstr(&language_tags[num_langs++], "*");
+
+		/* compute decimal_places */
+		for (max_q = 1, decimal_places = 0;
+				max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES;
+				decimal_places++, max_q *= 10)
+			;
+
+		sprintf(q_format, ";q=0.%%0%dd", decimal_places);
+
+		strbuf_addstr(buf, "Accept-Language: ");
+
+		for(i = 0; i < num_langs; i++) {
+			if (i > 0)
+				strbuf_addstr(buf, ", ");
+
+			strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL));
+
+			if (i > 0)
+				strbuf_addf(buf, q_format, max_q - i);
+
+			if (buf->len > MAX_ACCEPT_LANGUAGE_HEADER_SIZE) {
+				strbuf_remove(buf, last_buf_len, buf->len - last_buf_len);
+				break;
+			}
+
+			last_buf_len = buf->len;
+		}
+	}
+
+	/* free language tags */
+	for(i = 0; i < num_langs; i++) {
+		strbuf_release(&language_tags[i]);
+	}
+	free(language_tags);
+}
+
+/*
+ * Get an Accept-Language header which indicates user's preferred languages.
+ *
+ * Examples:
+ *   LANGUAGE= -> ""
+ *   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
+ *   LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1"
+ *   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
+ *   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
+ *   LANGUAGE= LANG=C -> ""
+ */
+static const char *get_accept_language(void)
+{
+	if (!cached_accept_language) {
+		struct strbuf buf = STRBUF_INIT;
+		write_accept_language(&buf);
+		if (buf.len > 0)
+			cached_accept_language = strbuf_detach(&buf, NULL);
+	}
+
+	return cached_accept_language;
+}
+
 /* http_request() targets */
 #define HTTP_REQUEST_STRBUF	0
 #define HTTP_REQUEST_FILE	1
@@ -998,6 +1143,7 @@ static int http_request(const char *url,
 	struct slot_results results;
 	struct curl_slist *headers = NULL;
 	struct strbuf buf = STRBUF_INIT;
+	const char *accept_language;
 	int ret;
 
 	slot = get_active_slot();
@@ -1023,6 +1169,11 @@ static int http_request(const char *url,
 					 fwrite_buffer);
 	}
 
+	accept_language = get_accept_language();
+
+	if (accept_language)
+		headers = curl_slist_append(headers, accept_language);
+
 	strbuf_addstr(&buf, "Pragma:");
 	if (options && options->no_cache)
 		strbuf_addstr(&buf, " no-cache");
diff --git a/remote-curl.c b/remote-curl.c
index dd63bc2..04989e5 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -962,6 +962,8 @@ int main(int argc, const char **argv)
 	struct strbuf buf = STRBUF_INIT;
 	int nongit;
 
+	git_setup_gettext();
+
 	git_extract_argv0_path(argv[0]);
 	setup_git_directory_gently(&nongit);
 	if (argc < 2) {
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index ac71418..e1e2938 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -196,5 +196,47 @@ test_expect_success 'reencoding is robust to whitespace oddities' '
 	grep "this is the error message" stderr
 '
 
+check_language () {
+	case "$2" in
+	'')
+		>expect
+		;;
+	?*)
+		echo "Accept-Language: $1" >expect
+		;;
+	esac &&
+	GIT_CURL_VERBOSE=1 \
+	LANGUAGE=$2 \
+	git ls-remote "$HTTPD_URL/dumb/repo.git" >output 2>&1 &&
+	tr -d '\015' <output |
+	sort -u |
+	sed -ne '/^Accept-Language:/ p' >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'git client sends Accept-Language based on LANGUAGE' '
+	check_language "ko-KR, *;q=0.9" ko_KR.UTF-8'
+
+test_expect_success 'git client sends Accept-Language correctly with unordinary LANGUAGE' '
+	check_language "ko-KR, *;q=0.9" "ko_KR:" &&
+	check_language "ko-KR, en-US;q=0.9, *;q=0.8" "ko_KR::en_US" &&
+	check_language "ko-KR, *;q=0.9" ":::ko_KR" &&
+	check_language "ko-KR, en-US;q=0.9, *;q=0.8" "ko_KR!!:en_US" &&
+	check_language "ko-KR, ja-JP;q=0.9, *;q=0.8" "ko_KR en_US:ja_JP"'
+
+test_expect_success 'git client sends Accept-Language with many preferred languages' '
+	check_language "ko-KR, en-US;q=0.9, fr-CA;q=0.8, de;q=0.7, sr;q=0.6, \
+ja;q=0.5, zh;q=0.4, sv;q=0.3, pt;q=0.2, *;q=0.1" \
+		ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt &&
+	check_language "ko-KR, en-US;q=0.99, fr-CA;q=0.98, de;q=0.97, sr;q=0.96, \
+ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, *;q=0.90" \
+		ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb
+'
+
+test_expect_success 'git client does not send an empty Accept-Language' '
+	GIT_CURL_VERBOSE=1 LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git" 2>stderr &&
+	! grep "^Accept-Language:" stderr
+'
+
 stop_httpd
 test_done
-- 
2.3.0.rc1.32.ga3df1c7

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH] http: Add Accept-Language header if possible
  2015-01-27 15:51                             ` [PATCH] " Yi EungJun
@ 2015-01-27 23:34                               ` Junio C Hamano
  2015-01-28  6:15                                 ` Junio C Hamano
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2015-01-27 23:34 UTC (permalink / raw)
  To: Yi EungJun
  Cc: Git List, Yi EungJun, Jeff King, Peter Krefting, Michael Blume,
	Torsten Bögershausen

Yi EungJun <semtlenori@gmail.com> writes:

> +static void write_accept_language(struct strbuf *buf)
> +{
> +	/*
> +	 * MAX_DECIMAL_PLACES must not be larger than 3. If it is larger than
> +	 * that, q-value will be smaller than 0.001, the minimum q-value the
> +	 * HTTP specification allows. See
> +	 * http://tools.ietf.org/html/rfc7231#section-5.3.1 for q-value.
> +	 */
> +	const int MAX_DECIMAL_PLACES = 3;
> +	const int MAX_LANGUAGE_TAGS = 1000;
> +	const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000;
> +	struct strbuf *language_tags = NULL;
> +	int num_langs = 0;
> +	const char *s = get_preferred_languages();
> +	int i;
> +	struct strbuf tag = STRBUF_INIT;
> +
> +	/* Don't add Accept-Language header if no language is preferred. */
> +	if (!s)
> +		return;
> +
> +	/*
> +	 * Split the colon-separated string of preferred languages into
> +	 * language_tags array.
> +	 */
> +	do {
> +		/* collect language tag */
> +		for (; *s && (isalnum(*s) || *s == '_'); s++)
> +			strbuf_addch(&tag, *s == '_' ? '-' : *s);
> +
> +		/* skip .codeset, @modifier and any other unnecessary parts */
> +		while (*s && *s != ':')
> +			s++;
> +
> +		if (tag.len) {
> +			num_langs++;
> +			REALLOC_ARRAY(language_tags, num_langs);
> +			strbuf_init(&language_tags[num_langs - 1], 0);
> +			strbuf_swap(&tag, &language_tags[num_langs - 1]);
> +
> +			if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */
> +				break;
> +		}
> +	} while (*s++);

The structure of this function is much easier to understand than any
of the previous rounds.  You collect up to the max you are going to
support, and then you format up to the max you are going to send.

Very straight-forward and simple.

> +	/* write Accept-Language header into buf */
> +	if (num_langs >= 1) {

micronit: should be OK to just say "if (num_langs)".

> +		int last_buf_len = 0;
> +		int max_q;
> +		int decimal_places;
> +		char q_format[32];
> +
> +		/* add '*' */
> +		REALLOC_ARRAY(language_tags, num_langs + 1);
> +		strbuf_init(&language_tags[num_langs], 0);
> +		strbuf_addstr(&language_tags[num_langs++], "*");
> +
> +		/* compute decimal_places */
> +		for (max_q = 1, decimal_places = 0;
> +				max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES;
> +				decimal_places++, max_q *= 10)
> +			;

micronit: the second and the third line are indented too deeply and
made me wonder if this has an overlong first line (i.e. the set-up
part to enter the for-loop) split into multiple lines.

> +
> +		sprintf(q_format, ";q=0.%%0%dd", decimal_places);
> +
> +		strbuf_addstr(buf, "Accept-Language: ");
> +
> +		for(i = 0; i < num_langs; i++) {
> +			if (i > 0)
> +				strbuf_addstr(buf, ", ");
> +
> +			strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL));

This is not wrong per-se, but it looks somewhat convoluted to me.

You can just peek language_tags[i].buf here without detaching, as
you will free the strbufs after you exit the loop anyway.  Your
version is not wrong and it does not result in double-freeing,
because detach clears the strbuf, but at the same time, makes the
responsibility to free language_tags[] strbuf split into here (for
elements up to the ones that are used to fill buf) and the cleanup
loop (for elements that are not used in this loop).

> +			if (i > 0)
> +				strbuf_addf(buf, q_format, max_q - i);
> +
> +			if (buf->len > MAX_ACCEPT_LANGUAGE_HEADER_SIZE) {
> +				strbuf_remove(buf, last_buf_len, buf->len - last_buf_len);
> +				break;
> +			}
> +
> +			last_buf_len = buf->len;
> +		}
> +	}
> +
> +	/* free language tags */
> +	for(i = 0; i < num_langs; i++) {
> +		strbuf_release(&language_tags[i]);
> +	}
> +	free(language_tags);
> +}

I am wondering if using strbuf for each of the language_tags[] is
even necessary.  How about doing it this way instead?

 http.c | 22 +++++++++-------------
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/http.c b/http.c
index 6111c6a..db591b3 100644
--- a/http.c
+++ b/http.c
@@ -1027,7 +1027,7 @@ static void write_accept_language(struct strbuf *buf)
 	const int MAX_DECIMAL_PLACES = 3;
 	const int MAX_LANGUAGE_TAGS = 1000;
 	const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000;
-	struct strbuf *language_tags = NULL;
+	char **language_tags = NULL;
 	int num_langs = 0;
 	const char *s = get_preferred_languages();
 	int i;
@@ -1053,9 +1053,7 @@ static void write_accept_language(struct strbuf *buf)
 		if (tag.len) {
 			num_langs++;
 			REALLOC_ARRAY(language_tags, num_langs);
-			strbuf_init(&language_tags[num_langs - 1], 0);
-			strbuf_swap(&tag, &language_tags[num_langs - 1]);
-
+			language_tags[num_langs - 1] = strbuf_detach(&tag, NULL);
 			if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */
 				break;
 		}
@@ -1070,13 +1068,12 @@ static void write_accept_language(struct strbuf *buf)
 
 		/* add '*' */
 		REALLOC_ARRAY(language_tags, num_langs + 1);
-		strbuf_init(&language_tags[num_langs], 0);
-		strbuf_addstr(&language_tags[num_langs++], "*");
+		language_tags[num_langs++] = "*"; /* it's OK; this won't be freed */
 
 		/* compute decimal_places */
 		for (max_q = 1, decimal_places = 0;
-				max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES;
-				decimal_places++, max_q *= 10)
+		     max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES;
+		     decimal_places++, max_q *= 10)
 			;
 
 		sprintf(q_format, ";q=0.%%0%dd", decimal_places);
@@ -1087,7 +1084,7 @@ static void write_accept_language(struct strbuf *buf)
 			if (i > 0)
 				strbuf_addstr(buf, ", ");
 
-			strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL));
+			strbuf_addstr(buf, language_tags[i]);
 
 			if (i > 0)
 				strbuf_addf(buf, q_format, max_q - i);
@@ -1101,10 +1098,9 @@ static void write_accept_language(struct strbuf *buf)
 		}
 	}
 
-	/* free language tags */
-	for(i = 0; i < num_langs; i++) {
-		strbuf_release(&language_tags[i]);
-	}
+	/* free language tags -- last one is a static '*' */
+	for(i = 0; i < num_langs - 1; i++)
+		free(language_tags[i]);
 	free(language_tags);
 }
 

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH] http: Add Accept-Language header if possible
  2015-01-27 23:34                               ` Junio C Hamano
@ 2015-01-28  6:15                                 ` Junio C Hamano
  2015-01-28 11:59                                   ` Yi, EungJun
  2015-01-28 12:04                                   ` [PATCH v9 0/1] " Yi EungJun
  0 siblings, 2 replies; 58+ messages in thread
From: Junio C Hamano @ 2015-01-28  6:15 UTC (permalink / raw)
  To: Yi EungJun
  Cc: Git List, Yi EungJun, Jeff King, Peter Krefting, Michael Blume,
	Torsten Bögershausen

On Tue, Jan 27, 2015 at 3:34 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Yi EungJun <semtlenori@gmail.com> writes:
>
>> +
>> +             sprintf(q_format, ";q=0.%%0%dd", decimal_places);
>> +
>> +             strbuf_addstr(buf, "Accept-Language: ");
>> +
>> +             for(i = 0; i < num_langs; i++) {
>> +                     if (i > 0)
>> +                             strbuf_addstr(buf, ", ");
>> +
>> +                     strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL));
>
> This is not wrong per-se, but it looks somewhat convoluted to me.
> ...

Actually, this is wrong, isn't it?

strbuf_detach() removes the language_tags[i].buf from the strbuf,
and the caller now owns that piece of memory. Then strbuf_addstr()
appends a copy of that string to buf, and the piece of memory
that was originally held by language_tags[i].buf is now lost forever.

This is leaking.

>> +     /* free language tags */
>> +     for(i = 0; i < num_langs; i++) {
>> +             strbuf_release(&language_tags[i]);
>> +     }

... because this loop does not free memory for earlier parts of language_tags[].

> I am wondering if using strbuf for each of the language_tags[] is
> even necessary.  How about doing it this way instead?

And I think my counter-proposal does not leak (as it does not us strbuf for
language_tags[] anymore).

>
>  http.c | 22 +++++++++-------------
>  1 file changed, 9 insertions(+), 13 deletions(-)
>
> diff --git a/http.c b/http.c
> index 6111c6a..db591b3 100644
> --- a/http.c
> +++ b/http.c
> @@ -1027,7 +1027,7 @@ static void write_accept_language(struct strbuf *buf)
>         const int MAX_DECIMAL_PLACES = 3;
>         const int MAX_LANGUAGE_TAGS = 1000;
>         const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000;
> -       struct strbuf *language_tags = NULL;
> +       char **language_tags = NULL;
>         int num_langs = 0;
>         const char *s = get_preferred_languages();
>         int i;
> @@ -1053,9 +1053,7 @@ static void write_accept_language(struct strbuf *buf)
>                 if (tag.len) {
>                         num_langs++;
>                         REALLOC_ARRAY(language_tags, num_langs);
> -                       strbuf_init(&language_tags[num_langs - 1], 0);
> -                       strbuf_swap(&tag, &language_tags[num_langs - 1]);
> -
> +                       language_tags[num_langs - 1] = strbuf_detach(&tag, NULL);
>                         if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */
>                                 break;
>                 }
> @@ -1070,13 +1068,12 @@ static void write_accept_language(struct strbuf *buf)
>
>                 /* add '*' */
>                 REALLOC_ARRAY(language_tags, num_langs + 1);
> -               strbuf_init(&language_tags[num_langs], 0);
> -               strbuf_addstr(&language_tags[num_langs++], "*");
> +               language_tags[num_langs++] = "*"; /* it's OK; this won't be freed */
>
>                 /* compute decimal_places */
>                 for (max_q = 1, decimal_places = 0;
> -                               max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES;
> -                               decimal_places++, max_q *= 10)
> +                    max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES;
> +                    decimal_places++, max_q *= 10)
>                         ;
>
>                 sprintf(q_format, ";q=0.%%0%dd", decimal_places);
> @@ -1087,7 +1084,7 @@ static void write_accept_language(struct strbuf *buf)
>                         if (i > 0)
>                                 strbuf_addstr(buf, ", ");
>
> -                       strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL));
> +                       strbuf_addstr(buf, language_tags[i]);
>
>                         if (i > 0)
>                                 strbuf_addf(buf, q_format, max_q - i);
> @@ -1101,10 +1098,9 @@ static void write_accept_language(struct strbuf *buf)
>                 }
>         }
>
> -       /* free language tags */
> -       for(i = 0; i < num_langs; i++) {
> -               strbuf_release(&language_tags[i]);
> -       }
> +       /* free language tags -- last one is a static '*' */
> +       for(i = 0; i < num_langs - 1; i++)
> +               free(language_tags[i]);
>         free(language_tags);
>  }
>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] http: Add Accept-Language header if possible
  2015-01-28  6:15                                 ` Junio C Hamano
@ 2015-01-28 11:59                                   ` Yi, EungJun
  2015-01-28 12:04                                   ` [PATCH v9 0/1] " Yi EungJun
  1 sibling, 0 replies; 58+ messages in thread
From: Yi, EungJun @ 2015-01-28 11:59 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git List, Yi EungJun, Jeff King, Peter Krefting, Michael Blume,
	Torsten Bögershausen

I agree that a list of char* is enough for language_tags.

Thanks for your review and patch. I'll apply your patch and send v9.

On Wed, Jan 28, 2015 at 3:15 PM, Junio C Hamano <gitster@pobox.com> wrote:
> On Tue, Jan 27, 2015 at 3:34 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Yi EungJun <semtlenori@gmail.com> writes:
>>
>>> +
>>> +             sprintf(q_format, ";q=0.%%0%dd", decimal_places);
>>> +
>>> +             strbuf_addstr(buf, "Accept-Language: ");
>>> +
>>> +             for(i = 0; i < num_langs; i++) {
>>> +                     if (i > 0)
>>> +                             strbuf_addstr(buf, ", ");
>>> +
>>> +                     strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL));
>>
>> This is not wrong per-se, but it looks somewhat convoluted to me.
>> ...
>
> Actually, this is wrong, isn't it?
>
> strbuf_detach() removes the language_tags[i].buf from the strbuf,
> and the caller now owns that piece of memory. Then strbuf_addstr()
> appends a copy of that string to buf, and the piece of memory
> that was originally held by language_tags[i].buf is now lost forever.
>
> This is leaking.
>
>>> +     /* free language tags */
>>> +     for(i = 0; i < num_langs; i++) {
>>> +             strbuf_release(&language_tags[i]);
>>> +     }
>
> ... because this loop does not free memory for earlier parts of language_tags[].
>
>> I am wondering if using strbuf for each of the language_tags[] is
>> even necessary.  How about doing it this way instead?
>
> And I think my counter-proposal does not leak (as it does not us strbuf for
> language_tags[] anymore).
>
>>
>>  http.c | 22 +++++++++-------------
>>  1 file changed, 9 insertions(+), 13 deletions(-)
>>
>> diff --git a/http.c b/http.c
>> index 6111c6a..db591b3 100644
>> --- a/http.c
>> +++ b/http.c
>> @@ -1027,7 +1027,7 @@ static void write_accept_language(struct strbuf *buf)
>>         const int MAX_DECIMAL_PLACES = 3;
>>         const int MAX_LANGUAGE_TAGS = 1000;
>>         const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000;
>> -       struct strbuf *language_tags = NULL;
>> +       char **language_tags = NULL;
>>         int num_langs = 0;
>>         const char *s = get_preferred_languages();
>>         int i;
>> @@ -1053,9 +1053,7 @@ static void write_accept_language(struct strbuf *buf)
>>                 if (tag.len) {
>>                         num_langs++;
>>                         REALLOC_ARRAY(language_tags, num_langs);
>> -                       strbuf_init(&language_tags[num_langs - 1], 0);
>> -                       strbuf_swap(&tag, &language_tags[num_langs - 1]);
>> -
>> +                       language_tags[num_langs - 1] = strbuf_detach(&tag, NULL);
>>                         if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */
>>                                 break;
>>                 }
>> @@ -1070,13 +1068,12 @@ static void write_accept_language(struct strbuf *buf)
>>
>>                 /* add '*' */
>>                 REALLOC_ARRAY(language_tags, num_langs + 1);
>> -               strbuf_init(&language_tags[num_langs], 0);
>> -               strbuf_addstr(&language_tags[num_langs++], "*");
>> +               language_tags[num_langs++] = "*"; /* it's OK; this won't be freed */
>>
>>                 /* compute decimal_places */
>>                 for (max_q = 1, decimal_places = 0;
>> -                               max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES;
>> -                               decimal_places++, max_q *= 10)
>> +                    max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES;
>> +                    decimal_places++, max_q *= 10)
>>                         ;
>>
>>                 sprintf(q_format, ";q=0.%%0%dd", decimal_places);
>> @@ -1087,7 +1084,7 @@ static void write_accept_language(struct strbuf *buf)
>>                         if (i > 0)
>>                                 strbuf_addstr(buf, ", ");
>>
>> -                       strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL));
>> +                       strbuf_addstr(buf, language_tags[i]);
>>
>>                         if (i > 0)
>>                                 strbuf_addf(buf, q_format, max_q - i);
>> @@ -1101,10 +1098,9 @@ static void write_accept_language(struct strbuf *buf)
>>                 }
>>         }
>>
>> -       /* free language tags */
>> -       for(i = 0; i < num_langs; i++) {
>> -               strbuf_release(&language_tags[i]);
>> -       }
>> +       /* free language tags -- last one is a static '*' */
>> +       for(i = 0; i < num_langs - 1; i++)
>> +               free(language_tags[i]);
>>         free(language_tags);
>>  }
>>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH v9 0/1] http: Add Accept-Language header if possible
  2015-01-28  6:15                                 ` Junio C Hamano
  2015-01-28 11:59                                   ` Yi, EungJun
@ 2015-01-28 12:04                                   ` Yi EungJun
  2015-01-28 12:04                                     ` [PATCH v9 1/1] " Yi EungJun
  2015-01-29  6:19                                     ` [PATCH v9 0/1] " Junio C Hamano
  1 sibling, 2 replies; 58+ messages in thread
From: Yi EungJun @ 2015-01-28 12:04 UTC (permalink / raw)
  To: Git List
  Cc: Yi EungJun, Junio C Hamano, Jeff King, Peter Krefting,
	Michael Blume, Torsten Bögershausen

From: Yi EungJun <eungjun.yi@navercorp.com>

Change since v8

Apply Junio's patch: Use an array of char* instead of strbuf for language_tags.

Yi EungJun (1):
  http: Add Accept-Language header if possible

 http.c                     | 147 +++++++++++++++++++++++++++++++++++++++++++++
 remote-curl.c              |   2 +
 t/t5550-http-fetch-dumb.sh |  42 +++++++++++++
 3 files changed, 191 insertions(+)

-- 
2.3.0.rc1.32.g7a36c04

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH v9 1/1] http: Add Accept-Language header if possible
  2015-01-28 12:04                                   ` [PATCH v9 0/1] " Yi EungJun
@ 2015-01-28 12:04                                     ` Yi EungJun
  2015-02-25 22:52                                       ` Junio C Hamano
  2015-01-29  6:19                                     ` [PATCH v9 0/1] " Junio C Hamano
  1 sibling, 1 reply; 58+ messages in thread
From: Yi EungJun @ 2015-01-28 12:04 UTC (permalink / raw)
  To: Git List
  Cc: Yi EungJun, Junio C Hamano, Jeff King, Peter Krefting,
	Michael Blume, Torsten Bögershausen

From: Yi EungJun <eungjun.yi@navercorp.com>

Add an Accept-Language header which indicates the user's preferred
languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.

Examples:
  LANGUAGE= -> ""
  LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1"
  LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1"
  LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1"

This gives git servers a chance to display remote error messages in
the user's preferred language.

Limit the number of languages to 1,000 because q-value must not be
smaller than 0.001, and limit the length of Accept-Language header to
4,000 bytes for some HTTP servers which cannot accept such long header.

Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
---
 http.c                     | 147 +++++++++++++++++++++++++++++++++++++++++++++
 remote-curl.c              |   2 +
 t/t5550-http-fetch-dumb.sh |  42 +++++++++++++
 3 files changed, 191 insertions(+)

diff --git a/http.c b/http.c
index 040f362..b2ad2a8 100644
--- a/http.c
+++ b/http.c
@@ -68,6 +68,8 @@ static struct curl_slist *no_pragma_header;
 
 static struct active_request_slot *active_queue_head;
 
+static char *cached_accept_language;
+
 size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
 {
 	size_t size = eltsize * nmemb;
@@ -515,6 +517,9 @@ void http_cleanup(void)
 		cert_auth.password = NULL;
 	}
 	ssl_cert_password_required = 0;
+
+	free(cached_accept_language);
+	cached_accept_language = NULL;
 }
 
 struct active_request_slot *get_active_slot(void)
@@ -986,6 +991,142 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
 		strbuf_addstr(charset, "ISO-8859-1");
 }
 
+/*
+ * Guess the user's preferred languages from the value in LANGUAGE environment
+ * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined.
+ *
+ * The result can be a colon-separated list like "ko:ja:en".
+ */
+static const char *get_preferred_languages(void)
+{
+	const char *retval;
+
+	retval = getenv("LANGUAGE");
+	if (retval && *retval)
+		return retval;
+
+#ifndef NO_GETTEXT
+	retval = setlocale(LC_MESSAGES, NULL);
+	if (retval && *retval &&
+		strcmp(retval, "C") &&
+		strcmp(retval, "POSIX"))
+		return retval;
+#endif
+
+	return NULL;
+}
+
+static void write_accept_language(struct strbuf *buf)
+{
+	/*
+	 * MAX_DECIMAL_PLACES must not be larger than 3. If it is larger than
+	 * that, q-value will be smaller than 0.001, the minimum q-value the
+	 * HTTP specification allows. See
+	 * http://tools.ietf.org/html/rfc7231#section-5.3.1 for q-value.
+	 */
+	const int MAX_DECIMAL_PLACES = 3;
+	const int MAX_LANGUAGE_TAGS = 1000;
+	const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000;
+	char **language_tags = NULL;
+	int num_langs = 0;
+	const char *s = get_preferred_languages();
+	int i;
+	struct strbuf tag = STRBUF_INIT;
+
+	/* Don't add Accept-Language header if no language is preferred. */
+	if (!s)
+		return;
+
+	/*
+	 * Split the colon-separated string of preferred languages into
+	 * language_tags array.
+	 */
+	do {
+		/* collect language tag */
+		for (; *s && (isalnum(*s) || *s == '_'); s++)
+			strbuf_addch(&tag, *s == '_' ? '-' : *s);
+
+		/* skip .codeset, @modifier and any other unnecessary parts */
+		while (*s && *s != ':')
+			s++;
+
+		if (tag.len) {
+			num_langs++;
+			REALLOC_ARRAY(language_tags, num_langs);
+			language_tags[num_langs - 1] = strbuf_detach(&tag, NULL);
+			if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */
+				break;
+		}
+	} while (*s++);
+
+	/* write Accept-Language header into buf */
+	if (num_langs) {
+		int last_buf_len = 0;
+		int max_q;
+		int decimal_places;
+		char q_format[32];
+
+		/* add '*' */
+		REALLOC_ARRAY(language_tags, num_langs + 1);
+		language_tags[num_langs++] = "*"; /* it's OK; this won't be freed */
+
+		/* compute decimal_places */
+		for (max_q = 1, decimal_places = 0;
+		     max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES;
+		     decimal_places++, max_q *= 10)
+			;
+
+		sprintf(q_format, ";q=0.%%0%dd", decimal_places);
+
+		strbuf_addstr(buf, "Accept-Language: ");
+
+		for(i = 0; i < num_langs; i++) {
+			if (i > 0)
+				strbuf_addstr(buf, ", ");
+
+			strbuf_addstr(buf, language_tags[i]);
+
+			if (i > 0)
+				strbuf_addf(buf, q_format, max_q - i);
+
+			if (buf->len > MAX_ACCEPT_LANGUAGE_HEADER_SIZE) {
+				strbuf_remove(buf, last_buf_len, buf->len - last_buf_len);
+				break;
+			}
+
+			last_buf_len = buf->len;
+		}
+	}
+
+	/* free language tags -- last one is a static '*' */
+	for(i = 0; i < num_langs - 1; i++)
+		free(language_tags[i]);
+	free(language_tags);
+}
+
+/*
+ * Get an Accept-Language header which indicates user's preferred languages.
+ *
+ * Examples:
+ *   LANGUAGE= -> ""
+ *   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
+ *   LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1"
+ *   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
+ *   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
+ *   LANGUAGE= LANG=C -> ""
+ */
+static const char *get_accept_language(void)
+{
+	if (!cached_accept_language) {
+		struct strbuf buf = STRBUF_INIT;
+		write_accept_language(&buf);
+		if (buf.len > 0)
+			cached_accept_language = strbuf_detach(&buf, NULL);
+	}
+
+	return cached_accept_language;
+}
+
 /* http_request() targets */
 #define HTTP_REQUEST_STRBUF	0
 #define HTTP_REQUEST_FILE	1
@@ -998,6 +1139,7 @@ static int http_request(const char *url,
 	struct slot_results results;
 	struct curl_slist *headers = NULL;
 	struct strbuf buf = STRBUF_INIT;
+	const char *accept_language;
 	int ret;
 
 	slot = get_active_slot();
@@ -1023,6 +1165,11 @@ static int http_request(const char *url,
 					 fwrite_buffer);
 	}
 
+	accept_language = get_accept_language();
+
+	if (accept_language)
+		headers = curl_slist_append(headers, accept_language);
+
 	strbuf_addstr(&buf, "Pragma:");
 	if (options && options->no_cache)
 		strbuf_addstr(&buf, " no-cache");
diff --git a/remote-curl.c b/remote-curl.c
index dd63bc2..04989e5 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -962,6 +962,8 @@ int main(int argc, const char **argv)
 	struct strbuf buf = STRBUF_INIT;
 	int nongit;
 
+	git_setup_gettext();
+
 	git_extract_argv0_path(argv[0]);
 	setup_git_directory_gently(&nongit);
 	if (argc < 2) {
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index ac71418..e1e2938 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -196,5 +196,47 @@ test_expect_success 'reencoding is robust to whitespace oddities' '
 	grep "this is the error message" stderr
 '
 
+check_language () {
+	case "$2" in
+	'')
+		>expect
+		;;
+	?*)
+		echo "Accept-Language: $1" >expect
+		;;
+	esac &&
+	GIT_CURL_VERBOSE=1 \
+	LANGUAGE=$2 \
+	git ls-remote "$HTTPD_URL/dumb/repo.git" >output 2>&1 &&
+	tr -d '\015' <output |
+	sort -u |
+	sed -ne '/^Accept-Language:/ p' >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'git client sends Accept-Language based on LANGUAGE' '
+	check_language "ko-KR, *;q=0.9" ko_KR.UTF-8'
+
+test_expect_success 'git client sends Accept-Language correctly with unordinary LANGUAGE' '
+	check_language "ko-KR, *;q=0.9" "ko_KR:" &&
+	check_language "ko-KR, en-US;q=0.9, *;q=0.8" "ko_KR::en_US" &&
+	check_language "ko-KR, *;q=0.9" ":::ko_KR" &&
+	check_language "ko-KR, en-US;q=0.9, *;q=0.8" "ko_KR!!:en_US" &&
+	check_language "ko-KR, ja-JP;q=0.9, *;q=0.8" "ko_KR en_US:ja_JP"'
+
+test_expect_success 'git client sends Accept-Language with many preferred languages' '
+	check_language "ko-KR, en-US;q=0.9, fr-CA;q=0.8, de;q=0.7, sr;q=0.6, \
+ja;q=0.5, zh;q=0.4, sv;q=0.3, pt;q=0.2, *;q=0.1" \
+		ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt &&
+	check_language "ko-KR, en-US;q=0.99, fr-CA;q=0.98, de;q=0.97, sr;q=0.96, \
+ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, *;q=0.90" \
+		ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb
+'
+
+test_expect_success 'git client does not send an empty Accept-Language' '
+	GIT_CURL_VERBOSE=1 LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git" 2>stderr &&
+	! grep "^Accept-Language:" stderr
+'
+
 stop_httpd
 test_done
-- 
2.3.0.rc1.32.g7a36c04

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH v9 0/1] http: Add Accept-Language header if possible
  2015-01-28 12:04                                   ` [PATCH v9 0/1] " Yi EungJun
  2015-01-28 12:04                                     ` [PATCH v9 1/1] " Yi EungJun
@ 2015-01-29  6:19                                     ` Junio C Hamano
  2015-01-30 17:23                                       ` Yi, EungJun
  1 sibling, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2015-01-29  6:19 UTC (permalink / raw)
  To: Yi EungJun
  Cc: Git List, Yi EungJun, Jeff King, Peter Krefting, Michael Blume,
	Torsten Bögershausen

Thanks; queued.  Let's run with this and try to make it graduate
early next cycle.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v9 0/1] http: Add Accept-Language header if possible
  2015-01-29  6:19                                     ` [PATCH v9 0/1] " Junio C Hamano
@ 2015-01-30 17:23                                       ` Yi, EungJun
  0 siblings, 0 replies; 58+ messages in thread
From: Yi, EungJun @ 2015-01-30 17:23 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git List, Yi EungJun, Jeff King, Peter Krefting, Michael Blume,
	Torsten Bögershausen

I'm very glad to hear that. Thanks to all reviewers!

On Thu, Jan 29, 2015 at 3:19 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Thanks; queued.  Let's run with this and try to make it graduate
> early next cycle.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible
  2015-01-28 12:04                                     ` [PATCH v9 1/1] " Yi EungJun
@ 2015-02-25 22:52                                       ` Junio C Hamano
  2015-02-26  3:04                                         ` Jeff King
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2015-02-25 22:52 UTC (permalink / raw)
  To: Yi EungJun
  Cc: Git List, Yi EungJun, Jeff King, Peter Krefting, Michael Blume,
	Torsten Bögershausen

Yi EungJun <semtlenori@gmail.com> writes:

> From: Yi EungJun <eungjun.yi@navercorp.com>
>
> Add an Accept-Language header which indicates the user's preferred
> languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.
> ...
> Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
> ---

Yikes.

This is now in 'master', but I wonder if people are getting
compilation errors because of this change.  I do.

It introduces a call to setlocale() without causing <locale.h> to be
included, and runs afoul of -Wimplicit-function-declaration.

Other call sites of setlocale() are in gettext.c, which does include
the header at the beginning.

> diff --git a/http.c b/http.c
> index 040f362..b2ad2a8 100644
> --- a/http.c
> +++ b/http.c
> ...
> +#ifndef NO_GETTEXT
> +	retval = setlocale(LC_MESSAGES, NULL);
> +	if (retval && *retval &&
> +		strcmp(retval, "C") &&
> +		strcmp(retval, "POSIX"))
> +		return retval;
> +#endif

I really do not like a conditional inclusion of system header files
inside any *.c file, but here is a minimum emergency fix-up I am
running with today.  It should go to somewhere in git-compat-util.h.

Somebody care to throw a tested fix-up patch at me?

Thanks.

 http.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/http.c b/http.c
index efdab09..7e79cbd 100644
--- a/http.c
+++ b/http.c
@@ -9,6 +9,10 @@
 #include "version.h"
 #include "pkt-line.h"
 
+#ifndef NO_GETTEXT
+#include <locale.h>
+#endif
+
 int active_requests;
 int http_is_verbose;
 size_t http_post_buffer = 16 * LARGE_PACKET_MAX;

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible
  2015-02-25 22:52                                       ` Junio C Hamano
@ 2015-02-26  3:04                                         ` Jeff King
  2015-02-26  3:10                                           ` Jeff King
  2015-02-26 20:59                                           ` Junio C Hamano
  0 siblings, 2 replies; 58+ messages in thread
From: Jeff King @ 2015-02-26  3:04 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume,
	Torsten Bögershausen

On Wed, Feb 25, 2015 at 02:52:26PM -0800, Junio C Hamano wrote:

> This is now in 'master', but I wonder if people are getting
> compilation errors because of this change.  I do.

I usually compile with NO_GETTEXT, but if I stop doing so, I see the
problem, too.

> I really do not like a conditional inclusion of system header files
> inside any *.c file, but here is a minimum emergency fix-up I am
> running with today.  It should go to somewhere in git-compat-util.h.

Perhaps it would be less risky to stick get_preferred_languages() into
gettext.c, like the patch below. Then we do not have to worry about
locale.h introducing other disruptive includes. The function is not
technically about gettext, but it seems reasonable to me to stuff all of
the i18n code together.

Another variant of this would for gettext.c to provide a git_setlocale
that just wraps setlocale (and does nothing when NO_GETTEXT is given).

diff --git a/gettext.c b/gettext.c
index 8b2da46..7378ba2 100644
--- a/gettext.c
+++ b/gettext.c
@@ -18,6 +18,31 @@
 #	endif
 #endif
 
+/*
+ * Guess the user's preferred languages from the value in LANGUAGE environment
+ * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined.
+ *
+ * The result can be a colon-separated list like "ko:ja:en".
+ */
+const char *get_preferred_languages(void)
+{
+	const char *retval;
+
+	retval = getenv("LANGUAGE");
+	if (retval && *retval)
+		return retval;
+
+#ifndef NO_GETTEXT
+	retval = setlocale(LC_MESSAGES, NULL);
+	if (retval && *retval &&
+		strcmp(retval, "C") &&
+		strcmp(retval, "POSIX"))
+		return retval;
+#endif
+
+	return NULL;
+}
+
 #ifdef GETTEXT_POISON
 int use_gettext_poison(void)
 {
diff --git a/gettext.h b/gettext.h
index dc1722d..5d8d2df 100644
--- a/gettext.h
+++ b/gettext.h
@@ -89,4 +89,6 @@ const char *Q_(const char *msgid, const char *plu, unsigned long n)
 #define N_(msgid) (msgid)
 #endif
 
+const char *get_preferred_languages();
+
 #endif
diff --git a/http.c b/http.c
index 0153fb0..9c825af 100644
--- a/http.c
+++ b/http.c
@@ -8,6 +8,7 @@
 #include "credential.h"
 #include "version.h"
 #include "pkt-line.h"
+#include "gettext.h"
 
 int active_requests;
 int http_is_verbose;
@@ -1002,32 +1003,6 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
 		strbuf_addstr(charset, "ISO-8859-1");
 }
 
-
-/*
- * Guess the user's preferred languages from the value in LANGUAGE environment
- * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined.
- *
- * The result can be a colon-separated list like "ko:ja:en".
- */
-static const char *get_preferred_languages(void)
-{
-	const char *retval;
-
-	retval = getenv("LANGUAGE");
-	if (retval && *retval)
-		return retval;
-
-#ifndef NO_GETTEXT
-	retval = setlocale(LC_MESSAGES, NULL);
-	if (retval && *retval &&
-		strcmp(retval, "C") &&
-		strcmp(retval, "POSIX"))
-		return retval;
-#endif
-
-	return NULL;
-}
-
 static void write_accept_language(struct strbuf *buf)
 {
 	/*

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible
  2015-02-26  3:04                                         ` Jeff King
@ 2015-02-26  3:10                                           ` Jeff King
  2015-02-26 20:59                                           ` Junio C Hamano
  1 sibling, 0 replies; 58+ messages in thread
From: Jeff King @ 2015-02-26  3:10 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume,
	Torsten Bögershausen

On Wed, Feb 25, 2015 at 10:04:16PM -0500, Jeff King wrote:

> Another variant of this would for gettext.c to provide a git_setlocale
> that just wraps setlocale (and does nothing when NO_GETTEXT is given).

This doesn't _quite_ work. In addition to the function, we have to have
LC_MESSAGES defined. So we cannot provide a straight git_setlocale, but
it has to be something like "git_message_locale". At which point we
might as well just provide get_preferred_languages() as the interface
between gettext.c and the rest of the program.

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible
  2015-02-26  3:04                                         ` Jeff King
  2015-02-26  3:10                                           ` Jeff King
@ 2015-02-26 20:59                                           ` Junio C Hamano
  2015-02-26 21:33                                             ` Jeff King
  1 sibling, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2015-02-26 20:59 UTC (permalink / raw)
  To: Jeff King
  Cc: Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume,
	Torsten Bögershausen

Jeff King <peff@peff.net> writes:

> Perhaps it would be less risky to stick get_preferred_languages() into
> gettext.c, like the patch below. Then we do not have to worry about
> locale.h introducing other disruptive includes. The function is not
> technically about gettext, but it seems reasonable to me to stuff all of
> the i18n code together.

Yeah, I like that a lot better.  Thanks.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible
  2015-02-26 20:59                                           ` Junio C Hamano
@ 2015-02-26 21:33                                             ` Jeff King
  2015-02-26 21:42                                               ` Junio C Hamano
  0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2015-02-26 21:33 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume,
	Torsten Bögershausen

On Thu, Feb 26, 2015 at 12:59:56PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Perhaps it would be less risky to stick get_preferred_languages() into
> > gettext.c, like the patch below. Then we do not have to worry about
> > locale.h introducing other disruptive includes. The function is not
> > technically about gettext, but it seems reasonable to me to stuff all of
> > the i18n code together.
> 
> Yeah, I like that a lot better.  Thanks.

Are you just using it for inspiration, or did you want me to wrap it up
with a commit message?

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible
  2015-02-26 21:33                                             ` Jeff King
@ 2015-02-26 21:42                                               ` Junio C Hamano
  2015-02-26 21:47                                                 ` Stefan Beller
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2015-02-26 21:42 UTC (permalink / raw)
  To: Jeff King
  Cc: Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume,
	Torsten Bögershausen

Jeff King <peff@peff.net> writes:

> On Thu, Feb 26, 2015 at 12:59:56PM -0800, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>> 
>> > Perhaps it would be less risky to stick get_preferred_languages() into
>> > gettext.c, like the patch below. Then we do not have to worry about
>> > locale.h introducing other disruptive includes. The function is not
>> > technically about gettext, but it seems reasonable to me to stuff all of
>> > the i18n code together.
>> 
>> Yeah, I like that a lot better.  Thanks.
>
> Are you just using it for inspiration, or did you want me to wrap it up
> with a commit message?

Here is what I queued.  Thanks.

-- >8 --
From: Jeff King <peff@peff.net>
Date: Wed, 25 Feb 2015 22:04:16 -0500
Subject: [PATCH] gettext.c: move get_preferred_languages() from http.c

Calling setlocale(LC_MESSAGES, ...) directly from http.c, without
including <locale.h>, was causing compilation warnings.  Move the
helper function to gettext.c that already includes the header and
where locale-related issues are handled.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 gettext.c | 25 +++++++++++++++++++++++++
 gettext.h |  2 ++
 http.c    |  1 +
 3 files changed, 28 insertions(+)

diff --git a/gettext.c b/gettext.c
index 8b2da46..7378ba2 100644
--- a/gettext.c
+++ b/gettext.c
@@ -18,6 +18,31 @@
 #	endif
 #endif
 
+/*
+ * Guess the user's preferred languages from the value in LANGUAGE environment
+ * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined.
+ *
+ * The result can be a colon-separated list like "ko:ja:en".
+ */
+const char *get_preferred_languages(void)
+{
+	const char *retval;
+
+	retval = getenv("LANGUAGE");
+	if (retval && *retval)
+		return retval;
+
+#ifndef NO_GETTEXT
+	retval = setlocale(LC_MESSAGES, NULL);
+	if (retval && *retval &&
+		strcmp(retval, "C") &&
+		strcmp(retval, "POSIX"))
+		return retval;
+#endif
+
+	return NULL;
+}
+
 #ifdef GETTEXT_POISON
 int use_gettext_poison(void)
 {
diff --git a/gettext.h b/gettext.h
index 7671d09..e539482 100644
--- a/gettext.h
+++ b/gettext.h
@@ -65,4 +65,6 @@ const char *Q_(const char *msgid, const char *plu, unsigned long n)
 /* Mark msgid for translation but do not translate it. */
 #define N_(msgid) msgid
 
+const char *get_preferred_languages(void);
+
 #endif
diff --git a/http.c b/http.c
index 8b659b6..71ed418 100644
--- a/http.c
+++ b/http.c
@@ -8,6 +8,7 @@
 #include "credential.h"
 #include "version.h"
 #include "pkt-line.h"
+#include "gettext.h"
 
 int active_requests;
 int http_is_verbose;
-- 
2.3.1-280-g2531f2d

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible
  2015-02-26 21:42                                               ` Junio C Hamano
@ 2015-02-26 21:47                                                 ` Stefan Beller
  2015-02-26 22:06                                                   ` Jeff King
  0 siblings, 1 reply; 58+ messages in thread
From: Stefan Beller @ 2015-02-26 21:47 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Yi EungJun, Git List, Yi EungJun, Peter Krefting,
	Michael Blume, Torsten Bögershausen

On Thu, Feb 26, 2015 at 1:42 PM, Junio C Hamano <gitster@pobox.com> wrote:>
> Here is what I queued.  Thanks.

I did not follow the thread if there are any intermediate patches,
though it applied cleanly.

Applying this on top of f18604bbf2c391c689a41fca14cbaeff5e106255
(http: add Accept-Language header if possible) still doesn't compile for me.

http.c:1001:20: error: static declaration of 'get_preferred_languages'
follows non-static declaration
 static const char *get_preferred_languages(void)
                    ^
In file included from cache.h:8:0,
                 from http.h:4,
                 from http.c:2:
gettext.h:68:13: note: previous declaration of
'get_preferred_languages' was here
 const char *get_preferred_languages(void);
             ^
http.c: In function 'get_preferred_languages':
http.c:1010:2: warning: implicit declaration of function 'setlocale'
[-Wimplicit-function-declaration]
  retval = setlocale(LC_MESSAGES, NULL);
  ^
http.c:1010:21: error: 'LC_MESSAGES' undeclared (first use in this function)
  retval = setlocale(LC_MESSAGES, NULL);
                     ^
http.c:1010:21: note: each undeclared identifier is reported only once
for each function it appears in

Rebasing this on top of current master (Post 2.3 cyle (batch #5)) also fails:

http.c:1013:20: error: static declaration of 'get_preferred_languages'
follows non-static declaration
 static const char *get_preferred_languages(void)
                    ^
In file included from cache.h:8:0,
                 from http.h:4,
                 from http.c:2:
gettext.h:92:13: note: previous declaration of
'get_preferred_languages' was here
 const char *get_preferred_languages(void);
             ^
http.c: In function 'get_preferred_languages':
http.c:1022:2: warning: implicit declaration of function 'setlocale'
[-Wimplicit-function-declaration]
  retval = setlocale(LC_MESSAGES, NULL);
  ^
http.c:1022:21: error: 'LC_MESSAGES' undeclared (first use in this function)
  retval = setlocale(LC_MESSAGES, NULL);
                     ^
http.c:1022:21: note: each undeclared identifier is reported only once
for each function it appears in




>
> -- >8 --
> From: Jeff King <peff@peff.net>
> Date: Wed, 25 Feb 2015 22:04:16 -0500
> Subject: [PATCH] gettext.c: move get_preferred_languages() from http.c
>
> Calling setlocale(LC_MESSAGES, ...) directly from http.c, without
> including <locale.h>, was causing compilation warnings.  Move the
> helper function to gettext.c that already includes the header and
> where locale-related issues are handled.
>
> Signed-off-by: Jeff King <peff@peff.net>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  gettext.c | 25 +++++++++++++++++++++++++
>  gettext.h |  2 ++
>  http.c    |  1 +
>  3 files changed, 28 insertions(+)
>
> diff --git a/gettext.c b/gettext.c
> index 8b2da46..7378ba2 100644
> --- a/gettext.c
> +++ b/gettext.c
> @@ -18,6 +18,31 @@
>  #      endif
>  #endif
>
> +/*
> + * Guess the user's preferred languages from the value in LANGUAGE environment
> + * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined.
> + *
> + * The result can be a colon-separated list like "ko:ja:en".
> + */
> +const char *get_preferred_languages(void)
> +{
> +       const char *retval;
> +
> +       retval = getenv("LANGUAGE");
> +       if (retval && *retval)
> +               return retval;
> +
> +#ifndef NO_GETTEXT
> +       retval = setlocale(LC_MESSAGES, NULL);
> +       if (retval && *retval &&
> +               strcmp(retval, "C") &&
> +               strcmp(retval, "POSIX"))
> +               return retval;
> +#endif
> +
> +       return NULL;
> +}
> +
>  #ifdef GETTEXT_POISON
>  int use_gettext_poison(void)
>  {
> diff --git a/gettext.h b/gettext.h
> index 7671d09..e539482 100644
> --- a/gettext.h
> +++ b/gettext.h
> @@ -65,4 +65,6 @@ const char *Q_(const char *msgid, const char *plu, unsigned long n)
>  /* Mark msgid for translation but do not translate it. */
>  #define N_(msgid) msgid
>
> +const char *get_preferred_languages(void);
> +
>  #endif
> diff --git a/http.c b/http.c
> index 8b659b6..71ed418 100644
> --- a/http.c
> +++ b/http.c
> @@ -8,6 +8,7 @@
>  #include "credential.h"
>  #include "version.h"
>  #include "pkt-line.h"
> +#include "gettext.h"
>
>  int active_requests;
>  int http_is_verbose;
> --
> 2.3.1-280-g2531f2d
>
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible
  2015-02-26 21:47                                                 ` Stefan Beller
@ 2015-02-26 22:06                                                   ` Jeff King
  2015-02-26 22:07                                                     ` Jeff King
  2015-02-26 22:13                                                     ` Junio C Hamano
  0 siblings, 2 replies; 58+ messages in thread
From: Jeff King @ 2015-02-26 22:06 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, Yi EungJun, Git List, Yi EungJun, Peter Krefting,
	Michael Blume, Torsten Bögershausen

On Thu, Feb 26, 2015 at 01:47:34PM -0800, Stefan Beller wrote:

> On Thu, Feb 26, 2015 at 1:42 PM, Junio C Hamano <gitster@pobox.com> wrote:>
> > Here is what I queued.  Thanks.
> 
> I did not follow the thread if there are any intermediate patches,
> though it applied cleanly.

What Junio posted is missing the hunk to drop the old static definition
of get_preferred_languages from http.c.

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible
  2015-02-26 22:06                                                   ` Jeff King
@ 2015-02-26 22:07                                                     ` Jeff King
  2015-02-26 22:26                                                       ` Stefan Beller
  2015-02-26 22:13                                                     ` Junio C Hamano
  1 sibling, 1 reply; 58+ messages in thread
From: Jeff King @ 2015-02-26 22:07 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, Yi EungJun, Git List, Yi EungJun, Peter Krefting,
	Michael Blume, Torsten Bögershausen

On Thu, Feb 26, 2015 at 05:06:10PM -0500, Jeff King wrote:

> On Thu, Feb 26, 2015 at 01:47:34PM -0800, Stefan Beller wrote:
> 
> > On Thu, Feb 26, 2015 at 1:42 PM, Junio C Hamano <gitster@pobox.com> wrote:>
> > > Here is what I queued.  Thanks.
> > 
> > I did not follow the thread if there are any intermediate patches,
> > though it applied cleanly.
> 
> What Junio posted is missing the hunk to drop the old static definition
> of get_preferred_languages from http.c.

Here it is, with the commit message and the missing hunk. This works for
me both with and without NO_GETTEXT defined.

-- >8 --
Subject: [PATCH] gettext.c: move get_preferred_languages() from http.c

Calling setlocale(LC_MESSAGES, ...) directly from http.c,
without including <locale.h>, was causing compilation
warnings.  Move the helper function to gettext.c that
already includes the header and where locale-related issues
are handled.

Signed-off-by: Jeff King <peff@peff.net>
---
 gettext.c | 25 +++++++++++++++++++++++++
 gettext.h |  2 ++
 http.c    | 27 +--------------------------
 3 files changed, 28 insertions(+), 26 deletions(-)

diff --git a/gettext.c b/gettext.c
index 8b2da46..7378ba2 100644
--- a/gettext.c
+++ b/gettext.c
@@ -18,6 +18,31 @@
 #	endif
 #endif
 
+/*
+ * Guess the user's preferred languages from the value in LANGUAGE environment
+ * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined.
+ *
+ * The result can be a colon-separated list like "ko:ja:en".
+ */
+const char *get_preferred_languages(void)
+{
+	const char *retval;
+
+	retval = getenv("LANGUAGE");
+	if (retval && *retval)
+		return retval;
+
+#ifndef NO_GETTEXT
+	retval = setlocale(LC_MESSAGES, NULL);
+	if (retval && *retval &&
+		strcmp(retval, "C") &&
+		strcmp(retval, "POSIX"))
+		return retval;
+#endif
+
+	return NULL;
+}
+
 #ifdef GETTEXT_POISON
 int use_gettext_poison(void)
 {
diff --git a/gettext.h b/gettext.h
index dc1722d..5d8d2df 100644
--- a/gettext.h
+++ b/gettext.h
@@ -89,4 +89,6 @@ const char *Q_(const char *msgid, const char *plu, unsigned long n)
 #define N_(msgid) (msgid)
 #endif
 
+const char *get_preferred_languages();
+
 #endif
diff --git a/http.c b/http.c
index 0153fb0..9c825af 100644
--- a/http.c
+++ b/http.c
@@ -8,6 +8,7 @@
 #include "credential.h"
 #include "version.h"
 #include "pkt-line.h"
+#include "gettext.h"
 
 int active_requests;
 int http_is_verbose;
@@ -1002,32 +1003,6 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
 		strbuf_addstr(charset, "ISO-8859-1");
 }
 
-
-/*
- * Guess the user's preferred languages from the value in LANGUAGE environment
- * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined.
- *
- * The result can be a colon-separated list like "ko:ja:en".
- */
-static const char *get_preferred_languages(void)
-{
-	const char *retval;
-
-	retval = getenv("LANGUAGE");
-	if (retval && *retval)
-		return retval;
-
-#ifndef NO_GETTEXT
-	retval = setlocale(LC_MESSAGES, NULL);
-	if (retval && *retval &&
-		strcmp(retval, "C") &&
-		strcmp(retval, "POSIX"))
-		return retval;
-#endif
-
-	return NULL;
-}
-
 static void write_accept_language(struct strbuf *buf)
 {
 	/*
-- 
2.3.0.449.g1690e78

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible
  2015-02-26 22:06                                                   ` Jeff King
  2015-02-26 22:07                                                     ` Jeff King
@ 2015-02-26 22:13                                                     ` Junio C Hamano
  1 sibling, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2015-02-26 22:13 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, Yi EungJun, Git List, Yi EungJun, Peter Krefting,
	Michael Blume, Torsten Bögershausen

Jeff King <peff@peff.net> writes:

> On Thu, Feb 26, 2015 at 01:47:34PM -0800, Stefan Beller wrote:
>
>> On Thu, Feb 26, 2015 at 1:42 PM, Junio C Hamano <gitster@pobox.com> wrote:>
>> > Here is what I queued.  Thanks.
>> 
>> I did not follow the thread if there are any intermediate patches,
>> though it applied cleanly.
>
> What Junio posted is missing the hunk to drop the old static definition
> of get_preferred_languages from http.c.

I am still scratching my head to see how this happened, but I think
when I did

    $ git checkout ye/http-accept-language
    $ git apply -3 $gmane/264422

I took the wrong side of the confict in http.c

Thanks both for noticing.  Now it is fixed up.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible
  2015-02-26 22:07                                                     ` Jeff King
@ 2015-02-26 22:26                                                       ` Stefan Beller
  2015-02-26 22:36                                                         ` Jeff King
  0 siblings, 1 reply; 58+ messages in thread
From: Stefan Beller @ 2015-02-26 22:26 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Yi EungJun, Git List, Yi EungJun, Peter Krefting,
	Michael Blume, Torsten Bögershausen

On Thu, Feb 26, 2015 at 2:07 PM, Jeff King <peff@peff.net> wrote:
>
> Here it is, with the commit message and the missing hunk. This works for
> me both with and without NO_GETTEXT defined.

This compiles here though a warning is spit:
In file included from cache.h:8:0,
                 from userdiff.c:1:
gettext.h:92:1: warning: function declaration isn't a prototype
[-Wstrict-prototypes]
 const char *get_preferred_languages();
 ^
so I guess I can still add a
Tested-by: Stefan Beller <sbeller@google.com>

>
> -- >8 --
> Subject: [PATCH] gettext.c: move get_preferred_languages() from http.c
>
> Calling setlocale(LC_MESSAGES, ...) directly from http.c,
> without including <locale.h>, was causing compilation
> warnings.  Move the helper function to gettext.c that
> already includes the header and where locale-related issues
> are handled.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  gettext.c | 25 +++++++++++++++++++++++++
>  gettext.h |  2 ++
>  http.c    | 27 +--------------------------
>  3 files changed, 28 insertions(+), 26 deletions(-)
>
> diff --git a/gettext.c b/gettext.c
> index 8b2da46..7378ba2 100644
> --- a/gettext.c
> +++ b/gettext.c
> @@ -18,6 +18,31 @@
>  #      endif
>  #endif
>
> +/*
> + * Guess the user's preferred languages from the value in LANGUAGE environment
> + * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined.
> + *
> + * The result can be a colon-separated list like "ko:ja:en".
> + */
> +const char *get_preferred_languages(void)
> +{
> +       const char *retval;
> +
> +       retval = getenv("LANGUAGE");
> +       if (retval && *retval)
> +               return retval;
> +
> +#ifndef NO_GETTEXT
> +       retval = setlocale(LC_MESSAGES, NULL);
> +       if (retval && *retval &&
> +               strcmp(retval, "C") &&
> +               strcmp(retval, "POSIX"))
> +               return retval;
> +#endif
> +
> +       return NULL;
> +}
> +
>  #ifdef GETTEXT_POISON
>  int use_gettext_poison(void)
>  {
> diff --git a/gettext.h b/gettext.h
> index dc1722d..5d8d2df 100644
> --- a/gettext.h
> +++ b/gettext.h
> @@ -89,4 +89,6 @@ const char *Q_(const char *msgid, const char *plu, unsigned long n)
>  #define N_(msgid) (msgid)
>  #endif
>
> +const char *get_preferred_languages();
> +
>  #endif
> diff --git a/http.c b/http.c
> index 0153fb0..9c825af 100644
> --- a/http.c
> +++ b/http.c
> @@ -8,6 +8,7 @@
>  #include "credential.h"
>  #include "version.h"
>  #include "pkt-line.h"
> +#include "gettext.h"
>
>  int active_requests;
>  int http_is_verbose;
> @@ -1002,32 +1003,6 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
>                 strbuf_addstr(charset, "ISO-8859-1");
>  }
>
> -
> -/*
> - * Guess the user's preferred languages from the value in LANGUAGE environment
> - * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined.
> - *
> - * The result can be a colon-separated list like "ko:ja:en".
> - */
> -static const char *get_preferred_languages(void)
> -{
> -       const char *retval;
> -
> -       retval = getenv("LANGUAGE");
> -       if (retval && *retval)
> -               return retval;
> -
> -#ifndef NO_GETTEXT
> -       retval = setlocale(LC_MESSAGES, NULL);
> -       if (retval && *retval &&
> -               strcmp(retval, "C") &&
> -               strcmp(retval, "POSIX"))
> -               return retval;
> -#endif
> -
> -       return NULL;
> -}
> -
>  static void write_accept_language(struct strbuf *buf)
>  {
>         /*
> --
> 2.3.0.449.g1690e78
>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible
  2015-02-26 22:26                                                       ` Stefan Beller
@ 2015-02-26 22:36                                                         ` Jeff King
  2015-02-26 22:45                                                           ` Jeff King
  0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2015-02-26 22:36 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, Yi EungJun, Git List, Yi EungJun, Peter Krefting,
	Michael Blume, Torsten Bögershausen

On Thu, Feb 26, 2015 at 02:26:05PM -0800, Stefan Beller wrote:

> On Thu, Feb 26, 2015 at 2:07 PM, Jeff King <peff@peff.net> wrote:
> >
> > Here it is, with the commit message and the missing hunk. This works for
> > me both with and without NO_GETTEXT defined.
> 
> This compiles here though a warning is spit:
> In file included from cache.h:8:0,
>                  from userdiff.c:1:
> gettext.h:92:1: warning: function declaration isn't a prototype
> [-Wstrict-prototypes]
>  const char *get_preferred_languages();
>  ^

Hmph. The compiler is right that it should be:

 const char *get_preferred_languages(void);

but my gcc (4.9.2, with -Wstrict_prototypes) does not seem to notice it!
Weird.

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible
  2015-02-26 22:36                                                         ` Jeff King
@ 2015-02-26 22:45                                                           ` Jeff King
  2015-02-26 23:29                                                             ` Junio C Hamano
  0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2015-02-26 22:45 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, Yi EungJun, Git List, Yi EungJun, Peter Krefting,
	Michael Blume, Torsten Bögershausen

On Thu, Feb 26, 2015 at 05:36:03PM -0500, Jeff King wrote:

> > [-Wstrict-prototypes]
> >  const char *get_preferred_languages();
> >  ^
> 
> Hmph. The compiler is right that it should be:
> 
>  const char *get_preferred_languages(void);
> 
> but my gcc (4.9.2, with -Wstrict_prototypes) does not seem to notice it!
> Weird.

Ugh. I have a snippet in my config.mak that relaxes the warnings on older
versions of git, and it was accidentally triggering due to a typo. :(

So that explains that. Junio, do you mind squashing in:

diff --git a/gettext.h b/gettext.h
index 5d8d2df..33696a4 100644
--- a/gettext.h
+++ b/gettext.h
@@ -89,6 +89,6 @@ const char *Q_(const char *msgid, const char *plu, unsigned long n)
 #define N_(msgid) (msgid)
 #endif
 
-const char *get_preferred_languages();
+const char *get_preferred_languages(void);
 
 #endif

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible
  2015-02-26 22:45                                                           ` Jeff King
@ 2015-02-26 23:29                                                             ` Junio C Hamano
  0 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2015-02-26 23:29 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, Yi EungJun, Git List, Yi EungJun, Peter Krefting,
	Michael Blume, Torsten Bögershausen

Jeff King <peff@peff.net> writes:

> On Thu, Feb 26, 2015 at 05:36:03PM -0500, Jeff King wrote:
>
>> > [-Wstrict-prototypes]
>> >  const char *get_preferred_languages();
>> >  ^
>> 
>> Hmph. The compiler is right that it should be:
>> 
>>  const char *get_preferred_languages(void);
>> 
>> but my gcc (4.9.2, with -Wstrict_prototypes) does not seem to notice it!
>> Weird.
>
> Ugh. I have a snippet in my config.mak that relaxes the warnings on older
> versions of git, and it was accidentally triggering due to a typo. :(
>
> So that explains that. Junio, do you mind squashing in:

Yup, I already did when I got the first one.

Thanks.

>
> diff --git a/gettext.h b/gettext.h
> index 5d8d2df..33696a4 100644
> --- a/gettext.h
> +++ b/gettext.h
> @@ -89,6 +89,6 @@ const char *Q_(const char *msgid, const char *plu, unsigned long n)
>  #define N_(msgid) (msgid)
>  #endif
>  
> -const char *get_preferred_languages();
> +const char *get_preferred_languages(void);
>  
>  #endif

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH] http: Include locale.h when using setlocale()
  2014-07-19 17:58 [PATCH v4 0/1] http: Add Accept-Language header if possible Yi EungJun
  2014-07-19 17:58 ` [PATCH v4 1/1] " Yi EungJun
  2014-12-02 12:12 ` [PATCH v5 0/1] " Yi EungJun
@ 2015-03-06 16:13 ` Ævar Arnfjörð Bjarmason
  2015-03-06 19:01   ` Junio C Hamano
  2 siblings, 1 reply; 58+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2015-03-06 16:13 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Yi EungJun, Ævar Arnfjörð Bjarmason

Since v2.3.0-rc1-37-gf18604b we've been using setlocale() here without
importing locale.h. Oddly enough this only causes issues for me under
-O0 on GCC & Clang. I.e. if I do:

    $ git clean -dxf; make -j 1 V=1 CFLAGS="-g -O0 -Wall" http.o

I'll get this on clang 3.5.0-6 & GCC 4.9.1-19 on Debian:

    http.c: In function ‘get_preferred_languages’:
    http.c:1021:2: warning: implicit declaration of function ‘setlocale’ [-Wimplicit-function-declaration]
      retval = setlocale(LC_MESSAGES, NULL);
      ^
    http.c:1021:21: error: ‘LC_MESSAGES’ undeclared (first use in this function)
      retval = setlocale(LC_MESSAGES, NULL);

But changing -O0 to -O1 or another optimization level makes the issue go
away. Odd, but in any case we should be including this header if we're
going to use the function, so just do that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 http.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/http.c b/http.c
index 0153fb0..0606e6c 100644
--- a/http.c
+++ b/http.c
@@ -8,6 +8,9 @@
 #include "credential.h"
 #include "version.h"
 #include "pkt-line.h"
+#ifndef NO_GETTEXT
+#	include <locale.h>
+#endif
 
 int active_requests;
 int http_is_verbose;
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH] http: Include locale.h when using setlocale()
  2015-03-06 16:13 ` [PATCH] http: Include locale.h when using setlocale() Ævar Arnfjörð Bjarmason
@ 2015-03-06 19:01   ` Junio C Hamano
  0 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2015-03-06 19:01 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Yi EungJun

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Since v2.3.0-rc1-37-gf18604b we've been using setlocale() here without
> importing locale.h. Oddly enough this only causes issues for me under
> -O0 on GCC & Clang.

Sorry for not making this entry in "What's cooking" report very
prominent:

    * ye/http-accept-language (2015-02-26) 1 commit
      (merged to 'next' on 2015-03-03 at 58d195e)
     + gettext.c: move get_preferred_languages() from http.c

     Compilation fix for a recent topic in 'master'.

     Will merge to 'master'.

This has cooked on 'next' for a few days, and is eligible to
graduate to 'master'.  Will be in the next update.

Thanks.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] http: Add Accept-Language header if possible
  2014-07-10 20:10         ` Jeff King
  2014-07-11  9:43           ` Yi, EungJun
@ 2014-07-12 10:11           ` Peter Krefting
  1 sibling, 0 replies; 58+ messages in thread
From: Peter Krefting @ 2014-07-12 10:11 UTC (permalink / raw)
  To: Jeff King; +Cc: Yi, EungJun, git

Jeff King:

> If that is the case, though, I wonder if we should actually be adding it
> as a git-protocol header so that all transports can benefit (i.e., we
> could be localizing human-readable error messages in upload-pack,
> receive-pack, etc).

That would be very nice, thre is a lot of language mixing going on at 
the moment.

-- 
\\// Peter - http://www.softwolves.pp.se/

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] http: Add Accept-Language header if possible
       [not found]     ` <CAPig+cQ05pzU9uVBqS8tBHvB8_3qAtgsPYz1sGhpa0W1CVymqA@mail.gmail.com>
@ 2014-07-11 16:35       ` Yi, EungJun
  0 siblings, 0 replies; 58+ messages in thread
From: Yi, EungJun @ 2014-07-11 16:35 UTC (permalink / raw)
  To: Eric Sunshine, git

2014-07-12 1:24 GMT+09:00 Eric Sunshine <sunshine@sunshineco.com>:
> On Fri, Jul 11, 2014 at 5:22 AM, Yi, EungJun <semtlenori@gmail.com> wrote:
>> 2014-07-09 6:52 GMT+09:00 Eric Sunshine <sunshine@sunshineco.com>:
>>>> +       grep "^Accept-Language: ko; q=1.000, en; q=0.999, \*; q=0.001" actual
>>>
>>> Do you want to \-escape the periods? (Or maybe use 'grep -F'?)
>>
>> I just want to match '*' character. I tried 'grep -F' but it does not help.
>
> I meant that the periods in your grep pattern are matching any
> character. If you want to be very strict, so that they match only
> period, then you should \-escape them.

Oops, I misunderstood you. You are right. I'll \- escape them.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] http: Add Accept-Language header if possible
  2014-07-10 20:10         ` Jeff King
@ 2014-07-11  9:43           ` Yi, EungJun
  2014-07-12 10:11           ` Peter Krefting
  1 sibling, 0 replies; 58+ messages in thread
From: Yi, EungJun @ 2014-07-11  9:43 UTC (permalink / raw)
  To: Jeff King; +Cc: Peter Krefting, git

2014-07-11 5:10 GMT+09:00 Jeff King <peff@peff.net>:
> On Wed, Jul 09, 2014 at 11:46:14AM +0100, Peter Krefting wrote:
>
>> Jeff King:
>>
>> >I did some digging, and I think the public API is setlocale with a NULL
>> >parameter, like:
>> >
>> > printf("%s\n", setlocale(LC_MESSAGES, NULL));
>> >
>> >That still will end up like "en_US.UTF-8", though;
>>
>> And it only yields the highest-priority language, I think.
>
> I wasn't clear on whether POSIX locale variables actually supported
> multiple languages with priorities. I have never seen that, though the
> original commit message indicated that LANGUAGE=x:y was a thing (I
> wasn't sure if that was a made-up thing, or something that libc actually
> supported).
>
>> Debian's website has a nice writeup on the subject:
>> http://www.debian.org/intro/cn#howtoset
>
> That seems to be about language settings in browsers, which are a much
> richer set of preferences than POSIX locales (I think).
>
> It would not be wrong to have that level of configuration for git's http
> requests, but I do not know if it is worth the effort. Mapping the
> user's gettext locale into an accept-language header seems like a
> straightforward way to communicate to the other side what the client is
> using to show errors (so that errors coming from the server can match).

Thanks for you advice. I'll write a path to use both of
setlocale(LC_MESSAGES, NULL) and getenv("LANGUAGE") to get the user's
preferred language. setlocale(LC_MESSAGES, NULL) is quite nice way
because it takes LC_ALL, LC_MESSAGES and LANG into account, but not
LANGUAGE. I think we should take also LANGUAGE into account as gettext
does. [1]

[1]: http://www.gnu.org/software/gettext/manual/gettext.html#Locale-Environment-Variables

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] http: Add Accept-Language header if possible
  2014-07-09 10:40 ` Peter Krefting
@ 2014-07-11  9:05   ` Yi, EungJun
  0 siblings, 0 replies; 58+ messages in thread
From: Yi, EungJun @ 2014-07-11  9:05 UTC (permalink / raw)
  To: Peter Krefting; +Cc: git, Yi EungJun

2014-07-09 19:40 GMT+09:00 Peter Krefting <peter@softwolves.pp.se>:
> Yi EungJun:
>
>
>> Example:
>>  LANGUAGE= -> ""
>>  LANGUAGE=ko -> "Accept-Language: ko; q=1.000, *; q=0.001"
>>  LANGUAGE=ko:en -> "Accept-Language: ko; q=1.000, en; q=0.999, *; q=0.001"
>
>
> Avoid adding "q=1.000". It is redundant (the default for any unqualified
> language names is 1.0, and additionally there has historically been some
> buggy servers that failed if it was included.

Ok, I'll fix it.

>
>
>> +       p1 = getenv("LANGUAGE");
>
>
> You need a fallback mechanism here to parse all the possible language
> variables. I would use the first one I find of these:
>
>  1. LANGUAGE
>  2. LC_ALL
>  3. LC_MESSAGES
>  4. LANG
>
> Only "LANGUAGE" holds a colon-separated list, but the same code can parse
> all of them, just yielding a single entry for the others.

I'll use setlocale(LC_MESSAGES, NULL) as well as getenv("LANGUAGE").

>
>
>> +                               strbuf_add(buf, p1, p2 - p1);
>
>
> The tokens are on the form language_COUNTRY.encoding@identifier, whereas
> Accept-Language wants language-COUNTRY, so you need to a) replace "_" with
> "-", and b) chop off anything following a "." or "@".
>
>
>> +                               strbuf_addf(buf, "; q=%.3f", q);
>> +                               q -= 0.001;
>
>
> Three decimals seems a bit overkill, but some experimentation might be
> necessary.

I'll use three decimals only if there are 100 or more preferred languages.

>
>
>> +               strbuf_addstr(buf, "*; q=0.001\r\n");
>
>
> You should probably also add an explicit "en" here, if none was already
> included. I've seen some servers break horribly if "en" isn't included.

I'll send Accept-Language only if there is at least one preferred
language. Is it enough?

Thanks for your review.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] http: Add Accept-Language header if possible
  2014-07-09 10:46       ` Peter Krefting
@ 2014-07-10 20:10         ` Jeff King
  2014-07-11  9:43           ` Yi, EungJun
  2014-07-12 10:11           ` Peter Krefting
  0 siblings, 2 replies; 58+ messages in thread
From: Jeff King @ 2014-07-10 20:10 UTC (permalink / raw)
  To: Peter Krefting; +Cc: Yi, EungJun, git

On Wed, Jul 09, 2014 at 11:46:14AM +0100, Peter Krefting wrote:

> Jeff King:
> 
> >I did some digging, and I think the public API is setlocale with a NULL
> >parameter, like:
> >
> > printf("%s\n", setlocale(LC_MESSAGES, NULL));
> >
> >That still will end up like "en_US.UTF-8", though;
> 
> And it only yields the highest-priority language, I think.

I wasn't clear on whether POSIX locale variables actually supported
multiple languages with priorities. I have never seen that, though the
original commit message indicated that LANGUAGE=x:y was a thing (I
wasn't sure if that was a made-up thing, or something that libc actually
supported).

> Debian's website has a nice writeup on the subject:
> http://www.debian.org/intro/cn#howtoset

That seems to be about language settings in browsers, which are a much
richer set of preferences than POSIX locales (I think).

It would not be wrong to have that level of configuration for git's http
requests, but I do not know if it is worth the effort. Mapping the
user's gettext locale into an accept-language header seems like a
straightforward way to communicate to the other side what the client is
using to show errors (so that errors coming from the server can match).

If that is the case, though, I wonder if we should actually be adding it
as a git-protocol header so that all transports can benefit (i.e., we
could be localizing human-readable error messages in upload-pack,
receive-pack, etc).

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] http: Add Accept-Language header if possible
  2014-07-09  6:18     ` Jeff King
@ 2014-07-09 10:46       ` Peter Krefting
  2014-07-10 20:10         ` Jeff King
  0 siblings, 1 reply; 58+ messages in thread
From: Peter Krefting @ 2014-07-09 10:46 UTC (permalink / raw)
  To: Jeff King; +Cc: Yi, EungJun, git

Jeff King:

> I did some digging, and I think the public API is setlocale with a 
> NULL parameter, like:
>
>  printf("%s\n", setlocale(LC_MESSAGES, NULL));
>
> That still will end up like "en_US.UTF-8", though;

And it only yields the highest-priority language, I think.

> I couldn't find any standard functions for parsing that. It seems 
> like it would be pretty straightforward to do so, though.

RFC 5646 is the current specification on language tags, btw.

>  From my brief reading of rfc2616, that should probably become 
> "en-us", and any time we add "x-y", we may want to add "x" as a 
> fallback (that is certainly true for English; I don't know about 
> other languages with dialects).

Yes, adding the generic fallback is necessary, as "en-US" on the server 
matches a client's "en", but not vice versa. So if you request "en-US" 
and "de" and the server only has "en-GB" and "de", you'd get the 
"de" version.

Debian's website has a nice writeup on the subject: 
http://www.debian.org/intro/cn#howtoset

-- 
\\// Peter - http://www.softwolves.pp.se/

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] http: Add Accept-Language header if possible
  2014-07-08 15:54 [PATCH] http: Add Accept-Language header if possible Yi EungJun
  2014-07-08 21:52 ` Eric Sunshine
  2014-07-09  5:10 ` Jeff King
@ 2014-07-09 10:40 ` Peter Krefting
  2014-07-11  9:05   ` Yi, EungJun
  2 siblings, 1 reply; 58+ messages in thread
From: Peter Krefting @ 2014-07-09 10:40 UTC (permalink / raw)
  To: Yi EungJun; +Cc: git, Yi EungJun

Yi EungJun:

> Example:
>  LANGUAGE= -> ""
>  LANGUAGE=ko -> "Accept-Language: ko; q=1.000, *; q=0.001"
>  LANGUAGE=ko:en -> "Accept-Language: ko; q=1.000, en; q=0.999, *; q=0.001"

Avoid adding "q=1.000". It is redundant (the default for any 
unqualified language names is 1.0, and additionally there has 
historically been some buggy servers that failed if it was included.

> +	p1 = getenv("LANGUAGE");

You need a fallback mechanism here to parse all the possible language 
variables. I would use the first one I find of these:

  1. LANGUAGE
  2. LC_ALL
  3. LC_MESSAGES
  4. LANG

Only "LANGUAGE" holds a colon-separated list, but the same code can 
parse all of them, just yielding a single entry for the others.

> +				strbuf_add(buf, p1, p2 - p1);

The tokens are on the form language_COUNTRY.encoding@identifier, whereas 
Accept-Language wants language-COUNTRY, so you need to a) replace "_" 
with "-", and b) chop off anything following a "." or "@".

> +				strbuf_addf(buf, "; q=%.3f", q);
> +				q -= 0.001;

Three decimals seems a bit overkill, but some experimentation might be 
necessary.

> +		strbuf_addstr(buf, "*; q=0.001\r\n");

You should probably also add an explicit "en" here, if none was 
already included. I've seen some servers break horribly if "en" isn't 
included.



For reference, I have my LANGUAGE variable set to 
"sv_SE.utf8:sv:nb_NO.utf8:nb:da_DK.utf8:da:nn_NO.utf8:nn:en_GB.utf8:en_US.utf8:en"

-- 
\\// Peter - http://www.softwolves.pp.se/

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] http: Add Accept-Language header if possible
  2014-07-09  5:46   ` Yi, EungJun
@ 2014-07-09  6:18     ` Jeff King
  2014-07-09 10:46       ` Peter Krefting
  0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2014-07-09  6:18 UTC (permalink / raw)
  To: Yi, EungJun; +Cc: git

On Wed, Jul 09, 2014 at 02:46:35PM +0900, Yi, EungJun wrote:

> I agree with you. In fact, I tried to get user's preferred language in
> the same way as gettext. It has guess_category_value() to do that and
> the function is good enough because it considers $LANGUAGE, $LC_ALL,
> $LANG, and also system-dependent preferences. But the function does
> not seem a public API and I don't know how can I use the function in
> Git. So I chose to use $LANGUAGE only.

I did some digging, and I think the public API is setlocale with a NULL
parameter, like:

  printf("%s\n", setlocale(LC_MESSAGES, NULL));

That still will end up like "en_US.UTF-8", though; I couldn't find any
standard functions for parsing that. It seems like it would be pretty
straightforward to do so, though.

>From my brief reading of rfc2616, that should probably become "en-us",
and any time we add "x-y", we may want to add "x" as a fallback (that is
certainly true for English; I don't know about other languages with
dialects).

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] http: Add Accept-Language header if possible
  2014-07-09  5:10 ` Jeff King
@ 2014-07-09  5:46   ` Yi, EungJun
  2014-07-09  6:18     ` Jeff King
  0 siblings, 1 reply; 58+ messages in thread
From: Yi, EungJun @ 2014-07-09  5:46 UTC (permalink / raw)
  To: Jeff King; +Cc: git

2014-07-09 14:10 GMT+09:00 Jeff King <peff@peff.net>:
> On Wed, Jul 09, 2014 at 12:54:06AM +0900, Yi EungJun wrote:
>
>> From: Yi EungJun <eungjun.yi@navercorp.com>
>>
>> Add an Accept-Language header which indicates the user's preferred
>> languages defined by 'LANGUAGE' environment variable if the variable is
>> not empty.
>>
>> Example:
>>   LANGUAGE= -> ""
>>   LANGUAGE=ko -> "Accept-Language: ko; q=1.000, *; q=0.001"
>>   LANGUAGE=ko:en -> "Accept-Language: ko; q=1.000, en; q=0.999, *; q=0.001"
>>
>> This gives git servers a chance to display remote error messages in
>> the user's preferred language.
>
> Should this also take into account other language-related variables? I'd
> think $LC_ALL, $LC_MESSAGES, and $LANG would affect it, too.  Are
> colon-separated values a standard in $LANGUAGE? I have never seen them,
> but I admit I am not very knowledgeable about localization issues.
>
> Also, we do we need to do more parsing? My $LANG is set to en_US.UTF-8.
> The encoding part is presumably uninteresting to the remote server.  I
> also wonder if there are support functions in libc or as part of gettext
> that can help us get these values.
>
> -Peff

I agree with you. In fact, I tried to get user's preferred language in
the same way as gettext. It has guess_category_value() to do that and
the function is good enough because it considers $LANGUAGE, $LC_ALL,
$LANG, and also system-dependent preferences. But the function does
not seem a public API and I don't know how can I use the function in
Git. So I chose to use $LANGUAGE only.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] http: Add Accept-Language header if possible
  2014-07-08 15:54 [PATCH] http: Add Accept-Language header if possible Yi EungJun
  2014-07-08 21:52 ` Eric Sunshine
@ 2014-07-09  5:10 ` Jeff King
  2014-07-09  5:46   ` Yi, EungJun
  2014-07-09 10:40 ` Peter Krefting
  2 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2014-07-09  5:10 UTC (permalink / raw)
  To: Yi EungJun; +Cc: git, Yi EungJun

On Wed, Jul 09, 2014 at 12:54:06AM +0900, Yi EungJun wrote:

> From: Yi EungJun <eungjun.yi@navercorp.com>
> 
> Add an Accept-Language header which indicates the user's preferred
> languages defined by 'LANGUAGE' environment variable if the variable is
> not empty.
> 
> Example:
>   LANGUAGE= -> ""
>   LANGUAGE=ko -> "Accept-Language: ko; q=1.000, *; q=0.001"
>   LANGUAGE=ko:en -> "Accept-Language: ko; q=1.000, en; q=0.999, *; q=0.001"
> 
> This gives git servers a chance to display remote error messages in
> the user's preferred language.

Should this also take into account other language-related variables? I'd
think $LC_ALL, $LC_MESSAGES, and $LANG would affect it, too.  Are
colon-separated values a standard in $LANGUAGE? I have never seen them,
but I admit I am not very knowledgeable about localization issues.

Also, we do we need to do more parsing? My $LANG is set to en_US.UTF-8.
The encoding part is presumably uninteresting to the remote server.  I
also wonder if there are support functions in libc or as part of gettext
that can help us get these values.

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] http: Add Accept-Language header if possible
  2014-07-08 15:54 [PATCH] http: Add Accept-Language header if possible Yi EungJun
@ 2014-07-08 21:52 ` Eric Sunshine
       [not found]   ` <CAFT+Tg-6fR9OJ93TT7ww3x=zYHY=Dh5u-7owgQMBK5o_JKLEaA@mail.gmail.com>
  2014-07-09  5:10 ` Jeff King
  2014-07-09 10:40 ` Peter Krefting
  2 siblings, 1 reply; 58+ messages in thread
From: Eric Sunshine @ 2014-07-08 21:52 UTC (permalink / raw)
  To: Yi EungJun; +Cc: Git List, Yi EungJun

On Tue, Jul 8, 2014 at 11:54 AM, Yi EungJun <semtlenori@gmail.com> wrote:
> From: Yi EungJun <eungjun.yi@navercorp.com>
>
> Add an Accept-Language header which indicates the user's preferred
> languages defined by 'LANGUAGE' environment variable if the variable is
> not empty.
>
> Example:
>   LANGUAGE= -> ""
>   LANGUAGE=ko -> "Accept-Language: ko; q=1.000, *; q=0.001"
>   LANGUAGE=ko:en -> "Accept-Language: ko; q=1.000, en; q=0.999, *; q=0.001"
>
> This gives git servers a chance to display remote error messages in
> the user's preferred language.
> ---
>  http.c                     | 43 +++++++++++++++++++++++++++++++++++++++++++
>  t/t5550-http-fetch-dumb.sh | 10 ++++++++++
>  2 files changed, 53 insertions(+)
>
> diff --git a/http.c b/http.c
> index 3a28b21..c345616 100644
> --- a/http.c
> +++ b/http.c
> @@ -983,6 +983,47 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
>                 strbuf_addstr(charset, "ISO-8859-1");
>  }
>
> +/*
> + * Add an Accept-Language header which indicates user's preferred languages
> + * defined by 'LANGUAGE' environment variable if the variable is not empty.
> + *
> + * Example:
> + *   LANGUAGE= -> ""
> + *   LANGUAGE=ko -> "Accept-Language: ko; q=1.000, *; q=0.001"
> + *   LANGUAGE=ko:en -> "Accept-Language: ko; q=1.000, en; q=0.999, *; q=0.001"
> + */
> +static void add_accept_language(struct strbuf *buf)
> +{
> +       const char *p1, *p2;
> +       float q = 1.000;
> +
> +       p1 = getenv("LANGUAGE");
> +
> +       if (p1 != NULL && p1[0] != '\0') {
> +               strbuf_reset(buf);

It seems wrong to clear 'buf' in a function named add_accept_language().

> +               strbuf_addstr(buf, "Accept-Language: ");
> +               for (p2 = p1; q > 0.001; p2++) {
> +                       if ((*p2 == ':' || *p2 == '\0') && p1 != p2) {
> +                               if (q < 1.0) {
> +                                       strbuf_addstr(buf, ", ");
> +                               }
> +                               strbuf_add(buf, p1, p2 - p1);
> +                               strbuf_addf(buf, "; q=%.3f", q);
> +                               q -= 0.001;
> +                               p1 = p2 + 1;
> +
> +                               if (*p2 == '\0') {
> +                                       break;
> +                               }
> +                       }
> +               }
> +               if (q < 1.0) {
> +                       strbuf_addstr(buf, ", ");
> +               }
> +               strbuf_addstr(buf, "*; q=0.001\r\n");

Manually adding "\r\n" is contraindicated. Headers passed to
curl_easy_setopt(c, CURLOPT_HTTPHEADER, headers) must not have "\r\n",
since curl will add terminators itself [1].

[1]: http://curl.haxx.se/libcurl/c/CURLOPT_HTTPHEADER.html

> +       }
> +}
> +
>  /* http_request() targets */
>  #define HTTP_REQUEST_STRBUF    0
>  #define HTTP_REQUEST_FILE      1
> @@ -1020,6 +1061,8 @@ static int http_request(const char *url,
>                                          fwrite_buffer);
>         }
>
> +       add_accept_language(&buf);

This is inconsistent with how other headers are handled by this
function. The existing idiom is:

    strbuf_add(&buf, ...); /* construct header */
    headers = curl_slist_apend(headers, buf.buf);
    strbuf_reset(&buf);

> +
>         strbuf_addstr(&buf, "Pragma:");
>         if (options && options->no_cache)
>                 strbuf_addstr(&buf, " no-cache");
> diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
> index ac71418..ea15158 100755
> --- a/t/t5550-http-fetch-dumb.sh
> +++ b/t/t5550-http-fetch-dumb.sh
> @@ -196,5 +196,15 @@ test_expect_success 'reencoding is robust to whitespace oddities' '
>         grep "this is the error message" stderr
>  '
>
> +test_expect_success 'git client sends Accept-Language' '
> +       GIT_CURL_VERBOSE=1 LANGUAGE=ko:en git clone "$HTTPD_URL/accept/language" 2>actual

Broken &&-chain.

> +       grep "^Accept-Language: ko; q=1.000, en; q=0.999, \*; q=0.001" actual

Do you want to \-escape the periods? (Or maybe use 'grep -F'?)

> +'
> +
> +test_expect_success 'git client does not send Accept-Language' '
> +       GIT_CURL_VERBOSE=1 LANGUAGE= git clone "$HTTPD_URL/accept/language" 2>actual

Broken &&-chain.

> +       test_must_fail grep "^Accept-Language:" actual
> +'
> +
>  stop_httpd
>  test_done
> --
> 2.0.1.473.gafdefd9.dirty

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH] http: Add Accept-Language header if possible
@ 2014-07-08 15:54 Yi EungJun
  2014-07-08 21:52 ` Eric Sunshine
                   ` (2 more replies)
  0 siblings, 3 replies; 58+ messages in thread
From: Yi EungJun @ 2014-07-08 15:54 UTC (permalink / raw)
  To: git; +Cc: Yi EungJun

From: Yi EungJun <eungjun.yi@navercorp.com>

Add an Accept-Language header which indicates the user's preferred
languages defined by 'LANGUAGE' environment variable if the variable is
not empty.

Example:
  LANGUAGE= -> ""
  LANGUAGE=ko -> "Accept-Language: ko; q=1.000, *; q=0.001"
  LANGUAGE=ko:en -> "Accept-Language: ko; q=1.000, en; q=0.999, *; q=0.001"

This gives git servers a chance to display remote error messages in
the user's preferred language.
---
 http.c                     | 43 +++++++++++++++++++++++++++++++++++++++++++
 t/t5550-http-fetch-dumb.sh | 10 ++++++++++
 2 files changed, 53 insertions(+)

diff --git a/http.c b/http.c
index 3a28b21..c345616 100644
--- a/http.c
+++ b/http.c
@@ -983,6 +983,47 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
 		strbuf_addstr(charset, "ISO-8859-1");
 }
 
+/*
+ * Add an Accept-Language header which indicates user's preferred languages
+ * defined by 'LANGUAGE' environment variable if the variable is not empty.
+ *
+ * Example:
+ *   LANGUAGE= -> ""
+ *   LANGUAGE=ko -> "Accept-Language: ko; q=1.000, *; q=0.001"
+ *   LANGUAGE=ko:en -> "Accept-Language: ko; q=1.000, en; q=0.999, *; q=0.001"
+ */
+static void add_accept_language(struct strbuf *buf)
+{
+	const char *p1, *p2;
+	float q = 1.000;
+
+	p1 = getenv("LANGUAGE");
+
+	if (p1 != NULL && p1[0] != '\0') {
+		strbuf_reset(buf);
+		strbuf_addstr(buf, "Accept-Language: ");
+		for (p2 = p1; q > 0.001; p2++) {
+			if ((*p2 == ':' || *p2 == '\0') && p1 != p2) {
+				if (q < 1.0) {
+					strbuf_addstr(buf, ", ");
+				}
+				strbuf_add(buf, p1, p2 - p1);
+				strbuf_addf(buf, "; q=%.3f", q);
+				q -= 0.001;
+				p1 = p2 + 1;
+
+				if (*p2 == '\0') {
+					break;
+				}
+			}
+		}
+		if (q < 1.0) {
+			strbuf_addstr(buf, ", ");
+		}
+		strbuf_addstr(buf, "*; q=0.001\r\n");
+	}
+}
+
 /* http_request() targets */
 #define HTTP_REQUEST_STRBUF	0
 #define HTTP_REQUEST_FILE	1
@@ -1020,6 +1061,8 @@ static int http_request(const char *url,
 					 fwrite_buffer);
 	}
 
+	add_accept_language(&buf);
+
 	strbuf_addstr(&buf, "Pragma:");
 	if (options && options->no_cache)
 		strbuf_addstr(&buf, " no-cache");
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index ac71418..ea15158 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -196,5 +196,15 @@ test_expect_success 'reencoding is robust to whitespace oddities' '
 	grep "this is the error message" stderr
 '
 
+test_expect_success 'git client sends Accept-Language' '
+	GIT_CURL_VERBOSE=1 LANGUAGE=ko:en git clone "$HTTPD_URL/accept/language" 2>actual
+	grep "^Accept-Language: ko; q=1.000, en; q=0.999, \*; q=0.001" actual
+'
+
+test_expect_success 'git client does not send Accept-Language' '
+	GIT_CURL_VERBOSE=1 LANGUAGE= git clone "$HTTPD_URL/accept/language" 2>actual
+	test_must_fail grep "^Accept-Language:" actual
+'
+
 stop_httpd
 test_done
-- 
2.0.1.473.gafdefd9.dirty

^ permalink raw reply related	[flat|nested] 58+ messages in thread

end of thread, other threads:[~2015-03-06 19:01 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-19 17:58 [PATCH v4 0/1] http: Add Accept-Language header if possible Yi EungJun
2014-07-19 17:58 ` [PATCH v4 1/1] " Yi EungJun
2014-07-21 19:01   ` Junio C Hamano
2014-08-03  7:35     ` Yi, EungJun
2014-12-02 12:12 ` [PATCH v5 0/1] " Yi EungJun
2014-12-02 12:12   ` [PATCH v5 1/1] " Yi EungJun
2014-12-03 18:22     ` Junio C Hamano
2014-12-03 19:31     ` Eric Sunshine
2014-12-03 21:37       ` Junio C Hamano
2014-12-03 22:00         ` Michael Blume
2014-12-03 22:06           ` Michael Blume
2014-12-22 16:44             ` [PATCH v6 0/1] " Yi EungJun
2014-12-22 16:44               ` [PATCH v6 1/1] " Yi EungJun
2014-12-22 19:34                 ` Junio C Hamano
2014-12-24 20:35                 ` Eric Sunshine
2014-12-29 16:18                   ` Junio C Hamano
2015-01-18 12:23                     ` [PATCH v7 0/1] " Yi EungJun
2015-01-18 12:26                       ` [PATCH v7 1/1] " Yi EungJun
2015-01-18 15:14                         ` Torsten Bögershausen
2015-01-19 20:21                         ` [PATCH v6 0/1] " Eric Sunshine
2015-01-22  7:54                         ` [PATCH v7 1/1] " Junio C Hamano
2015-01-27 15:51                           ` [PATCH v8 0/1] " Yi EungJun
2015-01-27 15:51                             ` [PATCH] " Yi EungJun
2015-01-27 23:34                               ` Junio C Hamano
2015-01-28  6:15                                 ` Junio C Hamano
2015-01-28 11:59                                   ` Yi, EungJun
2015-01-28 12:04                                   ` [PATCH v9 0/1] " Yi EungJun
2015-01-28 12:04                                     ` [PATCH v9 1/1] " Yi EungJun
2015-02-25 22:52                                       ` Junio C Hamano
2015-02-26  3:04                                         ` Jeff King
2015-02-26  3:10                                           ` Jeff King
2015-02-26 20:59                                           ` Junio C Hamano
2015-02-26 21:33                                             ` Jeff King
2015-02-26 21:42                                               ` Junio C Hamano
2015-02-26 21:47                                                 ` Stefan Beller
2015-02-26 22:06                                                   ` Jeff King
2015-02-26 22:07                                                     ` Jeff King
2015-02-26 22:26                                                       ` Stefan Beller
2015-02-26 22:36                                                         ` Jeff King
2015-02-26 22:45                                                           ` Jeff King
2015-02-26 23:29                                                             ` Junio C Hamano
2015-02-26 22:13                                                     ` Junio C Hamano
2015-01-29  6:19                                     ` [PATCH v9 0/1] " Junio C Hamano
2015-01-30 17:23                                       ` Yi, EungJun
2015-03-06 16:13 ` [PATCH] http: Include locale.h when using setlocale() Ævar Arnfjörð Bjarmason
2015-03-06 19:01   ` Junio C Hamano
  -- strict thread matches above, loose matches on Subject: below --
2014-07-08 15:54 [PATCH] http: Add Accept-Language header if possible Yi EungJun
2014-07-08 21:52 ` Eric Sunshine
     [not found]   ` <CAFT+Tg-6fR9OJ93TT7ww3x=zYHY=Dh5u-7owgQMBK5o_JKLEaA@mail.gmail.com>
     [not found]     ` <CAPig+cQ05pzU9uVBqS8tBHvB8_3qAtgsPYz1sGhpa0W1CVymqA@mail.gmail.com>
2014-07-11 16:35       ` Yi, EungJun
2014-07-09  5:10 ` Jeff King
2014-07-09  5:46   ` Yi, EungJun
2014-07-09  6:18     ` Jeff King
2014-07-09 10:46       ` Peter Krefting
2014-07-10 20:10         ` Jeff King
2014-07-11  9:43           ` Yi, EungJun
2014-07-12 10:11           ` Peter Krefting
2014-07-09 10:40 ` Peter Krefting
2014-07-11  9:05   ` Yi, EungJun

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.