* [PATCH v4 0/1] http: Add Accept-Language header if possible @ 2014-07-19 17:58 Yi EungJun 2014-07-19 17:58 ` [PATCH v4 1/1] " Yi EungJun ` (2 more replies) 0 siblings, 3 replies; 58+ messages in thread From: Yi EungJun @ 2014-07-19 17:58 UTC (permalink / raw) To: Git List Cc: Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting, Junio C Hamano Changes since v3: * Fix styles and syntax. (Thanks to Jeff King and Eric Sunshine) * Cache Accept-Language header. (Thanks to Jeff King) * Remove floating point numbers. (Thanks to Junio C Hamano) * Make the for-loop to get the value of the header simpler. * Add more comments. Yi EungJun (1): http: Add Accept-Language header if possible http.c | 134 +++++++++++++++++++++++++++++++++++++++++++++ remote-curl.c | 2 + t/t5550-http-fetch-dumb.sh | 31 +++++++++++ 3 files changed, 167 insertions(+) -- 2.0.1.473.g731ddce.dirty ^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v4 1/1] http: Add Accept-Language header if possible 2014-07-19 17:58 [PATCH v4 0/1] http: Add Accept-Language header if possible Yi EungJun @ 2014-07-19 17:58 ` Yi EungJun 2014-07-21 19:01 ` Junio C Hamano 2014-12-02 12:12 ` [PATCH v5 0/1] " Yi EungJun 2015-03-06 16:13 ` [PATCH] http: Include locale.h when using setlocale() Ævar Arnfjörð Bjarmason 2 siblings, 1 reply; 58+ messages in thread From: Yi EungJun @ 2014-07-19 17:58 UTC (permalink / raw) To: Git List Cc: Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting, Junio C Hamano From: Yi EungJun <eungjun.yi@navercorp.com> Add an Accept-Language header which indicates the user's preferred languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG. Examples: LANGUAGE= -> "" LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" This gives git servers a chance to display remote error messages in the user's preferred language. Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com> --- http.c | 134 +++++++++++++++++++++++++++++++++++++++++++++ remote-curl.c | 2 + t/t5550-http-fetch-dumb.sh | 31 +++++++++++ 3 files changed, 167 insertions(+) diff --git a/http.c b/http.c index 3a28b21..ed4e8e1 100644 --- a/http.c +++ b/http.c @@ -67,6 +67,8 @@ static struct curl_slist *no_pragma_header; static struct active_request_slot *active_queue_head; +static struct strbuf *cached_accept_language = NULL; + size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_) { size_t size = eltsize * nmemb; @@ -512,6 +514,9 @@ void http_cleanup(void) cert_auth.password = NULL; } ssl_cert_password_required = 0; + + if (cached_accept_language) + strbuf_release(cached_accept_language); } struct active_request_slot *get_active_slot(void) @@ -983,6 +988,129 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, strbuf_addstr(charset, "ISO-8859-1"); } +/* + * Guess the user's preferred languages from the value in LANGUAGE environment + * variable and LC_MESSAGES locale category. + * + * The result can be a colon-separated list like "ko:ja:en". + */ +static const char *get_preferred_languages(void) +{ + const char *retval; + + retval = getenv("LANGUAGE"); + if (retval && *retval) + return retval; + + retval = setlocale(LC_MESSAGES, NULL); + if (retval && *retval && + strcmp(retval, "C") && + strcmp(retval, "POSIX")) + return retval; + + return NULL; +} + +/* + * Get an Accept-Language header which indicates user's preferred languages. + * + * Examples: + * LANGUAGE= -> "" + * LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" + * LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1" + * LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" + * LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" + * LANGUAGE= LANG=C -> "" + */ +static struct strbuf *get_accept_language(void) +{ + const char *lang_begin, *pos; + int q, max_q; + int num_langs; + int decimal_places; + int is_codeset_or_modifier = 0; + static struct strbuf buf = STRBUF_INIT; + struct strbuf q_format_buf = STRBUF_INIT; + char *q_format; + + if (cached_accept_language) + return cached_accept_language; + + lang_begin = get_preferred_languages(); + + /* Don't add Accept-Language header if no language is preferred. */ + if (!(lang_begin && *lang_begin)) { + cached_accept_language = &buf; + return cached_accept_language; + } + + /* Count number of preferred lang_begin to decide precision of q-factor */ + for (num_langs = 1, pos = lang_begin; *pos; pos++) + if (*pos == ':') + num_langs++; + + /* Decide the precision for q-factor on number of preferred lang_begin. */ + num_langs += 1; /* for '*' */ + decimal_places = 1 + (num_langs > 10) + (num_langs > 100); + strbuf_addf(&q_format_buf, "; q=0.%%0%dd", decimal_places); + q_format = strbuf_detach(&q_format_buf, NULL); + for (max_q = 1; decimal_places-- > 0;) max_q *= 10; + q = max_q; + + strbuf_addstr(&buf, "Accept-Language: "); + + /* + * Convert a list of colon-separated locale values [1][2] to a list of + * comma-separated language tags [3] which can be used as a value of + * Accept-Language header. + * + * [1]: http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html + * [2]: http://www.gnu.org/software/libc/manual/html_node/Using-gettextized-software.html + * [3]: http://tools.ietf.org/html/rfc7231#section-5.3.5 + */ + for (pos = lang_begin; ; pos++) { + if (*pos == ':' || !*pos) { + /* Ignore if this character is the first one. */ + if (pos == lang_begin) + continue; + + is_codeset_or_modifier = 0; + + /* Put a q-factor only if it is less than 1.0. */ + if (q < max_q) + strbuf_addf(&buf, q_format, q); + + if (q > 1) + q--; + + /* NULL pos means this is the last language. */ + if (*pos) + strbuf_addstr(&buf, ", "); + else + break; + + } else if (is_codeset_or_modifier) + continue; + else if (*pos == '.' || *pos == '@') /* Remove .codeset and @modifier. */ + is_codeset_or_modifier = 1; + else + strbuf_addch(&buf, *pos == '_' ? '-' : *pos); + } + + /* Don't add Accept-Language header if no language is preferred. */ + if (q >= max_q) { + cached_accept_language = &buf; + return cached_accept_language; + } + + /* Add '*' with minimum q-factor greater than 0.0. */ + strbuf_addstr(&buf, ", *"); + strbuf_addf(&buf, q_format, 1); + + cached_accept_language = &buf; + return cached_accept_language; +} + /* http_request() targets */ #define HTTP_REQUEST_STRBUF 0 #define HTTP_REQUEST_FILE 1 @@ -995,6 +1123,7 @@ static int http_request(const char *url, struct slot_results results; struct curl_slist *headers = NULL; struct strbuf buf = STRBUF_INIT; + struct strbuf* accept_language; int ret; slot = get_active_slot(); @@ -1020,6 +1149,11 @@ static int http_request(const char *url, fwrite_buffer); } + accept_language = get_accept_language(); + + if (accept_language && accept_language->len > 0) + headers = curl_slist_append(headers, accept_language->buf); + strbuf_addstr(&buf, "Pragma:"); if (options && options->no_cache) strbuf_addstr(&buf, " no-cache"); diff --git a/remote-curl.c b/remote-curl.c index 4493b38..07f2a5d 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -946,6 +946,8 @@ int main(int argc, const char **argv) struct strbuf buf = STRBUF_INIT; int nongit; + git_setup_gettext(); + git_extract_argv0_path(argv[0]); setup_git_directory_gently(&nongit); if (argc < 2) { diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh index ac71418..d2dac44 100755 --- a/t/t5550-http-fetch-dumb.sh +++ b/t/t5550-http-fetch-dumb.sh @@ -196,5 +196,36 @@ test_expect_success 'reencoding is robust to whitespace oddities' ' grep "this is the error message" stderr ' +check_language () { + echo "Accept-Language: $1\r" >expect && + test_must_fail env \ + GIT_CURL_VERBOSE=1 \ + LANGUAGE=$2 \ + LC_ALL=$3 \ + LC_MESSAGES=$4 \ + LANG=$5 \ + git clone "$HTTPD_URL/accept/language" 2>stderr && + grep -i ^Accept-Language: stderr >actual && + test_cmp expect actual +} + +test_expect_success 'git client sends Accept-Language based on LANGUAGE, LC_ALL, LC_MESSAGES and LANG' ' + check_language "ko-KR, *; q=0.1" ko_KR.UTF-8 de_DE.UTF-8 ja_JP.UTF-8 en_US.UTF-8 && + check_language "de-DE, *; q=0.1" "" de_DE.UTF-8 ja_JP.UTF-8 en_US.UTF-8 && + check_language "ja-JP, *; q=0.1" "" "" ja_JP.UTF-8 en_US.UTF-8 && + check_language "en-US, *; q=0.1" "" "" "" en_US.UTF-8 +' + +test_expect_success 'git client sends Accept-Language with many preferred languages' ' + check_language "ko-KR, en-US; q=0.99, fr-CA; q=0.98, de; q=0.97, sr; q=0.96, \ +ja; q=0.95, zh; q=0.94, sv; q=0.93, pt; q=0.92, nb; q=0.91, *; q=0.01" \ + ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb +' + +test_expect_success 'git client does not send Accept-Language' ' + test_must_fail env GIT_CURL_VERBOSE=1 LANGUAGE= git clone "$HTTPD_URL/accept/language" 2>stderr && + ! grep "^Accept-Language:" stderr +' + stop_httpd test_done -- 2.0.1.473.g731ddce.dirty ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [PATCH v4 1/1] http: Add Accept-Language header if possible 2014-07-19 17:58 ` [PATCH v4 1/1] " Yi EungJun @ 2014-07-21 19:01 ` Junio C Hamano 2014-08-03 7:35 ` Yi, EungJun 0 siblings, 1 reply; 58+ messages in thread From: Junio C Hamano @ 2014-07-21 19:01 UTC (permalink / raw) To: Yi EungJun; +Cc: Git List, Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting Yi EungJun <semtlenori@gmail.com> writes: > From: Yi EungJun <eungjun.yi@navercorp.com> > > Add an Accept-Language header which indicates the user's preferred > languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG. > > Examples: > LANGUAGE= -> "" > LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" > LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" > LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" > > This gives git servers a chance to display remote error messages in > the user's preferred language. > > Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com> > --- > http.c | 134 +++++++++++++++++++++++++++++++++++++++++++++ > remote-curl.c | 2 + > t/t5550-http-fetch-dumb.sh | 31 +++++++++++ > 3 files changed, 167 insertions(+) > > diff --git a/http.c b/http.c > index 3a28b21..ed4e8e1 100644 > --- a/http.c > +++ b/http.c > @@ -67,6 +67,8 @@ static struct curl_slist *no_pragma_header; > > static struct active_request_slot *active_queue_head; > > +static struct strbuf *cached_accept_language = NULL; Please drop " = NULL" that is unnecessary for BSS. > @@ -512,6 +514,9 @@ void http_cleanup(void) > cert_auth.password = NULL; > } > ssl_cert_password_required = 0; > + > + if (cached_accept_language) > + strbuf_release(cached_accept_language); > } > @@ -983,6 +988,129 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, > strbuf_addstr(charset, "ISO-8859-1"); > } > > +/* > + * Guess the user's preferred languages from the value in LANGUAGE environment > + * variable and LC_MESSAGES locale category. > + * > + * The result can be a colon-separated list like "ko:ja:en". > + */ > +static const char *get_preferred_languages(void) > +{ > + const char *retval; > + > + retval = getenv("LANGUAGE"); > + if (retval && *retval) > + return retval; > + > + retval = setlocale(LC_MESSAGES, NULL); > + if (retval && *retval && > + strcmp(retval, "C") && > + strcmp(retval, "POSIX")) > + return retval; > + > + return NULL; > +} > + > +/* > + * Get an Accept-Language header which indicates user's preferred languages. > + * > + * Examples: > + * LANGUAGE= -> "" > + * LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" > + * LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1" > + * LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" > + * LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" > + * LANGUAGE= LANG=C -> "" > + */ > +static struct strbuf *get_accept_language(void) > +{ > + const char *lang_begin, *pos; > + int q, max_q; > + int num_langs; > + int decimal_places; > + int is_codeset_or_modifier = 0; > + static struct strbuf buf = STRBUF_INIT; > + struct strbuf q_format_buf = STRBUF_INIT; > + char *q_format; > + > + if (cached_accept_language) > + return cached_accept_language; > + > + lang_begin = get_preferred_languages(); > + > + /* Don't add Accept-Language header if no language is preferred. */ > + if (!(lang_begin && *lang_begin)) { It is not wrong per-se, but given how hard get_preferred_languages() tries not to return a pointer to an empty string, this seems a bit overly defensive to me. > + cached_accept_language = &buf; > + return cached_accept_language; It is somewhat unconventional to have a static pointer outside to point at a singleton and then have a singleton actually as a static structure. I would have done without "buf" in this function and instead started this function like so: if (cached_accept_language) return cached_accept_language; cached_accept_language = xmalloc(sizeof(struct strbuf)); strbuf_init(cached_accept_language, 0); lang_begin = get_preferred_languages(); if (!lang_begin) return cached_accept_language; > + } > + > + /* Count number of preferred lang_begin to decide precision of q-factor */ > + for (num_langs = 1, pos = lang_begin; *pos; pos++) > + if (*pos == ':') > + num_langs++; > + > + /* Decide the precision for q-factor on number of preferred lang_begin. */ > + num_langs += 1; /* for '*' */ > + decimal_places = 1 + (num_langs > 10) + (num_langs > 100); What if you got 60000 languages ;-)? I do not think we want to bend backwards and make the code list all 60000 of them, assigning a unique and decreasing q value to each of them, forming an overlong Accept-Language header, but at the same time, I do not think we want to show nonsense output because we compute the precision incorrectly here. > + strbuf_addf(&q_format_buf, "; q=0.%%0%dd", decimal_places); > + q_format = strbuf_detach(&q_format_buf, NULL); q_format_buf is an overkill use of strbuf, isn't it? Just char q_format_buf[32]; sprintf(q_format_buf, ";q=0.%%0%d", decimal_places); or something should be more than sufficient, no? > + for (max_q = 1; decimal_places-- > 0;) max_q *= 10; As you have to do one loop like this that amounts to computing log10 of num_langs, why not compute decimal_places the same way while at it? It may also make sense to cap the number of languages to avoid spitting out overly long Accept-Language header with practicaly useless list of many languages. That is, something along the lines of ... (note that I may very well have off-by-one or off-by-ten errors here you may need to tweak to get right): if (MAX_LANGS < num_langs) num_langs = MAX_LANGS; for (max_q = 1, decimal_places = 1; max_q < num_langs; decimal_places++, max_q *= 10) ; If you are to use the MAX_LANGS cap, the main loop would also need to pay attention to it by breaking out of the loop early before you reach the end of the string, of course. > + q = max_q; > + > + strbuf_addstr(&buf, "Accept-Language: "); > + > + /* > + * Convert a list of colon-separated locale values [1][2] to a list of > + * comma-separated language tags [3] which can be used as a value of > + * Accept-Language header. > + * > + * [1]: http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html > + * [2]: http://www.gnu.org/software/libc/manual/html_node/Using-gettextized-software.html > + * [3]: http://tools.ietf.org/html/rfc7231#section-5.3.5 > + */ > + for (pos = lang_begin; ; pos++) { > + if (*pos == ':' || !*pos) { > + /* Ignore if this character is the first one. */ > + if (pos == lang_begin) > + continue; By doing this "ignore empty" here, but not doing the same when you count num_langs, are you potentially miscounting num_langs? > + is_codeset_or_modifier = 0; > + > + /* Put a q-factor only if it is less than 1.0. */ > + if (q < max_q) ... is it the same thing as "do not do this for the first round, but do so for all the other round"? > + strbuf_addf(&buf, q_format, q); > + > + if (q > 1) Hmm, I am puzzled. C this ever be an issue (unless you have off-by-one error or you add "cap num_langs to MAX_LANGS", that is)? > + q--; > + /* NULL pos means this is the last language. */ > + if (*pos) > + strbuf_addstr(&buf, ", "); > + else > + break; > + > + } else if (is_codeset_or_modifier) > + continue; > + else if (*pos == '.' || *pos == '@') /* Remove .codeset and @modifier. */ > + is_codeset_or_modifier = 1; > + else > + strbuf_addch(&buf, *pos == '_' ? '-' : *pos); > + } > + > + /* Don't add Accept-Language header if no language is preferred. */ > + if (q >= max_q) { Can q go over max_q, or is it "q may be max_q"? In other words, is this essentially saying "if we did not find any language in the preferred languages list"? > + cached_accept_language = &buf; > + return cached_accept_language; > + } > + > + /* Add '*' with minimum q-factor greater than 0.0. */ > + strbuf_addstr(&buf, ", *"); > + strbuf_addf(&buf, q_format, 1); > + > + cached_accept_language = &buf; > + return cached_accept_language; > +} > + > /* http_request() targets */ > #define HTTP_REQUEST_STRBUF 0 > #define HTTP_REQUEST_FILE 1 > @@ -995,6 +1123,7 @@ static int http_request(const char *url, > struct slot_results results; > struct curl_slist *headers = NULL; > struct strbuf buf = STRBUF_INIT; > + struct strbuf* accept_language; As we write in C, not C++, our asterisks stick to the variable, not the type. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v4 1/1] http: Add Accept-Language header if possible 2014-07-21 19:01 ` Junio C Hamano @ 2014-08-03 7:35 ` Yi, EungJun 0 siblings, 0 replies; 58+ messages in thread From: Yi, EungJun @ 2014-08-03 7:35 UTC (permalink / raw) To: Junio C Hamano Cc: Git List, Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting Thanks very much for your detailed review and sorry for late reply. 2014-07-22 4:01 GMT+09:00 Junio C Hamano <gitster@pobox.com>: > Yi EungJun <semtlenori@gmail.com> writes: > >> From: Yi EungJun <eungjun.yi@navercorp.com> >> >> Add an Accept-Language header which indicates the user's preferred >> languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG. >> >> Examples: >> LANGUAGE= -> "" >> LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" >> LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" >> LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" >> >> This gives git servers a chance to display remote error messages in >> the user's preferred language. >> >> Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com> >> --- >> http.c | 134 +++++++++++++++++++++++++++++++++++++++++++++ >> remote-curl.c | 2 + >> t/t5550-http-fetch-dumb.sh | 31 +++++++++++ >> 3 files changed, 167 insertions(+) >> >> diff --git a/http.c b/http.c >> index 3a28b21..ed4e8e1 100644 >> --- a/http.c >> +++ b/http.c >> @@ -67,6 +67,8 @@ static struct curl_slist *no_pragma_header; >> >> static struct active_request_slot *active_queue_head; >> >> +static struct strbuf *cached_accept_language = NULL; > > Please drop " = NULL" that is unnecessary for BSS. Thanks, I'll fix it. > >> @@ -512,6 +514,9 @@ void http_cleanup(void) >> cert_auth.password = NULL; >> } >> ssl_cert_password_required = 0; >> + >> + if (cached_accept_language) >> + strbuf_release(cached_accept_language); >> } > > > >> @@ -983,6 +988,129 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, >> strbuf_addstr(charset, "ISO-8859-1"); >> } >> >> +/* >> + * Guess the user's preferred languages from the value in LANGUAGE environment >> + * variable and LC_MESSAGES locale category. >> + * >> + * The result can be a colon-separated list like "ko:ja:en". >> + */ >> +static const char *get_preferred_languages(void) >> +{ >> + const char *retval; >> + >> + retval = getenv("LANGUAGE"); >> + if (retval && *retval) >> + return retval; >> + >> + retval = setlocale(LC_MESSAGES, NULL); >> + if (retval && *retval && >> + strcmp(retval, "C") && >> + strcmp(retval, "POSIX")) >> + return retval; >> + >> + return NULL; >> +} >> + >> +/* >> + * Get an Accept-Language header which indicates user's preferred languages. >> + * >> + * Examples: >> + * LANGUAGE= -> "" >> + * LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" >> + * LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1" >> + * LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" >> + * LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" >> + * LANGUAGE= LANG=C -> "" >> + */ >> +static struct strbuf *get_accept_language(void) >> +{ >> + const char *lang_begin, *pos; >> + int q, max_q; >> + int num_langs; >> + int decimal_places; >> + int is_codeset_or_modifier = 0; >> + static struct strbuf buf = STRBUF_INIT; >> + struct strbuf q_format_buf = STRBUF_INIT; >> + char *q_format; >> + >> + if (cached_accept_language) >> + return cached_accept_language; >> + >> + lang_begin = get_preferred_languages(); >> + >> + /* Don't add Accept-Language header if no language is preferred. */ >> + if (!(lang_begin && *lang_begin)) { > > It is not wrong per-se, but given how hard get_preferred_languages() > tries not to return a pointer to an empty string, this seems a bit > overly defensive to me. Thanks, I'll fix it. > >> + cached_accept_language = &buf; >> + return cached_accept_language; > > It is somewhat unconventional to have a static pointer outside to > point at a singleton and then have a singleton actually as a static > structure. I would have done without "buf" in this function and > instead started this function like so: > > if (cached_accept_language) > return cached_accept_language; > > cached_accept_language = xmalloc(sizeof(struct strbuf)); > strbuf_init(cached_accept_language, 0); > lang_begin = get_preferred_languages(); > if (!lang_begin) > return cached_accept_language; > Thanks, I'll fix it as you suggest. >> + } >> + >> + /* Count number of preferred lang_begin to decide precision of q-factor */ >> + for (num_langs = 1, pos = lang_begin; *pos; pos++) >> + if (*pos == ':') >> + num_langs++; >> + >> + /* Decide the precision for q-factor on number of preferred lang_begin. */ >> + num_langs += 1; /* for '*' */ > > >> + decimal_places = 1 + (num_langs > 10) + (num_langs > 100); > > What if you got 60000 languages ;-)? I do not think we want to bend > backwards and make the code list all 60000 of them, assigning a > unique and decreasing q value to each of them, forming an overlong > Accept-Language header, but at the same time, I do not think we want > to show nonsense output because we compute the precision incorrectly > here. In that case, less-preferred 59,001 languages will have same q-value of 0.001. Unfortunately, there is no way to represent user's preferences of over 1,000 languages since the HTTP specification restricts the range of q-value from 0.001 to 1.000 [1]. I think this code reflects user's preferences better than Google chrome whose minimum q-value is 0.2 and Mozilla firefox whose minimum q-value is 0.01 [2]. > >> + strbuf_addf(&q_format_buf, "; q=0.%%0%dd", decimal_places); >> + q_format = strbuf_detach(&q_format_buf, NULL); > > q_format_buf is an overkill use of strbuf, isn't it? Just > > char q_format_buf[32]; > sprintf(q_format_buf, ";q=0.%%0%d", decimal_places); > > or something should be more than sufficient, no? Thanks, I'll fix it as you suggest. > >> + for (max_q = 1; decimal_places-- > 0;) max_q *= 10; > > As you have to do one loop like this that amounts to computing log10 > of num_langs, why not compute decimal_places the same way while at > it? It may also make sense to cap the number of languages to avoid > spitting out overly long Accept-Language header with practicaly > useless list of many languages. That is, something along the lines > of ... (note that I may very well have off-by-one or off-by-ten > errors here you may need to tweak to get right): > > if (MAX_LANGS < num_langs) > num_langs = MAX_LANGS; > for (max_q = 1, decimal_places = 1; > max_q < num_langs; > decimal_places++, max_q *= 10) > ; > > If you are to use the MAX_LANGS cap, the main loop would also need > to pay attention to it by breaking out of the loop early before you > reach the end of the string, of course. Thanks for good point. As you said, we should limit the length of the value of Accept-Language header because some HTTP servers respond 4xx Client Error if any header's value is very long (4KB or more) [3]. But MAX_LANGS may not be enough because the header can be too long even if the number of languages does not exceed MAX_LANGS if language tag is too long. Many of language tags are 2-3 characters but some are 11 characters; Even it is possible that a user has a very long custom language tag. I think this problem can be solved by one of these solutions: A. Set MAX_LANGS conservatively (100 or less). B. Limit the length of Accept-Language header directly (4KB or less). C. Negotiate with server; Resend a request with shorter Accept-Language header if the server responds 4xx error. > >> + q = max_q; >> + >> + strbuf_addstr(&buf, "Accept-Language: "); >> + >> + /* >> + * Convert a list of colon-separated locale values [1][2] to a list of >> + * comma-separated language tags [3] which can be used as a value of >> + * Accept-Language header. >> + * >> + * [1]: http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html >> + * [2]: http://www.gnu.org/software/libc/manual/html_node/Using-gettextized-software.html >> + * [3]: http://tools.ietf.org/html/rfc7231#section-5.3.5 >> + */ >> + for (pos = lang_begin; ; pos++) { >> + if (*pos == ':' || !*pos) { >> + /* Ignore if this character is the first one. */ >> + if (pos == lang_begin) >> + continue; > > By doing this "ignore empty" here, but not doing the same when you > count num_langs, are you potentially miscounting num_langs? Yes, num_langs can be larger than actual number of languages. For example, if LANGUAGE="en:" num_langs is 2. I wanted to make the logic to compute num_langs as simple as possible and thought num_langs does not need to be very accurate because it is used only to compute the precision of q-factor (max_q and decimal_places). Do we need to compute num_langs accurately? > >> + is_codeset_or_modifier = 0; >> + >> + /* Put a q-factor only if it is less than 1.0. */ >> + if (q < max_q) > > ... is it the same thing as "do not do this for the first round, but > do so for all the other round"? Yes, q-factor is q/max_q and q-factor of 1.0 is not necessary because: > if no "q" parameter is present, the default weight is 1. > -- http://tools.ietf.org/html/rfc7231#section-5.3.1 > >> + strbuf_addf(&buf, q_format, q); >> + >> + if (q > 1) > > Hmm, I am puzzled. C this ever be an issue (unless you have > off-by-one error or you add "cap num_langs to MAX_LANGS", that is)? Yes, it may be an issue if the number of languages is larger than the max_q. max_q must not be larger than 1000 but the number of languages may be. But this if-statement will be not necessary if we limit the number of languages to 1,000 or less (by MAX_LANGS you suggest). > >> + q--; > >> + /* NULL pos means this is the last language. */ >> + if (*pos) >> + strbuf_addstr(&buf, ", "); >> + else >> + break; >> + >> + } else if (is_codeset_or_modifier) >> + continue; >> + else if (*pos == '.' || *pos == '@') /* Remove .codeset and @modifier. */ >> + is_codeset_or_modifier = 1; >> + else >> + strbuf_addch(&buf, *pos == '_' ? '-' : *pos); >> + } >> + >> + /* Don't add Accept-Language header if no language is preferred. */ >> + if (q >= max_q) { > > Can q go over max_q, or is it "q may be max_q"? In other words, is > this essentially saying "if we did not find any language in the > preferred languages list"? Yes. q may be max_q if we did not find any language in the preferred languages list. But there is no chance that q goes over max_q. But it will be not necessary if num_langs is computed accurately. > >> + cached_accept_language = &buf; >> + return cached_accept_language; >> + } >> + >> + /* Add '*' with minimum q-factor greater than 0.0. */ >> + strbuf_addstr(&buf, ", *"); >> + strbuf_addf(&buf, q_format, 1); >> + >> + cached_accept_language = &buf; >> + return cached_accept_language; >> +} >> + >> /* http_request() targets */ >> #define HTTP_REQUEST_STRBUF 0 >> #define HTTP_REQUEST_FILE 1 >> @@ -995,6 +1123,7 @@ static int http_request(const char *url, >> struct slot_results results; >> struct curl_slist *headers = NULL; >> struct strbuf buf = STRBUF_INIT; >> + struct strbuf* accept_language; > > As we write in C, not C++, our asterisks stick to the variable, not > the type. Thanks, I'll fix it. [1]: http://tools.ietf.org/html/rfc7231#section-5.3.1 [2]: https://hg.mozilla.org/integration/mozilla-inbound/rev/1418f9ce6f8b [3]: http://stackoverflow.com/questions/686217/maximum-on-http-header-values/8623061#8623061 ^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v5 0/1] http: Add Accept-Language header if possible 2014-07-19 17:58 [PATCH v4 0/1] http: Add Accept-Language header if possible Yi EungJun 2014-07-19 17:58 ` [PATCH v4 1/1] " Yi EungJun @ 2014-12-02 12:12 ` Yi EungJun 2014-12-02 12:12 ` [PATCH v5 1/1] " Yi EungJun 2015-03-06 16:13 ` [PATCH] http: Include locale.h when using setlocale() Ævar Arnfjörð Bjarmason 2 siblings, 1 reply; 58+ messages in thread From: Yi EungJun @ 2014-12-02 12:12 UTC (permalink / raw) To: git; +Cc: Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting, Junio C Hamano Changes since v4 * Fix styles as Junio C Hamano suggested. * Limit number of languages and length of Accept-Language header. Yi EungJun (1): http: Add Accept-Language header if possible http.c | 154 +++++++++++++++++++++++++++++++++++++++++++++ remote-curl.c | 2 + t/t5550-http-fetch-dumb.sh | 31 +++++++++ 3 files changed, 187 insertions(+) -- 2.2.0 ^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v5 1/1] http: Add Accept-Language header if possible 2014-12-02 12:12 ` [PATCH v5 0/1] " Yi EungJun @ 2014-12-02 12:12 ` Yi EungJun 2014-12-03 18:22 ` Junio C Hamano 2014-12-03 19:31 ` Eric Sunshine 0 siblings, 2 replies; 58+ messages in thread From: Yi EungJun @ 2014-12-02 12:12 UTC (permalink / raw) To: git; +Cc: Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting, Junio C Hamano From: Yi EungJun <eungjun.yi@navercorp.com> Add an Accept-Language header which indicates the user's preferred languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG. Examples: LANGUAGE= -> "" LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1" LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1" LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1" This gives git servers a chance to display remote error messages in the user's preferred language. Limit the number of languages to 1,000 because q-value must not be smaller than 0.001, and limit the length of Accept-Language header to 4,000 bytes for some HTTP servers which cannot accept such long header. Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com> --- http.c | 154 +++++++++++++++++++++++++++++++++++++++++++++ remote-curl.c | 2 + t/t5550-http-fetch-dumb.sh | 31 +++++++++ 3 files changed, 187 insertions(+) diff --git a/http.c b/http.c index 040f362..69624af 100644 --- a/http.c +++ b/http.c @@ -68,6 +68,8 @@ static struct curl_slist *no_pragma_header; static struct active_request_slot *active_queue_head; +static struct strbuf *cached_accept_language; + size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_) { size_t size = eltsize * nmemb; @@ -515,6 +517,9 @@ void http_cleanup(void) cert_auth.password = NULL; } ssl_cert_password_required = 0; + + if (cached_accept_language) + strbuf_release(cached_accept_language); } struct active_request_slot *get_active_slot(void) @@ -986,6 +991,149 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, strbuf_addstr(charset, "ISO-8859-1"); } +/* + * Guess the user's preferred languages from the value in LANGUAGE environment + * variable and LC_MESSAGES locale category. + * + * The result can be a colon-separated list like "ko:ja:en". + */ +static const char *get_preferred_languages(void) +{ + const char *retval; + + retval = getenv("LANGUAGE"); + if (retval && *retval) + return retval; + + retval = setlocale(LC_MESSAGES, NULL); + if (retval && *retval && + strcmp(retval, "C") && + strcmp(retval, "POSIX")) + return retval; + + return NULL; +} + +/* + * Get an Accept-Language header which indicates user's preferred languages. + * + * Examples: + * LANGUAGE= -> "" + * LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" + * LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1" + * LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" + * LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" + * LANGUAGE= LANG=C -> "" + */ +static struct strbuf *get_accept_language(void) +{ + const char *lang_begin, *pos; + int q, max_q; + int num_langs; + int decimal_places; + int is_codeset_or_modifier = 0; + char q_format[32]; + /* + * MAX_LANGS must not be larger than 1,000. If it is larger than that, + * q-value will be smaller than 0.001, the minimum q-value the HTTP + * specification allows [1]. + * + * [1]: http://tools.ietf.org/html/rfc7231#section-5.3.1 + */ + const int MAX_LANGS = 1000; + const int MAX_SIZE_OF_HEADER = 4000; + int last_size = 0; + + if (cached_accept_language) + return cached_accept_language; + + cached_accept_language = xmalloc(sizeof(struct strbuf)); + strbuf_init(cached_accept_language, 0); + lang_begin = get_preferred_languages(); + + /* Don't add Accept-Language header if no language is preferred. */ + if (!lang_begin) + return cached_accept_language; + + /* Count number of preferred lang_begin to decide precision of q-factor. */ + for (num_langs = 1, pos = lang_begin; *pos; pos++) + if (*pos == ':') + num_langs++; + + /* Decide the precision for q-factor on number of preferred lang_begin. */ + num_langs += 1; /* for '*' */ + + if (MAX_LANGS < num_langs) + num_langs = MAX_LANGS; + + for (max_q = 1, decimal_places = 0; + max_q < num_langs; + decimal_places++, max_q *= 10); + + sprintf(q_format, ";q=0.%%0%dd", decimal_places); + + q = max_q; + + strbuf_addstr(cached_accept_language, "Accept-Language: "); + + /* + * Convert a list of colon-separated locale values [1][2] to a list of + * comma-separated language tags [3] which can be used as a value of + * Accept-Language header. + * + * [1]: http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html + * [2]: http://www.gnu.org/software/libc/manual/html_node/Using-gettextized-software.html + * [3]: http://tools.ietf.org/html/rfc7231#section-5.3.5 + */ + for (pos = lang_begin; ; pos++) { + if (*pos == ':' || !*pos) { + /* Ignore if this character is the first one. */ + if (pos == lang_begin) + continue; + + is_codeset_or_modifier = 0; + + /* Put a q-factor only if it is less than 1.0. */ + if (q < max_q) + strbuf_addf(cached_accept_language, q_format, q); + + if (q > 1) + q--; + + last_size = cached_accept_language->len; + + /* NULL pos means this is the last language. */ + if (*pos) + strbuf_addstr(cached_accept_language, ", "); + else + break; + + } else if (is_codeset_or_modifier) + continue; + else if (*pos == '.' || *pos == '@') /* Remove .codeset and @modifier. */ + is_codeset_or_modifier = 1; + else + strbuf_addch(cached_accept_language, *pos == '_' ? '-' : *pos); + + if (cached_accept_language->len > MAX_SIZE_OF_HEADER) { + strbuf_remove(cached_accept_language, last_size, + cached_accept_language->len - last_size); + break; + } + } + + /* Don't add Accept-Language header if no language is preferred. */ + if (q >= max_q) { + return cached_accept_language; + } + + /* Add '*' with minimum q-factor greater than 0.0. */ + strbuf_addstr(cached_accept_language, ", *"); + strbuf_addf(cached_accept_language, q_format, 1); + + return cached_accept_language; +} + /* http_request() targets */ #define HTTP_REQUEST_STRBUF 0 #define HTTP_REQUEST_FILE 1 @@ -998,6 +1146,7 @@ static int http_request(const char *url, struct slot_results results; struct curl_slist *headers = NULL; struct strbuf buf = STRBUF_INIT; + struct strbuf *accept_language; int ret; slot = get_active_slot(); @@ -1023,6 +1172,11 @@ static int http_request(const char *url, fwrite_buffer); } + accept_language = get_accept_language(); + + if (accept_language && accept_language->len > 0) + headers = curl_slist_append(headers, accept_language->buf); + strbuf_addstr(&buf, "Pragma:"); if (options && options->no_cache) strbuf_addstr(&buf, " no-cache"); diff --git a/remote-curl.c b/remote-curl.c index dd63bc2..04989e5 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -962,6 +962,8 @@ int main(int argc, const char **argv) struct strbuf buf = STRBUF_INIT; int nongit; + git_setup_gettext(); + git_extract_argv0_path(argv[0]); setup_git_directory_gently(&nongit); if (argc < 2) { diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh index ac71418..197c361 100755 --- a/t/t5550-http-fetch-dumb.sh +++ b/t/t5550-http-fetch-dumb.sh @@ -196,5 +196,36 @@ test_expect_success 'reencoding is robust to whitespace oddities' ' grep "this is the error message" stderr ' +check_language () { + echo "Accept-Language: $1\r" >expect && + test_must_fail env \ + GIT_CURL_VERBOSE=1 \ + LANGUAGE=$2 \ + LC_ALL=$3 \ + LC_MESSAGES=$4 \ + LANG=$5 \ + git clone "$HTTPD_URL/accept/language" 2>stderr && + grep -i ^Accept-Language: stderr >actual && + test_cmp expect actual +} + +test_expect_success 'git client sends Accept-Language based on LANGUAGE, LC_ALL, LC_MESSAGES and LANG' ' + check_language "ko-KR, *;q=0.1" ko_KR.UTF-8 de_DE.UTF-8 ja_JP.UTF-8 en_US.UTF-8 && + check_language "de-DE, *;q=0.1" "" de_DE.UTF-8 ja_JP.UTF-8 en_US.UTF-8 && + check_language "ja-JP, *;q=0.1" "" "" ja_JP.UTF-8 en_US.UTF-8 && + check_language "en-US, *;q=0.1" "" "" "" en_US.UTF-8 +' + +test_expect_success 'git client sends Accept-Language with many preferred languages' ' + check_language "ko-KR, en-US;q=0.99, fr-CA;q=0.98, de;q=0.97, sr;q=0.96, \ +ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, *;q=0.01" \ + ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb +' + +test_expect_success 'git client does not send Accept-Language' ' + test_must_fail env GIT_CURL_VERBOSE=1 LANGUAGE= git clone "$HTTPD_URL/accept/language" 2>stderr && + ! grep "^Accept-Language:" stderr +' + stop_httpd test_done -- 2.2.0 ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [PATCH v5 1/1] http: Add Accept-Language header if possible 2014-12-02 12:12 ` [PATCH v5 1/1] " Yi EungJun @ 2014-12-03 18:22 ` Junio C Hamano 2014-12-03 19:31 ` Eric Sunshine 1 sibling, 0 replies; 58+ messages in thread From: Junio C Hamano @ 2014-12-03 18:22 UTC (permalink / raw) To: Yi EungJun; +Cc: git, Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting Yi EungJun <semtlenori@gmail.com> writes: > From: Yi EungJun <eungjun.yi@navercorp.com> > > Add an Accept-Language header which indicates the user's preferred > languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG. > > Examples: > LANGUAGE= -> "" > LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1" > LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1" > LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1" > > This gives git servers a chance to display remote error messages in > the user's preferred language. > > Limit the number of languages to 1,000 because q-value must not be > smaller than 0.001, and limit the length of Accept-Language header to > 4,000 bytes for some HTTP servers which cannot accept such long header. > > Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com> > --- > http.c | 154 +++++++++++++++++++++++++++++++++++++++++++++ > remote-curl.c | 2 + > t/t5550-http-fetch-dumb.sh | 31 +++++++++ > 3 files changed, 187 insertions(+) > > diff --git a/http.c b/http.c > index 040f362..69624af 100644 > --- a/http.c > +++ b/http.c > @@ -68,6 +68,8 @@ static struct curl_slist *no_pragma_header; > > static struct active_request_slot *active_queue_head; > > +static struct strbuf *cached_accept_language; > + > size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_) > { > size_t size = eltsize * nmemb; > @@ -515,6 +517,9 @@ void http_cleanup(void) > cert_auth.password = NULL; > } > ssl_cert_password_required = 0; > + > + if (cached_accept_language) > + strbuf_release(cached_accept_language); Is this correct? You still keep cached_accept_language pointer itself, so the next call to get_accept_language() would say "Ah, cached_accept_language is there, so its contents (which is empty because we released it here) must be valid and to be reused". Perhaps you want to free it, too, i.e. if (cached_accept_language) { strbuf_release(cached_accept_language); free(cached_accept_language); cached_accept_language = NULL; } or something? > +static struct strbuf *get_accept_language(void) > +{ > + ... > + if (cached_accept_language) > + return cached_accept_language; > + > + cached_accept_language = xmalloc(sizeof(struct strbuf)); > + ... > + for (max_q = 1, decimal_places = 0; > + max_q < num_langs; > + decimal_places++, max_q *= 10); Have that "empty statement" on its own separate line, i.e. for (a, counter = 0; b; c, counter++) ; /* just counting */ Alternatively, you can make it more obvious that the purpose of loop is to count, i.e. for (a, counter = 0; b; c) counter++; > +test_expect_success 'git client does not send Accept-Language' ' > + test_must_fail env GIT_CURL_VERBOSE=1 LANGUAGE= git clone "$HTTPD_URL/accept/language" 2>stderr && > + ! grep "^Accept-Language:" stderr > +' Hmph, this test smells a bit brittle. What is the reason you expect "git clone" to fail? Is it because there is no repository at the named URL at "$HTTPD_URL/accept/language"? Is that the only plausible reason for a failure? It might be better to use the URL to a repository that is expected to be served by the server started in this test and expect success. If it bothers you that "clone" creates a new copy that is otherwise unused by this test, you can use something like "ls-remote" instead, I would think. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v5 1/1] http: Add Accept-Language header if possible 2014-12-02 12:12 ` [PATCH v5 1/1] " Yi EungJun 2014-12-03 18:22 ` Junio C Hamano @ 2014-12-03 19:31 ` Eric Sunshine 2014-12-03 21:37 ` Junio C Hamano 1 sibling, 1 reply; 58+ messages in thread From: Eric Sunshine @ 2014-12-03 19:31 UTC (permalink / raw) To: Yi EungJun Cc: Git List, Yi EungJun, Jeff King, Peter Krefting, Junio C Hamano On Tue, Dec 2, 2014 at 7:12 AM, Yi EungJun <semtlenori@gmail.com> wrote: > From: Yi EungJun <eungjun.yi@navercorp.com> > > Add an Accept-Language header which indicates the user's preferred > languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG. > > Examples: > LANGUAGE= -> "" > LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1" > LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1" > LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1" > > This gives git servers a chance to display remote error messages in > the user's preferred language. > > Limit the number of languages to 1,000 because q-value must not be > smaller than 0.001, and limit the length of Accept-Language header to > 4,000 bytes for some HTTP servers which cannot accept such long header. > > Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com> > --- > diff --git a/http.c b/http.c > index 040f362..69624af 100644 > --- a/http.c > +++ b/http.c > @@ -68,6 +68,8 @@ static struct curl_slist *no_pragma_header; > > static struct active_request_slot *active_queue_head; > > +static struct strbuf *cached_accept_language; Is there a reason this needs to be a pointer to a strbuf rather than just a strbuf? That is wouldn't this work? static struct strbuf cached_accept_language = STRBUF_INIT; > size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_) > { > size_t size = eltsize * nmemb; > @@ -515,6 +517,9 @@ void http_cleanup(void) > cert_auth.password = NULL; > } > ssl_cert_password_required = 0; > + > + if (cached_accept_language) > + strbuf_release(cached_accept_language); Junio already mentioned that this is leaking the memory of the strbuf struct itself which was xmalloc()'d by get_accept_language(). > } > > struct active_request_slot *get_active_slot(void) > @@ -986,6 +991,149 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, > strbuf_addstr(charset, "ISO-8859-1"); > } > > +/* > + * Guess the user's preferred languages from the value in LANGUAGE environment > + * variable and LC_MESSAGES locale category. > + * > + * The result can be a colon-separated list like "ko:ja:en". > + */ > +static const char *get_preferred_languages(void) > +{ > + const char *retval; > + > + retval = getenv("LANGUAGE"); > + if (retval && *retval) > + return retval; > + > + retval = setlocale(LC_MESSAGES, NULL); > + if (retval && *retval && > + strcmp(retval, "C") && > + strcmp(retval, "POSIX")) > + return retval; > + > + return NULL; Mental note: This function will never return an empty string "", even if LANGUAGE or LC_MESSAGES is empty. > +} > + > +/* > + * Get an Accept-Language header which indicates user's preferred languages. > + * > + * Examples: > + * LANGUAGE= -> "" > + * LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" > + * LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1" > + * LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" > + * LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" > + * LANGUAGE= LANG=C -> "" > + */ > +static struct strbuf *get_accept_language(void) I find this API a bit strange. Use of strbuf to construct the returned string is an implementation detail of this function. From the caller's point of view, it should just be receiving a constant string: one which it needs neither to modify nor free. Also, if the caller were to modify the returned strbuf for some reason, then that modification would impact all future calls to get_accept_language() since the strbuf is 'static' and not recomputed. Instead, I would expect the declaration to be: static const char *get_accept_language(void) > +{ > + const char *lang_begin, *pos; > + int q, max_q; > + int num_langs; > + int decimal_places; > + int is_codeset_or_modifier = 0; > + char q_format[32]; > + const int MAX_LANGS = 1000; > + const int MAX_SIZE_OF_HEADER = 4000; > + int last_size = 0; > + > + if (cached_accept_language) > + return cached_accept_language; > + > + cached_accept_language = xmalloc(sizeof(struct strbuf)); > + strbuf_init(cached_accept_language, 0); > + lang_begin = get_preferred_languages(); Mental note: lang_begin will never be the empty string "". > + /* Don't add Accept-Language header if no language is preferred. */ > + if (!lang_begin) > + return cached_accept_language; > + > + /* Count number of preferred lang_begin to decide precision of q-factor. */ > + for (num_langs = 1, pos = lang_begin; *pos; pos++) > + if (*pos == ':') > + num_langs++; > + > + /* Decide the precision for q-factor on number of preferred lang_begin. */ > + num_langs += 1; /* for '*' */ > + > + if (MAX_LANGS < num_langs) > + num_langs = MAX_LANGS; > + > + for (max_q = 1, decimal_places = 0; > + max_q < num_langs; > + decimal_places++, max_q *= 10); > + > + sprintf(q_format, ";q=0.%%0%dd", decimal_places); > + > + q = max_q; > + > + strbuf_addstr(cached_accept_language, "Accept-Language: "); > + > + for (pos = lang_begin; ; pos++) { > + if (*pos == ':' || !*pos) { > + /* Ignore if this character is the first one. */ > + if (pos == lang_begin) > + continue; If lang_begin were ever to point at an empty string "", then this logic would access memory beyond the end-of-string. Since get_preferred_languages() won't return an empty string, this case is safe, but it's not obvious to the casual reader and some person modifying the code in the future might not notice this restriction. I personally would feel more comfortable if the no-empty-string assumption was documented formally with an assert(*lang_begin) or die() before entering the loop; or if the code was rewritten to be less fragile. > + is_codeset_or_modifier = 0; > + > + /* Put a q-factor only if it is less than 1.0. */ > + if (q < max_q) > + strbuf_addf(cached_accept_language, q_format, q); > + > + if (q > 1) > + q--; > + > + last_size = cached_accept_language->len; > + > + /* NULL pos means this is the last language. */ Nit: This feels somewhat backward. It sounds like the comment is explaining the 'then' part of the 'if' rather than the 'else' part. Perhaps rephrase like this: /* non-NULL pos means more languages */ or, better yet, just drop the comment since the code is self-explanatory. > + if (*pos) > + strbuf_addstr(cached_accept_language, ", "); > + else > + break; > + > + } else if (is_codeset_or_modifier) > + continue; > + else if (*pos == '.' || *pos == '@') /* Remove .codeset and @modifier. */ > + is_codeset_or_modifier = 1; > + else > + strbuf_addch(cached_accept_language, *pos == '_' ? '-' : *pos); > + > + if (cached_accept_language->len > MAX_SIZE_OF_HEADER) { Mental note: Here you respect MAX_SIZE_OF_HEADER. > + strbuf_remove(cached_accept_language, last_size, > + cached_accept_language->len - last_size); > + break; > + } > + } > + > + /* Don't add Accept-Language header if no language is preferred. */ Is this comment correct? The "Accept-Language:" header was already added to cached_accept_language much earlier in the function. > + if (q >= max_q) { > + return cached_accept_language; > + } Style: Unnecessary braces. > + /* Add '*' with minimum q-factor greater than 0.0. */ > + strbuf_addstr(cached_accept_language, ", *"); > + strbuf_addf(cached_accept_language, q_format, 1); Here you don't respect MAX_SIZE_OF_HEADER. > + > + return cached_accept_language; > +} > + > /* http_request() targets */ > #define HTTP_REQUEST_STRBUF 0 > #define HTTP_REQUEST_FILE 1 > @@ -998,6 +1146,7 @@ static int http_request(const char *url, > struct slot_results results; > struct curl_slist *headers = NULL; > struct strbuf buf = STRBUF_INIT; > + struct strbuf *accept_language; > int ret; > > slot = get_active_slot(); > @@ -1023,6 +1172,11 @@ static int http_request(const char *url, > fwrite_buffer); > } > > + accept_language = get_accept_language(); > + > + if (accept_language && accept_language->len > 0) > + headers = curl_slist_append(headers, accept_language->buf); > + > strbuf_addstr(&buf, "Pragma:"); > if (options && options->no_cache) > strbuf_addstr(&buf, " no-cache"); ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v5 1/1] http: Add Accept-Language header if possible 2014-12-03 19:31 ` Eric Sunshine @ 2014-12-03 21:37 ` Junio C Hamano 2014-12-03 22:00 ` Michael Blume 0 siblings, 1 reply; 58+ messages in thread From: Junio C Hamano @ 2014-12-03 21:37 UTC (permalink / raw) To: Eric Sunshine; +Cc: Yi EungJun, Git List, Yi EungJun, Jeff King, Peter Krefting Eric Sunshine <sunshine@sunshineco.com> writes: >> @@ -515,6 +517,9 @@ void http_cleanup(void) >> cert_auth.password = NULL; >> } >> ssl_cert_password_required = 0; >> + >> + if (cached_accept_language) >> + strbuf_release(cached_accept_language); > > Junio already mentioned that this is leaking the memory of the strbuf > struct itself which was xmalloc()'d by get_accept_language(). I actually didn't ;-) A singleton cached_accept_language strbuf itself being kept around, with its reuse by get_accept_language(), is fine and is not a leak. But clearing the strbuf alone will introduce correctness problem---the second HTTP connection will see an empty strbuf, get_accept_language() will say "we've already computed and the header we must issue is an empty string", which is not correct. In the fix-up "SQUASH???" commit I queued on top of this patch on 'pu', I had to run "sort -u" on the output to the standard error stream, as there seemed to be two HTTP connections and the actual output had two headers, even though the test expected only one in the output. I suspect that it is a fallout from this bug that the original code passed that test that expects only one. >> +static struct strbuf *get_accept_language(void) > > I find this API a bit strange. Use of strbuf to construct the returned > string is an implementation detail of this function. From the caller's > point of view, it should just be receiving a constant string: one > which it needs neither to modify nor free. Also, if the caller were to > modify the returned strbuf for some reason, then that modification > would impact all future calls to get_accept_language() since the > strbuf is 'static' and not recomputed. Instead, I would expect the > declaration to be: > > static const char *get_accept_language(void) Makes sense to me. >> + /* Put a q-factor only if it is less than 1.0. */ >> + if (q < max_q) >> + strbuf_addf(cached_accept_language, q_format, q); >> + >> + if (q > 1) >> + q--; I didn't mention this but if q ever goes below 1, wouldn't it mean that there is no point continuing this loop? ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v5 1/1] http: Add Accept-Language header if possible 2014-12-03 21:37 ` Junio C Hamano @ 2014-12-03 22:00 ` Michael Blume 2014-12-03 22:06 ` Michael Blume 0 siblings, 1 reply; 58+ messages in thread From: Michael Blume @ 2014-12-03 22:00 UTC (permalink / raw) To: Junio C Hamano Cc: Eric Sunshine, Yi EungJun, Git List, Yi EungJun, Jeff King, Peter Krefting On Wed, Dec 3, 2014 at 1:37 PM, Junio C Hamano <gitster@pobox.com> wrote: > Eric Sunshine <sunshine@sunshineco.com> writes: > >>> @@ -515,6 +517,9 @@ void http_cleanup(void) >>> cert_auth.password = NULL; >>> } >>> ssl_cert_password_required = 0; >>> + >>> + if (cached_accept_language) >>> + strbuf_release(cached_accept_language); >> >> Junio already mentioned that this is leaking the memory of the strbuf >> struct itself which was xmalloc()'d by get_accept_language(). > > I actually didn't ;-) A singleton cached_accept_language strbuf > itself being kept around, with its reuse by get_accept_language(), > is fine and is not a leak. But clearing the strbuf alone will > introduce correctness problem---the second HTTP connection will see > an empty strbuf, get_accept_language() will say "we've already > computed and the header we must issue is an empty string", which is > not correct. > > In the fix-up "SQUASH???" commit I queued on top of this patch on > 'pu', I had to run "sort -u" on the output to the standard error > stream, as there seemed to be two HTTP connections and the actual > output had two headers, even though the test expected only one in > the output. I suspect that it is a fallout from this bug that the > original code passed that test that expects only one. > >>> +static struct strbuf *get_accept_language(void) >> >> I find this API a bit strange. Use of strbuf to construct the returned >> string is an implementation detail of this function. From the caller's >> point of view, it should just be receiving a constant string: one >> which it needs neither to modify nor free. Also, if the caller were to >> modify the returned strbuf for some reason, then that modification >> would impact all future calls to get_accept_language() since the >> strbuf is 'static' and not recomputed. Instead, I would expect the >> declaration to be: >> >> static const char *get_accept_language(void) > > Makes sense to me. > >>> + /* Put a q-factor only if it is less than 1.0. */ >>> + if (q < max_q) >>> + strbuf_addf(cached_accept_language, q_format, q); >>> + >>> + if (q > 1) >>> + q--; > > I didn't mention this but if q ever goes below 1, wouldn't it mean > that there is no point continuing this loop? > -- > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html This seems to be failing under Mac OS for me not ok 25 - git client sends Accept-Language based on LANGUAGE, LC_ALL, LC_MESSAGES and LANG # # check_language "ko-KR, *;q=0.1" ko_KR.UTF-8 de_DE.UTF-8 ja_JP.UTF-8 en_US.UTF-8 && # check_language "de-DE, *;q=0.1" "" de_DE.UTF-8 ja_JP.UTF-8 en_US.UTF-8 && # check_language "ja-JP, *;q=0.1" "" "" ja_JP.UTF-8 en_US.UTF-8 && # check_language "en-US, *;q=0.1" "" "" "" en_US.UTF-8 # ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v5 1/1] http: Add Accept-Language header if possible 2014-12-03 22:00 ` Michael Blume @ 2014-12-03 22:06 ` Michael Blume 2014-12-22 16:44 ` [PATCH v6 0/1] " Yi EungJun 0 siblings, 1 reply; 58+ messages in thread From: Michael Blume @ 2014-12-03 22:06 UTC (permalink / raw) To: Junio C Hamano Cc: Eric Sunshine, Yi EungJun, Git List, Yi EungJun, Jeff King, Peter Krefting On Wed, Dec 3, 2014 at 2:00 PM, Michael Blume <blume.mike@gmail.com> wrote: > On Wed, Dec 3, 2014 at 1:37 PM, Junio C Hamano <gitster@pobox.com> wrote: >> Eric Sunshine <sunshine@sunshineco.com> writes: >> >>>> @@ -515,6 +517,9 @@ void http_cleanup(void) >>>> cert_auth.password = NULL; >>>> } >>>> ssl_cert_password_required = 0; >>>> + >>>> + if (cached_accept_language) >>>> + strbuf_release(cached_accept_language); >>> >>> Junio already mentioned that this is leaking the memory of the strbuf >>> struct itself which was xmalloc()'d by get_accept_language(). >> >> I actually didn't ;-) A singleton cached_accept_language strbuf >> itself being kept around, with its reuse by get_accept_language(), >> is fine and is not a leak. But clearing the strbuf alone will >> introduce correctness problem---the second HTTP connection will see >> an empty strbuf, get_accept_language() will say "we've already >> computed and the header we must issue is an empty string", which is >> not correct. >> >> In the fix-up "SQUASH???" commit I queued on top of this patch on >> 'pu', I had to run "sort -u" on the output to the standard error >> stream, as there seemed to be two HTTP connections and the actual >> output had two headers, even though the test expected only one in >> the output. I suspect that it is a fallout from this bug that the >> original code passed that test that expects only one. >> >>>> +static struct strbuf *get_accept_language(void) >>> >>> I find this API a bit strange. Use of strbuf to construct the returned >>> string is an implementation detail of this function. From the caller's >>> point of view, it should just be receiving a constant string: one >>> which it needs neither to modify nor free. Also, if the caller were to >>> modify the returned strbuf for some reason, then that modification >>> would impact all future calls to get_accept_language() since the >>> strbuf is 'static' and not recomputed. Instead, I would expect the >>> declaration to be: >>> >>> static const char *get_accept_language(void) >> >> Makes sense to me. >> >>>> + /* Put a q-factor only if it is less than 1.0. */ >>>> + if (q < max_q) >>>> + strbuf_addf(cached_accept_language, q_format, q); >>>> + >>>> + if (q > 1) >>>> + q--; >> >> I didn't mention this but if q ever goes below 1, wouldn't it mean >> that there is no point continuing this loop? >> -- >> To unsubscribe from this list: send the line "unsubscribe git" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > This seems to be failing under Mac OS for me > > not ok 25 - git client sends Accept-Language based on LANGUAGE, > LC_ALL, LC_MESSAGES and LANG > # > # check_language "ko-KR, *;q=0.1" ko_KR.UTF-8 de_DE.UTF-8 ja_JP.UTF-8 > en_US.UTF-8 && > # check_language "de-DE, *;q=0.1" "" de_DE.UTF-8 ja_JP.UTF-8 > en_US.UTF-8 && > # check_language "ja-JP, *;q=0.1" "" "" ja_JP.UTF-8 > en_US.UTF-8 && > # check_language "en-US, *;q=0.1" "" "" "" > en_US.UTF-8 > # verbose results Initialized empty Git repository in /Users/Shared/Jenkins/Home/jobs/git/workspace/t/trash directory.t5550-http-fetch-dumb/.git/ expecting success: git config push.default matching && echo content1 >file && git add file && git commit -m one echo content2 >file && git add file && git commit -m two [master (root-commit) f5983e9] one Author: A U Thor <author@example.com> 1 file changed, 1 insertion(+) create mode 100644 file [master 2ff8a06] two Author: A U Thor <author@example.com> 1 file changed, 1 insertion(+), 1 deletion(-) ok 1 - setup repository expecting success: cp -R .git "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" && (cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" && git config core.bare true && mkdir -p hooks && echo "exec git update-server-info" >hooks/post-update && chmod +x hooks/post-update && hooks/post-update ) && git remote add public "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" && git push public master:master Everything up-to-date ok 2 - create http-accessible bare repository with loose objects expecting success: git clone $HTTPD_URL/dumb/repo.git clone-tmpl && cp -R clone-tmpl clone && test_cmp file clone/file Cloning into 'clone-tmpl'... ok 3 - clone http repository expecting success: mkdir -p "$HTTPD_DOCUMENT_ROOT_PATH/auth/dumb/" && cp -Rf "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \ "$HTTPD_DOCUMENT_ROOT_PATH/auth/dumb/repo.git" ok 4 - create password-protected repository expecting success: write_script "$TRASH_DIRECTORY/askpass" <<-\EOF && echo >>"$TRASH_DIRECTORY/askpass-query" "askpass: $*" && case "$*" in *Username*) what=user ;; *Password*) what=pass ;; esac && cat "$TRASH_DIRECTORY/askpass-$what" EOF GIT_ASKPASS="$TRASH_DIRECTORY/askpass" && export GIT_ASKPASS && export TRASH_DIRECTORY ok 5 - setup askpass helper expecting success: set_askpass wrong && test_must_fail git clone "$HTTPD_URL/auth/dumb/repo.git" clone-auth-fail && expect_askpass both wrong Cloning into 'clone-auth-fail'... fatal: Authentication failed for 'http://127.0.0.1:5550/auth/dumb/repo.git/' ok 6 - cloning password-protected repository can fail expecting success: set_askpass wrong && git clone "$HTTPD_URL_USER_PASS/auth/dumb/repo.git" clone-auth-none && expect_askpass none Cloning into 'clone-auth-none'... ok 7 - http auth can use user/pass in URL expecting success: set_askpass wrong pass@host && git clone "$HTTPD_URL_USER/auth/dumb/repo.git" clone-auth-pass && expect_askpass pass user@host Cloning into 'clone-auth-pass'... ok 8 - http auth can use just user in URL expecting success: set_askpass user@host pass@host && git clone "$HTTPD_URL/auth/dumb/repo.git" clone-auth-both && expect_askpass both user@host Cloning into 'clone-auth-both'... ok 9 - http auth can request both user and pass expecting success: test_config_global credential.helper "!f() { cat >/dev/null echo username=user@host echo password=pass@host }; f" && set_askpass wrong && git clone "$HTTPD_URL/auth/dumb/repo.git" clone-auth-helper && expect_askpass none Cloning into 'clone-auth-helper'... ok 10 - http auth respects credential helper config expecting success: test_config_global "credential.$HTTPD_URL.username" user@host && set_askpass wrong pass@host && git clone "$HTTPD_URL/auth/dumb/repo.git" clone-auth-user && expect_askpass pass user@host Cloning into 'clone-auth-user'... ok 11 - http auth can get username from config expecting success: test_config_global "credential.$HTTPD_URL.username" wrong && set_askpass wrong pass@host && git clone "$HTTPD_URL_USER/auth/dumb/repo.git" clone-auth-user2 && expect_askpass pass user@host Cloning into 'clone-auth-user2'... ok 12 - configured username does not override URL expecting success: echo content >>file && git commit -a -m two && git push public && (cd clone && git pull) && test_cmp file clone/file [master d4af499] two Author: A U Thor <author@example.com> 1 file changed, 1 insertion(+) To /Users/Shared/Jenkins/Home/jobs/git/workspace/t/trash directory.t5550-http-fetch-dumb/httpd/www/repo.git 2ff8a06..d4af499 master -> master From http://127.0.0.1:5550/dumb/repo 2ff8a06..d4af499 master -> origin/master Updating 2ff8a06..d4af499 Fast-forward file | 1 + 1 file changed, 1 insertion(+) ok 13 - fetch changes via http expecting success: cp -R clone-tmpl clone2 && HEAD=$(git rev-parse --verify HEAD) && (cd clone2 && git http-fetch -a -w heads/master-new $HEAD $(git config remote.origin.url) && git checkout master-new && test $HEAD = $(git rev-parse --verify HEAD)) && test_cmp file clone2/file Switched to branch 'master-new' ok 14 - fetch changes via manual http-fetch expecting success: git push public master:other && (cd clone && git remote set-head origin -d && git remote set-head origin -a && git symbolic-ref refs/remotes/origin/HEAD > output && echo refs/remotes/origin/master > expect && test_cmp expect output ) To /Users/Shared/Jenkins/Home/jobs/git/workspace/t/trash directory.t5550-http-fetch-dumb/httpd/www/repo.git * [new branch] master -> other origin/HEAD set to master ok 15 - http remote detects correct HEAD expecting success: cp -R "$HTTPD_DOCUMENT_ROOT_PATH"/repo.git "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git && (cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git && git --bare repack -a -d ) && git clone $HTTPD_URL/dumb/repo_pack.git Cloning into 'repo_pack'... ok 16 - fetch packed objects expecting success: cp -R "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad1.git && (cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad1.git && p=`ls objects/pack/pack-*.pack` && chmod u+w $p && printf %0256d 0 | dd of=$p bs=256 count=1 seek=1 conv=notrunc ) && mkdir repo_bad1.git && (cd repo_bad1.git && git --bare init && test_must_fail git --bare fetch $HTTPD_URL/dumb/repo_bad1.git && test 0 = `ls objects/pack/pack-*.pack | wc -l` ) 1+0 records in 1+0 records out 256 bytes transferred in 0.000014 secs (18512790 bytes/sec) Initialized empty Git repository in /Users/Shared/Jenkins/Home/jobs/git/workspace/t/trash directory.t5550-http-fetch-dumb/repo_bad1.git/ fatal: pack has bad object at offset 168: inflate returned -5 error: Unable to find d4af499a00b28bc0ab78fa94cc6a449fae19b08d under http://127.0.0.1:5550/dumb/repo_bad1.git Cannot obtain needed object d4af499a00b28bc0ab78fa94cc6a449fae19b08d error: fetch failed. ls: objects/pack/pack-*.pack: No such file or directory ok 17 - fetch notices corrupt pack expecting success: cp -R "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad2.git && (cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad2.git && p=`ls objects/pack/pack-*.idx` && chmod u+w $p && printf %0256d 0 | dd of=$p bs=256 count=1 seek=1 conv=notrunc ) && mkdir repo_bad2.git && (cd repo_bad2.git && git --bare init && test_must_fail git --bare fetch $HTTPD_URL/dumb/repo_bad2.git && test 0 = `ls objects/pack | wc -l` ) 1+0 records in 1+0 records out 256 bytes transferred in 0.000013 secs (19522579 bytes/sec) Initialized empty Git repository in /Users/Shared/Jenkins/Home/jobs/git/workspace/t/trash directory.t5550-http-fetch-dumb/repo_bad2.git/ error: non-monotonic index /Users/Shared/Jenkins/Home/jobs/git/workspace/t/trash directory.t5550-http-fetch-dumb/repo_bad2.git/objects/pack/pack-7244949d8d6e59a30923b3fff7801f159ef4ba5d.idx.temp error: Unable to find d4af499a00b28bc0ab78fa94cc6a449fae19b08d under http://127.0.0.1:5550/dumb/repo_bad2.git Cannot obtain needed object d4af499a00b28bc0ab78fa94cc6a449fae19b08d error: fetch failed. ok 18 - fetch notices corrupt idx expecting success: grep /git-upload-pack <"$HTTPD_ROOT_PATH"/access.log >act : >exp test_cmp exp act ok 19 - did not use upload-pack service expecting success: test_must_fail git clone "$HTTPD_URL/error/text" 2>stderr && grep "this is the error message" stderr remote: this is the error message ok 20 - git client shows text/plain errors expecting success: test_must_fail git clone "$HTTPD_URL/error/html" 2>stderr && ! grep "this is the error message" stderr ok 21 - git client does not show html errors expecting success: test_must_fail git clone "$HTTPD_URL/error/charset" 2>stderr && grep "this is the error message" stderr remote: this is the error message ok 22 - git client shows text/plain with a charset expecting success: test_must_fail git clone "$HTTPD_URL/error/utf16" 2>stderr && grep "this is the error message" stderr remote: this is the error message ok 23 - http error messages are reencoded expecting success: test_must_fail git clone "$HTTPD_URL/error/odd-spacing" 2>stderr && grep "this is the error message" stderr remote: this is the error message ok 24 - reencoding is robust to whitespace oddities expecting success: check_language "ko-KR, *;q=0.1" ko_KR.UTF-8 de_DE.UTF-8 ja_JP.UTF-8 en_US.UTF-8 && check_language "de-DE, *;q=0.1" "" de_DE.UTF-8 ja_JP.UTF-8 en_US.UTF-8 && check_language "ja-JP, *;q=0.1" "" "" ja_JP.UTF-8 en_US.UTF-8 && check_language "en-US, *;q=0.1" "" "" "" en_US.UTF-8 not ok 25 - git client sends Accept-Language based on LANGUAGE, LC_ALL, LC_MESSAGES and LANG # # check_language "ko-KR, *;q=0.1" ko_KR.UTF-8 de_DE.UTF-8 ja_JP.UTF-8 en_US.UTF-8 && # check_language "de-DE, *;q=0.1" "" de_DE.UTF-8 ja_JP.UTF-8 en_US.UTF-8 && # check_language "ja-JP, *;q=0.1" "" "" ja_JP.UTF-8 en_US.UTF-8 && # check_language "en-US, *;q=0.1" "" "" "" en_US.UTF-8 # expecting success: check_language "ko-KR, en-US;q=0.99, fr-CA;q=0.98, de;q=0.97, sr;q=0.96, \ ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, *;q=0.01" \ ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb ok 26 - git client sends Accept-Language with many preferred languages expecting success: GIT_CURL_VERBOSE=1 LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git" 2>stderr && ! grep "^Accept-Language:" stderr d4af499a00b28bc0ab78fa94cc6a449fae19b08d HEAD d4af499a00b28bc0ab78fa94cc6a449fae19b08d refs/heads/master d4af499a00b28bc0ab78fa94cc6a449fae19b08d refs/heads/other ok 27 - git client does not send an empty Accept-Language # failed 1 among 27 test(s) 1..27 ^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v6 0/1] http: Add Accept-Language header if possible 2014-12-03 22:06 ` Michael Blume @ 2014-12-22 16:44 ` Yi EungJun 2014-12-22 16:44 ` [PATCH v6 1/1] " Yi EungJun 0 siblings, 1 reply; 58+ messages in thread From: Yi EungJun @ 2014-12-22 16:44 UTC (permalink / raw) To: git Cc: Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting, Junio C Hamano, Michael Blume Changes since v5 >From Junio C Hamano's review: * The tests use `ls-remote` instead of `clone` for tests; I copied the test code from ba8e63dc30a80656fddc616f714fb217ad220c04. * Set cached_accept_langauge to NULL after free it. >From Eric Sunshine's review: * get_accept_language() returns a pointer to const char instead of strbuf; the type of cached_accept_language also has been changed to char* from strbuf* * write_accept_language(), which is extracted from get_accept_language(), respects MAX_SIZE_OF_HEADER. * The for-loop in write_accept_language() works correctly if lang_begin points an empty string. >From Jeff King's advice: * get_preferred_languages() considers LC_MESSAGES only if NO_GETTEXT is not defined. * Remove the tests for LC_MESSAGES, LANG and LC_ALL. Yi EungJun (1): http: Add Accept-Language header if possible http.c | 173 +++++++++++++++++++++++++++++++++++++++++++++ remote-curl.c | 2 + t/t5550-http-fetch-dumb.sh | 32 +++++++++ 3 files changed, 207 insertions(+) -- 2.2.0.375.gcd18ce6.dirty ^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v6 1/1] http: Add Accept-Language header if possible 2014-12-22 16:44 ` [PATCH v6 0/1] " Yi EungJun @ 2014-12-22 16:44 ` Yi EungJun 2014-12-22 19:34 ` Junio C Hamano 2014-12-24 20:35 ` Eric Sunshine 0 siblings, 2 replies; 58+ messages in thread From: Yi EungJun @ 2014-12-22 16:44 UTC (permalink / raw) To: git Cc: Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting, Junio C Hamano, Michael Blume From: Yi EungJun <eungjun.yi@navercorp.com> Add an Accept-Language header which indicates the user's preferred languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG. Examples: LANGUAGE= -> "" LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1" LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1" LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1" This gives git servers a chance to display remote error messages in the user's preferred language. Limit the number of languages to 1,000 because q-value must not be smaller than 0.001, and limit the length of Accept-Language header to 4,000 bytes for some HTTP servers which cannot accept such long header. Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com> --- http.c | 173 +++++++++++++++++++++++++++++++++++++++++++++ remote-curl.c | 2 + t/t5550-http-fetch-dumb.sh | 32 +++++++++ 3 files changed, 207 insertions(+) diff --git a/http.c b/http.c index 040f362..7a77708 100644 --- a/http.c +++ b/http.c @@ -68,6 +68,8 @@ static struct curl_slist *no_pragma_header; static struct active_request_slot *active_queue_head; +static char *cached_accept_language; + size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_) { size_t size = eltsize * nmemb; @@ -515,6 +517,11 @@ void http_cleanup(void) cert_auth.password = NULL; } ssl_cert_password_required = 0; + + if (cached_accept_language) { + free(cached_accept_language); + cached_accept_language = NULL; + } } struct active_request_slot *get_active_slot(void) @@ -986,6 +993,166 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, strbuf_addstr(charset, "ISO-8859-1"); } +/* + * Guess the user's preferred languages from the value in LANGUAGE environment + * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined. + * + * The result can be a colon-separated list like "ko:ja:en". + */ +static const char *get_preferred_languages(void) +{ + const char *retval; + + retval = getenv("LANGUAGE"); + if (retval && *retval) + return retval; + +#ifndef NO_GETTEXT + retval = setlocale(LC_MESSAGES, NULL); + if (retval && *retval && + strcmp(retval, "C") && + strcmp(retval, "POSIX")) + return retval; +#endif + + return NULL; +} + +static void write_accept_language(struct strbuf *buf) +{ + const char *lang_begin, *pos; + int q, max_q; + int num_langs; + int decimal_places; + const int CODESET_OR_MODIFIER = 1; + const int LANGUAGE_TAG = 2; + const int SEPARATOR = 3; + int is_q_factor_required = 0; + int parse_state = 0; + char q_format[32]; + /* + * MAX_LANGS must not be larger than 1,000. If it is larger than that, + * q-value will be smaller than 0.001, the minimum q-value the HTTP + * specification allows [1]. + * + * [1]: http://tools.ietf.org/html/rfc7231#section-5.3.1 + */ + const int MAX_LANGS = 1000; + const int MAX_SIZE_OF_HEADER = 4000; + const int MAX_SIZE_OF_ASTERISK_ELEMENT = 11; /* for ", *;q=0.001" */ + int last_size = 0; + + lang_begin = get_preferred_languages(); + + /* Don't add Accept-Language header if no language is preferred. */ + if (!lang_begin) + return; + + /* Count number of preferred lang_begin to decide precision of q-factor. */ + for (num_langs = 1, pos = lang_begin; *pos; pos++) + if (*pos == ':') + num_langs++; + + /* Decide the precision for q-factor on number of preferred lang_begin. */ + num_langs += 1; /* for '*' */ + + if (MAX_LANGS < num_langs) + num_langs = MAX_LANGS; + + for (max_q = 1, decimal_places = 0; + max_q < num_langs; + decimal_places++, max_q *= 10) + ; + + sprintf(q_format, ";q=0.%%0%dd", decimal_places); + + q = max_q; + + strbuf_addstr(buf, "Accept-Language: "); + + /* + * Convert a list of colon-separated locale values [1][2] to a list of + * comma-separated language tags [3] which can be used as a value of + * Accept-Language header. + * + * [1]: http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html + * [2]: http://www.gnu.org/software/libc/manual/html_node/Using-gettextized-software.html + * [3]: http://tools.ietf.org/html/rfc7231#section-5.3.5 + */ + for (pos = lang_begin; ; pos++) { + if (!*pos || *pos == ':') { + if (is_q_factor_required) { + /* Put a q-factor only if it is less than 1.0. */ + if (q < max_q) + strbuf_addf(buf, q_format, q); + + if (q > 1) + q--; + + last_size = buf->len; + + is_q_factor_required = 0; + } + parse_state = SEPARATOR; + } else if (parse_state == CODESET_OR_MODIFIER) + continue; + else if (*pos == ' ') /* Ignore whitespace character */ + continue; + else if (*pos == '.' || *pos == '@') /* Remove .codeset and @modifier. */ + parse_state = CODESET_OR_MODIFIER; + else { + if (parse_state != LANGUAGE_TAG && q < max_q) + strbuf_addstr(buf, ", "); + strbuf_addch(buf, *pos == '_' ? '-' : *pos); + is_q_factor_required = 1; + parse_state = LANGUAGE_TAG; + } + + if (buf->len > MAX_SIZE_OF_HEADER - MAX_SIZE_OF_ASTERISK_ELEMENT) { + strbuf_remove(buf, last_size, buf->len - last_size); + break; + } + + if (!*pos) + break; + } + + /* Don't add Accept-Language header if no language is preferred. */ + if (q >= max_q) { + strbuf_reset(buf); + return; + } + + /* Add '*' with minimum q-factor greater than 0.0. */ + strbuf_addstr(buf, ", *"); + strbuf_addf(buf, q_format, 1); +} + +/* + * Get an Accept-Language header which indicates user's preferred languages. + * + * This function always return non-NULL string as strbuf_detach() does. + * + * Examples: + * LANGUAGE= -> "" + * LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" + * LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1" + * LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" + * LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" + * LANGUAGE= LANG=C -> "" + */ +static const char *get_accept_language(void) +{ + if (!cached_accept_language) { + struct strbuf buf = STRBUF_INIT; + write_accept_language(&buf); + cached_accept_language = strbuf_detach(&buf, NULL); + strbuf_release(&buf); + } + + return cached_accept_language; +} + /* http_request() targets */ #define HTTP_REQUEST_STRBUF 0 #define HTTP_REQUEST_FILE 1 @@ -998,6 +1165,7 @@ static int http_request(const char *url, struct slot_results results; struct curl_slist *headers = NULL; struct strbuf buf = STRBUF_INIT; + const char *accept_language; int ret; slot = get_active_slot(); @@ -1023,6 +1191,11 @@ static int http_request(const char *url, fwrite_buffer); } + accept_language = get_accept_language(); + + if (strlen(accept_language) > 0) + headers = curl_slist_append(headers, accept_language); + strbuf_addstr(&buf, "Pragma:"); if (options && options->no_cache) strbuf_addstr(&buf, " no-cache"); diff --git a/remote-curl.c b/remote-curl.c index dd63bc2..04989e5 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -962,6 +962,8 @@ int main(int argc, const char **argv) struct strbuf buf = STRBUF_INIT; int nongit; + git_setup_gettext(); + git_extract_argv0_path(argv[0]); setup_git_directory_gently(&nongit); if (argc < 2) { diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh index ac71418..1a58b97 100755 --- a/t/t5550-http-fetch-dumb.sh +++ b/t/t5550-http-fetch-dumb.sh @@ -196,5 +196,37 @@ test_expect_success 'reencoding is robust to whitespace oddities' ' grep "this is the error message" stderr ' +check_language () { + echo "Accept-Language: $1" >expect && + GIT_CURL_VERBOSE=1 \ + LANGUAGE=$2 \ + git ls-remote "$HTTPD_URL/dumb/repo.git" 2>&1 | + tr -d '\015' | + sort -u >stderr && + grep -i ^Accept-Language: stderr >actual && + test_cmp expect actual +} + +test_expect_success 'git client sends Accept-Language based on LANGUAGE' ' + check_language "ko-KR, *;q=0.1" ko_KR.UTF-8' + +test_expect_success 'git client sends Accept-Language correctly with unordinary LANGUAGE' ' + check_language "ko-KR, *;q=0.1" "ko_KR:" && + check_language "ko-KR, *;q=0.1" " ko_KR" && + check_language "ko-KR, en-US;q=0.9, *;q=0.1" "ko_KR: en_US" && + check_language "ko-KR, en-US;q=0.9, *;q=0.1" "ko_KR::en_US" && + check_language "ko-KR, *;q=0.1" ":::ko_KR"' + +test_expect_success 'git client sends Accept-Language with many preferred languages' ' + check_language "ko-KR, en-US;q=0.99, fr-CA;q=0.98, de;q=0.97, sr;q=0.96, \ +ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, *;q=0.01" \ + ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb +' + +test_expect_success 'git client does not send an empty Accept-Language' ' + GIT_CURL_VERBOSE=1 LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git" 2>stderr && + ! grep "^Accept-Language:" stderr +' + stop_httpd test_done -- 2.2.0.375.gcd18ce6.dirty ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [PATCH v6 1/1] http: Add Accept-Language header if possible 2014-12-22 16:44 ` [PATCH v6 1/1] " Yi EungJun @ 2014-12-22 19:34 ` Junio C Hamano 2014-12-24 20:35 ` Eric Sunshine 1 sibling, 0 replies; 58+ messages in thread From: Junio C Hamano @ 2014-12-22 19:34 UTC (permalink / raw) To: Yi EungJun Cc: git, Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting, Michael Blume Yi EungJun <semtlenori@gmail.com> writes: > From: Yi EungJun <eungjun.yi@navercorp.com> > > Add an Accept-Language header which indicates the user's preferred > languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG. > > Examples: > LANGUAGE= -> "" > LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1" > LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1" > LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1" > > This gives git servers a chance to display remote error messages in > the user's preferred language. > > Limit the number of languages to 1,000 because q-value must not be > smaller than 0.001, and limit the length of Accept-Language header to > 4,000 bytes for some HTTP servers which cannot accept such long header. > > Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com> Overall, this one is a much more pleasant read than the previous rounds. > @@ -986,6 +993,166 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, > strbuf_addstr(charset, "ISO-8859-1"); > } > > +/* > + * Guess the user's preferred languages from the value in LANGUAGE environment > + * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined. > + * > + * The result can be a colon-separated list like "ko:ja:en". > + */ > +static const char *get_preferred_languages(void) > +{ > + const char *retval; > + > + retval = getenv("LANGUAGE"); > + if (retval && *retval) > + return retval; > + > +#ifndef NO_GETTEXT > + retval = setlocale(LC_MESSAGES, NULL); > + if (retval && *retval && > + strcmp(retval, "C") && > + strcmp(retval, "POSIX")) > + return retval; > +#endif A tangent. I wonder if we want to have something silly like this: #ifndef NO_GETTEXT #define setlocale(x, y) NULL /* or "C"??? */ #endif in a common header (e.g. gettext.h) to avoid sprinkling #ifdefs in our code. While I do not think we call setlocale() that often to warrant such a trick, we already do something very similar to make git_setup_gettext() a no-op in NO_GETTEXT builds in that header file, and the change in this patch to remote-curl.c does take advantage of it already, so... > +static void write_accept_language(struct strbuf *buf) > +{ > + const char *lang_begin, *pos; > + int q, max_q; > + int num_langs; > + int decimal_places; > + const int CODESET_OR_MODIFIER = 1; > + const int LANGUAGE_TAG = 2; > + const int SEPARATOR = 3; Another tangent, but I think we tend to use either #define or enum for constants, not "const int", in our codebase, for symbolic constants. In order to define a set of symbolic constants limited to a function scope, the "const int" way may be nicer than the other two methods we have traditionally used. Perhaps we should promote such use more widely, write our new code following this example, and migrate existing ones over time? I dunno. > ... > + /* Decide the precision for q-factor on number of preferred lang_begin. */ > + num_langs += 1; /* for '*' */ > + > + if (MAX_LANGS < num_langs) > + num_langs = MAX_LANGS; > + > + for (max_q = 1, decimal_places = 0; > + max_q < num_langs; > + decimal_places++, max_q *= 10) > + ; So, if we have 10 languages (num_langs == 10), decimal_places becomes 1, max_q becomes 10 ... > + sprintf(q_format, ";q=0.%%0%dd", decimal_places); ... and we will use "q=0.%01d" as the format. This is OK because the first one is given without q= so we use the format only nine times, counting down from 0.9 to 0.1 in 0.1 increments. Sounds OK to me (I always miscount this kind of loop and need to think aloud while doing a sanity check). > + for (pos = lang_begin; ; pos++) { > + if (!*pos || *pos == ':') { > + if (is_q_factor_required) { > + /* Put a q-factor only if it is less than 1.0. */ > + if (q < max_q) > + strbuf_addf(buf, q_format, q); > + > + if (q > 1) > + q--; When does this "if" statement not trigger? It seems to me that it will stop decrementing only if you have very many languages (e.g. num_langs was clipped to MAX_LANGS), and at that point you would not want to scan and add more languages---is there a reason why you keep going in such a case and not break out of the loop, i.e. if (q-- < 1) break; or something like that? > + ... > + if (buf->len > MAX_SIZE_OF_HEADER - MAX_SIZE_OF_ASTERISK_ELEMENT) { > + strbuf_remove(buf, last_size, buf->len - last_size); > + break; > + } > + > + if (!*pos) > + break; Alternatively use one of these breaks when q goes below 1, perhaps? > +/* > + * Get an Accept-Language header which indicates user's preferred languages. > + * > + * This function always return non-NULL string as strbuf_detach() does. > + * > + * Examples: > + * LANGUAGE= -> "" > + * LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" > + * LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1" > + * LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" > + * LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" > + * LANGUAGE= LANG=C -> "" > + */ > +static const char *get_accept_language(void) > +{ > + if (!cached_accept_language) { > + struct strbuf buf = STRBUF_INIT; > + write_accept_language(&buf); > + cached_accept_language = strbuf_detach(&buf, NULL); > + strbuf_release(&buf); If you detached the associated string from the strbuf, you have already released the resource from it; no need to release it, I would think. > diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh > index ac71418..1a58b97 100755 > --- a/t/t5550-http-fetch-dumb.sh > +++ b/t/t5550-http-fetch-dumb.sh > @@ -196,5 +196,37 @@ test_expect_success 'reencoding is robust to whitespace oddities' ' > grep "this is the error message" stderr > ' > > +check_language () { > + echo "Accept-Language: $1" >expect && > + GIT_CURL_VERBOSE=1 \ > + LANGUAGE=$2 \ > + git ls-remote "$HTTPD_URL/dumb/repo.git" 2>&1 | > + tr -d '\015' | > + sort -u >stderr && > + grep -i ^Accept-Language: stderr >actual && > + test_cmp expect actual > +} This makes it hard to test a case where no Accept-Language: header should be issued in the request, because at that point we would be expecting no matching string in the output. case "$2" in '') >expect ;; ?*) echo "Accept-Language: $1" >expect ;; esac && git ls-remote "$HTTPD_URL/dumb/repo.git" >output 2>&1 && tr -d '\015' <output | sort -u | sed -ne '/^Accept-Language:/' >actual && test_cmp expect actual or something like that, perhaps? And I can see below that we are not testing that "negative" case. After writing a new shiny feature, it always is tempting to show off that it triggers when it is expected to and gives an expected result, but it is equally important to have tests that make sure that the feature does not trigger when it should not. > +test_expect_success 'git client sends Accept-Language based on LANGUAGE' ' > + check_language "ko-KR, *;q=0.1" ko_KR.UTF-8' > + > +test_expect_success 'git client sends Accept-Language correctly with unordinary LANGUAGE' ' > + check_language "ko-KR, *;q=0.1" "ko_KR:" && > + check_language "ko-KR, *;q=0.1" " ko_KR" && > + check_language "ko-KR, en-US;q=0.9, *;q=0.1" "ko_KR: en_US" && > + check_language "ko-KR, en-US;q=0.9, *;q=0.1" "ko_KR::en_US" && > + check_language "ko-KR, *;q=0.1" ":::ko_KR"' > + > +test_expect_success 'git client sends Accept-Language with many preferred languages' ' > + check_language "ko-KR, en-US;q=0.99, fr-CA;q=0.98, de;q=0.97, sr;q=0.96, \ > +ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, *;q=0.01" \ > + ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb > +' ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v6 1/1] http: Add Accept-Language header if possible 2014-12-22 16:44 ` [PATCH v6 1/1] " Yi EungJun 2014-12-22 19:34 ` Junio C Hamano @ 2014-12-24 20:35 ` Eric Sunshine 2014-12-29 16:18 ` Junio C Hamano 1 sibling, 1 reply; 58+ messages in thread From: Eric Sunshine @ 2014-12-24 20:35 UTC (permalink / raw) To: Yi EungJun Cc: Git List, Yi EungJun, Jeff King, Peter Krefting, Junio C Hamano, Michael Blume On Mon, Dec 22, 2014 at 11:44 AM, Yi EungJun <semtlenori@gmail.com> wrote: > From: Yi EungJun <eungjun.yi@navercorp.com> > > Add an Accept-Language header which indicates the user's preferred > languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG. > > Examples: > LANGUAGE= -> "" > LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1" > LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1" > LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1" > > This gives git servers a chance to display remote error messages in > the user's preferred language. > > Limit the number of languages to 1,000 because q-value must not be > smaller than 0.001, and limit the length of Accept-Language header to > 4,000 bytes for some HTTP servers which cannot accept such long header. Just a few comments and observations below. Alone, they are not necessarily worth a re-roll, but if you happen to re-roll for some other reason, perhaps take them into consideration. > Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com> > --- > diff --git a/http.c b/http.c > index 040f362..7a77708 100644 > --- a/http.c > +++ b/http.c > @@ -986,6 +993,166 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, > strbuf_addstr(charset, "ISO-8859-1"); > } > > +static void write_accept_language(struct strbuf *buf) > +{ > + [...] > + /* > + * MAX_LANGS must not be larger than 1,000. If it is larger than that, > + * q-value will be smaller than 0.001, the minimum q-value the HTTP > + * specification allows [1]. > + * > + * [1]: http://tools.ietf.org/html/rfc7231#section-5.3.1 > + */ > + const int MAX_LANGS = 1000; > + const int MAX_SIZE_OF_HEADER = 4000; > + const int MAX_SIZE_OF_ASTERISK_ELEMENT = 11; /* for ", *;q=0.001" */ These two MAX_SIZE_* constants are never used individually, but rather only as (MAX_SIZE_OF_HEADER - MAX_SIZE_OF_ASTERISK_ELEMENT). It might be a bit more readable to compute the final value here, with a suitable comment, rather than at point-of-use. Perhaps something like: /* limit of some HTTP servers is 4000 - strlen(", *;q=0.001") */ const int MAX_HEADER_SIZE = 4000 - 11; More below. > + [...] > + /* > + * Convert a list of colon-separated locale values [1][2] to a list of > + * comma-separated language tags [3] which can be used as a value of > + * Accept-Language header. > + [...] > + */ > + for (pos = lang_begin; ; pos++) { > + if (!*pos || *pos == ':') { > + if (is_q_factor_required) { > + /* Put a q-factor only if it is less than 1.0. */ > + if (q < max_q) > + strbuf_addf(buf, q_format, q); > + > + if (q > 1) > + q--; > + > + last_size = buf->len; > + > + is_q_factor_required = 0; > + } > + parse_state = SEPARATOR; > + } else if (parse_state == CODESET_OR_MODIFIER) > + continue; > + else if (*pos == ' ') /* Ignore whitespace character */ > + continue; > + else if (*pos == '.' || *pos == '@') /* Remove .codeset and @modifier. */ > + parse_state = CODESET_OR_MODIFIER; > + else { > + if (parse_state != LANGUAGE_TAG && q < max_q) > + strbuf_addstr(buf, ", "); > + strbuf_addch(buf, *pos == '_' ? '-' : *pos); > + is_q_factor_required = 1; > + parse_state = LANGUAGE_TAG; > + } > + > + if (buf->len > MAX_SIZE_OF_HEADER - MAX_SIZE_OF_ASTERISK_ELEMENT) { > + strbuf_remove(buf, last_size, buf->len - last_size); > + break; > + } > + > + if (!*pos) > + break; > + } Although often suitable when parsing complex inputs, state machines demand high cognitive load. The input you're parsing, on the other hand, is straightforward and can easily be processed with a simple sequential parser, which is easier to reason about and review for correctness. For instance, something like this: while (*s) { /* collect language tag */ for (; *s && *s != '.' && *s != '@' && *s != ':'; s++) strbuf_addch(buf, *s == '_' ? '-' : *s); /* skip .codeset and @modifier */ while (*s && *s != ':') s++; strbuf_addf(buf, q_format, q); ... other bookkeeping ... if (*s == ':') s++; } This example is intentionally simplified but illustrates the general idea. It lacks comma insertion (left as an exercise for the reader) and empty language tag handling (":en", "en::ko"); and doesn't take whitespace into consideration since it wasn't clear why your v6 parser pays attention to embedded spaces, whereas your earlier versions did not. > + /* Don't add Accept-Language header if no language is preferred. */ > + if (q >= max_q) { > + strbuf_reset(buf); > + return; > + } > + > + /* Add '*' with minimum q-factor greater than 0.0. */ > + strbuf_addstr(buf, ", *"); > + strbuf_addf(buf, q_format, 1); > +} > + > +/* > + * Get an Accept-Language header which indicates user's preferred languages. > + * > + * This function always return non-NULL string as strbuf_detach() does. A couple comments: It's not necessary to explain the public API in terms of an implementation detail. Callers of this function don't care and don't need to know that the value was constructed via strbuf, nor that it is somehow dependent upon the behavior of the underlying implementation of strbuf_detach(). This is a somewhat unusual contract. It's much more common and idiomatic in C to return NULL as an indication of "no preference" (or "failure") than to return an empty string. > + * > + * Examples: > + * LANGUAGE= -> "" > + * LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" > + * LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1" > + * LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" > + * LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" > + * LANGUAGE= LANG=C -> "" > + */ > +static const char *get_accept_language(void) > +{ > + if (!cached_accept_language) { > + struct strbuf buf = STRBUF_INIT; > + write_accept_language(&buf); > + cached_accept_language = strbuf_detach(&buf, NULL); > + strbuf_release(&buf); Junio already mentioned that strbuf_release() is unnecessary following strbuf_detach(). > + } > + > + return cached_accept_language; > +} > + > /* http_request() targets */ > #define HTTP_REQUEST_STRBUF 0 > #define HTTP_REQUEST_FILE 1 ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v6 1/1] http: Add Accept-Language header if possible 2014-12-24 20:35 ` Eric Sunshine @ 2014-12-29 16:18 ` Junio C Hamano 2015-01-18 12:23 ` [PATCH v7 0/1] " Yi EungJun 0 siblings, 1 reply; 58+ messages in thread From: Junio C Hamano @ 2014-12-29 16:18 UTC (permalink / raw) To: Eric Sunshine Cc: Yi EungJun, Git List, Yi EungJun, Jeff King, Peter Krefting, Michael Blume Eric Sunshine <sunshine@sunshineco.com> writes: > Just a few comments and observations below. Alone, they are not > necessarily worth a re-roll, but if you happen to re-roll for some > other reason, perhaps take them into consideration. I actually think everything you said in this review makes sense and will make the resulting code a lot better (especially the part on the parsing loop). Thanks, as usual, for a careful reading. ^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v7 0/1] http: Add Accept-Language header if possible 2014-12-29 16:18 ` Junio C Hamano @ 2015-01-18 12:23 ` Yi EungJun 2015-01-18 12:26 ` [PATCH v7 1/1] " Yi EungJun 0 siblings, 1 reply; 58+ messages in thread From: Yi EungJun @ 2015-01-18 12:23 UTC (permalink / raw) To: Git List Cc: Yi EungJun, Junio C Hamano, Eric Sunshine, Jeff King, Peter Krefting, Michael Blume From: Yi EungJun <eungjun.yi@navercorp.com> Changes since v6 >From Junio C Hamano's review: * Fix check_language() in t5550-http-fetch-dumb.sh as his suggestion. >From Eric Sunshine's review: * Rewrite the parser without state. Yi EungJun (1): http: Add Accept-Language header if possible http.c | 152 +++++++++++++++++++++++++++++++++++++++++++++ remote-curl.c | 2 + t/t5550-http-fetch-dumb.sh | 42 +++++++++++++ 3 files changed, 196 insertions(+) -- 2.2.0.44.g37b3e56.dirty ^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v7 1/1] http: Add Accept-Language header if possible 2015-01-18 12:23 ` [PATCH v7 0/1] " Yi EungJun @ 2015-01-18 12:26 ` Yi EungJun 2015-01-18 15:14 ` Torsten Bögershausen ` (2 more replies) 0 siblings, 3 replies; 58+ messages in thread From: Yi EungJun @ 2015-01-18 12:26 UTC (permalink / raw) To: Git List Cc: Yi EungJun, Junio C Hamano, Eric Sunshine, Jeff King, Peter Krefting, Michael Blume From: Yi EungJun <eungjun.yi@navercorp.com> Add an Accept-Language header which indicates the user's preferred languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG. Examples: LANGUAGE= -> "" LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1" LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1" LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1" This gives git servers a chance to display remote error messages in the user's preferred language. Limit the number of languages to 1,000 because q-value must not be smaller than 0.001, and limit the length of Accept-Language header to 4,000 bytes for some HTTP servers which cannot accept such long header. Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com> --- http.c | 152 +++++++++++++++++++++++++++++++++++++++++++++ remote-curl.c | 2 + t/t5550-http-fetch-dumb.sh | 42 +++++++++++++ 3 files changed, 196 insertions(+) diff --git a/http.c b/http.c index 040f362..349b033 100644 --- a/http.c +++ b/http.c @@ -68,6 +68,8 @@ static struct curl_slist *no_pragma_header; static struct active_request_slot *active_queue_head; +static char *cached_accept_language; + size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_) { size_t size = eltsize * nmemb; @@ -515,6 +517,11 @@ void http_cleanup(void) cert_auth.password = NULL; } ssl_cert_password_required = 0; + + if (cached_accept_language) { + free(cached_accept_language); + cached_accept_language = NULL; + } } struct active_request_slot *get_active_slot(void) @@ -986,6 +993,145 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, strbuf_addstr(charset, "ISO-8859-1"); } +/* + * Guess the user's preferred languages from the value in LANGUAGE environment + * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined. + * + * The result can be a colon-separated list like "ko:ja:en". + */ +static const char *get_preferred_languages(void) +{ + const char *retval; + + retval = getenv("LANGUAGE"); + if (retval && *retval) + return retval; + +#ifndef NO_GETTEXT + retval = setlocale(LC_MESSAGES, NULL); + if (retval && *retval && + strcmp(retval, "C") && + strcmp(retval, "POSIX")) + return retval; +#endif + + return NULL; +} + +static void write_accept_language(struct strbuf *buf) +{ + /* + * MAX_DECIMAL_PLACES must not be larger than 3. If it is larger than + * that, q-value will be smaller than 0.001, the minimum q-value the + * HTTP specification allows. See + * http://tools.ietf.org/html/rfc7231#section-5.3.1 for q-value. + */ + const int MAX_DECIMAL_PLACES = 3; + const int MAX_LANGUAGE_TAGS = 1000; + const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000; + struct strbuf *language_tags = NULL; + int num_langs; + const char *s = get_preferred_languages(); + + /* Don't add Accept-Language header if no language is preferred. */ + if (!s) + return; + + /* + * Split the colon-separated string of preferred languages into + * language_tags array. + */ + do { + /* increase language_tags array to add new language tag */ + REALLOC_ARRAY(language_tags, num_langs + 1); + strbuf_init(&language_tags[num_langs], 0); + + /* collect language tag */ + for (; *s && (isalnum(*s) || *s == '_'); s++) + strbuf_addch(&language_tags[num_langs], *s == '_' ? '-' : *s); + + /* skip .codeset, @modifier and any other unnecessary parts */ + while (*s && *s != ':') + s++; + + if (language_tags[num_langs].len > 0) { + num_langs++; + if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */ + break; + } + } while (*s++); + + /* write Accept-Language header into buf */ + if (num_langs >= 1) { + int i; + int last_buf_len; + int max_q; + int decimal_places; + char q_format[32]; + + /* add '*' */ + REALLOC_ARRAY(language_tags, num_langs + 1); + strbuf_init(&language_tags[num_langs], 0); + strbuf_addstr(&language_tags[num_langs++], "*"); + + /* compute decimal_places */ + for (max_q = 1, decimal_places = 0; + max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES; + decimal_places++, max_q *= 10) + ; + + sprintf(q_format, ";q=0.%%0%dd", decimal_places); + + strbuf_addstr(buf, "Accept-Language: "); + + for(i = 0; i < num_langs; i++) { + if (language_tags[i].len == 0) + continue; + + if (i > 0) + strbuf_addstr(buf, ", "); + + strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL)); + + if (i > 0) + strbuf_addf(buf, q_format, max_q - i); + + if (buf->len > MAX_ACCEPT_LANGUAGE_HEADER_SIZE) { + strbuf_remove(buf, last_buf_len, buf->len - last_buf_len); + break; + } + + last_buf_len = buf->len; + } + } + + free(language_tags); +} + +/* + * Get an Accept-Language header which indicates user's preferred languages. + * + * This function always return non-NULL string as strbuf_detach() does. + * + * Examples: + * LANGUAGE= -> "" + * LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" + * LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1" + * LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" + * LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" + * LANGUAGE= LANG=C -> "" + */ +static const char *get_accept_language(void) +{ + if (!cached_accept_language) { + struct strbuf buf = STRBUF_INIT; + write_accept_language(&buf); + cached_accept_language = strbuf_detach(&buf, NULL); + } + + return cached_accept_language; +} + /* http_request() targets */ #define HTTP_REQUEST_STRBUF 0 #define HTTP_REQUEST_FILE 1 @@ -998,6 +1144,7 @@ static int http_request(const char *url, struct slot_results results; struct curl_slist *headers = NULL; struct strbuf buf = STRBUF_INIT; + const char *accept_language; int ret; slot = get_active_slot(); @@ -1023,6 +1170,11 @@ static int http_request(const char *url, fwrite_buffer); } + accept_language = get_accept_language(); + + if (strlen(accept_language) > 0) + headers = curl_slist_append(headers, accept_language); + strbuf_addstr(&buf, "Pragma:"); if (options && options->no_cache) strbuf_addstr(&buf, " no-cache"); diff --git a/remote-curl.c b/remote-curl.c index dd63bc2..04989e5 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -962,6 +962,8 @@ int main(int argc, const char **argv) struct strbuf buf = STRBUF_INIT; int nongit; + git_setup_gettext(); + git_extract_argv0_path(argv[0]); setup_git_directory_gently(&nongit); if (argc < 2) { diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh index ac71418..e1e2938 100755 --- a/t/t5550-http-fetch-dumb.sh +++ b/t/t5550-http-fetch-dumb.sh @@ -196,5 +196,47 @@ test_expect_success 'reencoding is robust to whitespace oddities' ' grep "this is the error message" stderr ' +check_language () { + case "$2" in + '') + >expect + ;; + ?*) + echo "Accept-Language: $1" >expect + ;; + esac && + GIT_CURL_VERBOSE=1 \ + LANGUAGE=$2 \ + git ls-remote "$HTTPD_URL/dumb/repo.git" >output 2>&1 && + tr -d '\015' <output | + sort -u | + sed -ne '/^Accept-Language:/ p' >actual && + test_cmp expect actual +} + +test_expect_success 'git client sends Accept-Language based on LANGUAGE' ' + check_language "ko-KR, *;q=0.9" ko_KR.UTF-8' + +test_expect_success 'git client sends Accept-Language correctly with unordinary LANGUAGE' ' + check_language "ko-KR, *;q=0.9" "ko_KR:" && + check_language "ko-KR, en-US;q=0.9, *;q=0.8" "ko_KR::en_US" && + check_language "ko-KR, *;q=0.9" ":::ko_KR" && + check_language "ko-KR, en-US;q=0.9, *;q=0.8" "ko_KR!!:en_US" && + check_language "ko-KR, ja-JP;q=0.9, *;q=0.8" "ko_KR en_US:ja_JP"' + +test_expect_success 'git client sends Accept-Language with many preferred languages' ' + check_language "ko-KR, en-US;q=0.9, fr-CA;q=0.8, de;q=0.7, sr;q=0.6, \ +ja;q=0.5, zh;q=0.4, sv;q=0.3, pt;q=0.2, *;q=0.1" \ + ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt && + check_language "ko-KR, en-US;q=0.99, fr-CA;q=0.98, de;q=0.97, sr;q=0.96, \ +ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, *;q=0.90" \ + ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb +' + +test_expect_success 'git client does not send an empty Accept-Language' ' + GIT_CURL_VERBOSE=1 LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git" 2>stderr && + ! grep "^Accept-Language:" stderr +' + stop_httpd test_done -- 2.2.0.44.g37b3e56.dirty ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [PATCH v7 1/1] http: Add Accept-Language header if possible 2015-01-18 12:26 ` [PATCH v7 1/1] " Yi EungJun @ 2015-01-18 15:14 ` Torsten Bögershausen 2015-01-19 20:21 ` [PATCH v6 0/1] " Eric Sunshine 2015-01-22 7:54 ` [PATCH v7 1/1] " Junio C Hamano 2 siblings, 0 replies; 58+ messages in thread From: Torsten Bögershausen @ 2015-01-18 15:14 UTC (permalink / raw) To: Yi EungJun, Git List Cc: Yi EungJun, Junio C Hamano, Eric Sunshine, Jeff King, Peter Krefting, Michael Blume On 18.01.15 13:26, Yi EungJun wrote: > From: Yi EungJun <eungjun.yi@navercorp.com> > diff --git a/http.c b/http.c > index 040f362..349b033 100644 > --- a/http.c > +++ b/http.c > @@ -68,6 +68,8 @@ static struct curl_slist *no_pragma_header; > > static struct active_request_slot *active_queue_head; > > +static char *cached_accept_language; > + > size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_) > { > size_t size = eltsize * nmemb; > @@ -515,6 +517,11 @@ void http_cleanup(void) > cert_auth.password = NULL; > } > ssl_cert_password_required = 0; > + > + if (cached_accept_language) { > + free(cached_accept_language); > + cached_accept_language = NULL; > + } Minor remark: free(NULL) is legal and does nothing. We can simplify the code somewhat: ssl_cert_password_required = 0; free(cached_accept_language); cached_accept_language = NULL; > } > > struct active_request_slot *get_active_slot(void) > @@ -986,6 +993,145 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, > strbuf_addstr(charset, "ISO-8859-1"); > } > > +/* > + * Guess the user's preferred languages from the value in LANGUAGE environment > + * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined. > + * > + * The result can be a colon-separated list like "ko:ja:en". > + */ > +static const char *get_preferred_languages(void) > +{ > + const char *retval; > + > + retval = getenv("LANGUAGE"); > + if (retval && *retval) > + return retval; > + > +#ifndef NO_GETTEXT > + retval = setlocale(LC_MESSAGES, NULL); > + if (retval && *retval && > + strcmp(retval, "C") && > + strcmp(retval, "POSIX")) > + return retval; > +#endif > + > + return NULL; > +} > + > +static void write_accept_language(struct strbuf *buf) > +{ > + /* > + * MAX_DECIMAL_PLACES must not be larger than 3. If it is larger than > + * that, q-value will be smaller than 0.001, the minimum q-value the > + * HTTP specification allows. See > + * http://tools.ietf.org/html/rfc7231#section-5.3.1 for q-value. > + */ > + const int MAX_DECIMAL_PLACES = 3; > + const int MAX_LANGUAGE_TAGS = 1000; > + const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000; > + struct strbuf *language_tags = NULL; > + int num_langs; > + const char *s = get_preferred_languages(); > + > + /* Don't add Accept-Language header if no language is preferred. */ > + if (!s) > + return; > + > + /* > + * Split the colon-separated string of preferred languages into > + * language_tags array. > + */ > + do { > + /* increase language_tags array to add new language tag */ > + REALLOC_ARRAY(language_tags, num_langs + 1); > + strbuf_init(&language_tags[num_langs], 0); > + > + /* collect language tag */ > + for (; *s && (isalnum(*s) || *s == '_'); s++) > + strbuf_addch(&language_tags[num_langs], *s == '_' ? '-' : *s); > + > + /* skip .codeset, @modifier and any other unnecessary parts */ > + while (*s && *s != ':') > + s++; > + > + if (language_tags[num_langs].len > 0) { > + num_langs++; > + if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */ > + break; > + } > + } while (*s++); > + > + /* write Accept-Language header into buf */ > + if (num_langs >= 1) { > + int i; > + int last_buf_len; > + int max_q; > + int decimal_places; > + char q_format[32]; > + > + /* add '*' */ > + REALLOC_ARRAY(language_tags, num_langs + 1); > + strbuf_init(&language_tags[num_langs], 0); > + strbuf_addstr(&language_tags[num_langs++], "*"); > + > + /* compute decimal_places */ > + for (max_q = 1, decimal_places = 0; > + max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES; > + decimal_places++, max_q *= 10) > + ; > + > + sprintf(q_format, ";q=0.%%0%dd", decimal_places); > + > + strbuf_addstr(buf, "Accept-Language: "); > + > + for(i = 0; i < num_langs; i++) { > + if (language_tags[i].len == 0) > + continue; > + > + if (i > 0) > + strbuf_addstr(buf, ", "); > + > + strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL)); > + > + if (i > 0) > + strbuf_addf(buf, q_format, max_q - i); > + > + if (buf->len > MAX_ACCEPT_LANGUAGE_HEADER_SIZE) { > + strbuf_remove(buf, last_buf_len, buf->len - last_buf_len); > + break; > + } > + > + last_buf_len = buf->len; > + } > + } > + > + free(language_tags); > +} > + > +/* > + * Get an Accept-Language header which indicates user's preferred languages. > + * > + * This function always return non-NULL string as strbuf_detach() does. > + * > + * Examples: > + * LANGUAGE= -> "" > + * LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" > + * LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1" > + * LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" > + * LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" > + * LANGUAGE= LANG=C -> "" > + */ > +static const char *get_accept_language(void) > +{ > + if (!cached_accept_language) { > + struct strbuf buf = STRBUF_INIT; > + write_accept_language(&buf); > + cached_accept_language = strbuf_detach(&buf, NULL); > + } > + > + return cached_accept_language; > +} > + > /* http_request() targets */ > #define HTTP_REQUEST_STRBUF 0 > #define HTTP_REQUEST_FILE 1 > @@ -998,6 +1144,7 @@ static int http_request(const char *url, > struct slot_results results; > struct curl_slist *headers = NULL; > struct strbuf buf = STRBUF_INIT; > + const char *accept_language; > int ret; > > slot = get_active_slot(); > @@ -1023,6 +1170,11 @@ static int http_request(const char *url, > fwrite_buffer); > } > > + accept_language = get_accept_language(); > + > + if (strlen(accept_language) > 0) > + headers = curl_slist_append(headers, accept_language); > + > strbuf_addstr(&buf, "Pragma:"); > if (options && options->no_cache) > strbuf_addstr(&buf, " no-cache"); > diff --git a/remote-curl.c b/remote-curl.c > index dd63bc2..04989e5 100644 > --- a/remote-curl.c > +++ b/remote-curl.c > @@ -962,6 +962,8 @@ int main(int argc, const char **argv) > struct strbuf buf = STRBUF_INIT; > int nongit; > > + git_setup_gettext(); > + > git_extract_argv0_path(argv[0]); > setup_git_directory_gently(&nongit); > if (argc < 2) { > diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh > index ac71418..e1e2938 100755 > --- a/t/t5550-http-fetch-dumb.sh > +++ b/t/t5550-http-fetch-dumb.sh > @@ -196,5 +196,47 @@ test_expect_success 'reencoding is robust to whitespace oddities' ' > grep "this is the error message" stderr > ' > > +check_language () { > + case "$2" in > + '') > + >expect > + ;; > + ?*) > + echo "Accept-Language: $1" >expect > + ;; > + esac && > + GIT_CURL_VERBOSE=1 \ > + LANGUAGE=$2 \ > + git ls-remote "$HTTPD_URL/dumb/repo.git" >output 2>&1 && > + tr -d '\015' <output | > + sort -u | > + sed -ne '/^Accept-Language:/ p' >actual && > + test_cmp expect actual > +} > + > +test_expect_success 'git client sends Accept-Language based on LANGUAGE' ' > + check_language "ko-KR, *;q=0.9" ko_KR.UTF-8' > + > +test_expect_success 'git client sends Accept-Language correctly with unordinary LANGUAGE' ' > + check_language "ko-KR, *;q=0.9" "ko_KR:" && > + check_language "ko-KR, en-US;q=0.9, *;q=0.8" "ko_KR::en_US" && > + check_language "ko-KR, *;q=0.9" ":::ko_KR" && > + check_language "ko-KR, en-US;q=0.9, *;q=0.8" "ko_KR!!:en_US" && > + check_language "ko-KR, ja-JP;q=0.9, *;q=0.8" "ko_KR en_US:ja_JP"' > + > +test_expect_success 'git client sends Accept-Language with many preferred languages' ' > + check_language "ko-KR, en-US;q=0.9, fr-CA;q=0.8, de;q=0.7, sr;q=0.6, \ > +ja;q=0.5, zh;q=0.4, sv;q=0.3, pt;q=0.2, *;q=0.1" \ > + ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt && > + check_language "ko-KR, en-US;q=0.99, fr-CA;q=0.98, de;q=0.97, sr;q=0.96, \ > +ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, *;q=0.90" \ > + ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb > +' > + > +test_expect_success 'git client does not send an empty Accept-Language' ' > + GIT_CURL_VERBOSE=1 LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git" 2>stderr && > + ! grep "^Accept-Language:" stderr > +' > + > stop_httpd > test_done > ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v6 0/1] http: Add Accept-Language header if possible 2015-01-18 12:26 ` [PATCH v7 1/1] " Yi EungJun 2015-01-18 15:14 ` Torsten Bögershausen @ 2015-01-19 20:21 ` Eric Sunshine 2015-01-22 7:54 ` [PATCH v7 1/1] " Junio C Hamano 2 siblings, 0 replies; 58+ messages in thread From: Eric Sunshine @ 2015-01-19 20:21 UTC (permalink / raw) To: Yi EungJun Cc: Git List, Yi EungJun, Junio C Hamano, Jeff King, Peter Krefting, Michael Blume On Sunday, January 18, 2015, Yi EungJun <semtlenori@gmail.com> wrote: > Add an Accept-Language header which indicates the user's preferred > languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG. > > Examples: > LANGUAGE= -> "" > LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1" > LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1" > LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1" > > This gives git servers a chance to display remote error messages in > the user's preferred language. > > Limit the number of languages to 1,000 because q-value must not be > smaller than 0.001, and limit the length of Accept-Language header to > 4,000 bytes for some HTTP servers which cannot accept such long header. > > Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com> > --- > diff --git a/http.c b/http.c > index 040f362..349b033 100644 > --- a/http.c > +++ b/http.c > @@ -986,6 +993,145 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, > strbuf_addstr(charset, "ISO-8859-1"); > } > > +static void write_accept_language(struct strbuf *buf) > +{ > + [...] > + const int MAX_DECIMAL_PLACES = 3; > + const int MAX_LANGUAGE_TAGS = 1000; > + const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000; > + struct strbuf *language_tags = NULL; > + int num_langs; Mental note: 'num_langs' is not initialized. > + const char *s = get_preferred_languages(); > + > + /* Don't add Accept-Language header if no language is preferred. */ > + if (!s) > + return; > + > + /* > + * Split the colon-separated string of preferred languages into > + * language_tags array. > + */ > + do { > + /* increase language_tags array to add new language tag */ > + REALLOC_ARRAY(language_tags, num_langs + 1); 'num_langs' is used here uninitialized. > + strbuf_init(&language_tags[num_langs], 0); > + > + /* collect language tag */ > + for (; *s && (isalnum(*s) || *s == '_'); s++) > + strbuf_addch(&language_tags[num_langs], *s == '_' ? '-' : *s); > + > + /* skip .codeset, @modifier and any other unnecessary parts */ > + while (*s && *s != ':') > + s++; > + > + if (language_tags[num_langs].len > 0) { Mental note: An empty ("") language tag is never allowed in language_tags[]. > + num_langs++; This is a little bit ugly. At the top of the loop, you allocate space in the array for a strbuf and initialize it. However, if the language tag is empty (""), then 'num_langs' is never incremented, so the next time through the loop, strbuf_init() is invoked on the same block of memory (assuming the realloc was a no-op since the allocation size did not change), overwriting whatever was there and possibly leaking memory. In this particular case, by examining the parser code closely, we can see that nothing was added to the strbuf, so nothing is being leaked the next time around, given the current implementation of strbuf. However, this is potentially fragile. A change to the implementation of strbuf in the future (for instance, if strbuf_init() allocates memory immediately) could result in a leak here. Moreover, this no-leak situation only holds true if no text at all has been added to the strbuf after strbuf_init(). If someone changes the parser in the future to operate a bit differently so that some text is added and then removed from the strbuf, even though the end result still has length is 0, then it will start leaking. One way to make this more robust would be to have a separate strbuf for collecting the language tag. When you encounter a non-empty tag, only then grow the array and initialize the new strbuf in the array. Finally, use strbuf_swap() to swap the collected language tag into the new array position. Something like this: struct strbuf tag = STRBUF_INIT; do { for (; *s && (isalnum(*s) || *s == '_'); s++) strbuf_addch(&tag, *s == '_' ? '-' : *s); [...] if (tag.len) { num_langs++; REALLOC_ARRAY(language_tags, num_langs); strbuf_init(&language_tags[num_langs], 0); strbuf_swap(&tag, &language_tags[num_langs]); if (num_langs >= ...) break; } while (...); strbuf_release(&tag); > + if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */ > + break; > + } > + } while (*s++); > + > + /* write Accept-Language header into buf */ > + if (num_langs >= 1) { > + int i; > + int last_buf_len; Mental note: 'last_buf_len' is not initialized. > + int max_q; > + int decimal_places; > + char q_format[32]; > + > + /* add '*' */ > + REALLOC_ARRAY(language_tags, num_langs + 1); > + strbuf_init(&language_tags[num_langs], 0); > + strbuf_addstr(&language_tags[num_langs++], "*"); > + > + /* compute decimal_places */ > + for (max_q = 1, decimal_places = 0; > + max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES; > + decimal_places++, max_q *= 10) > + ; > + > + sprintf(q_format, ";q=0.%%0%dd", decimal_places); > + > + strbuf_addstr(buf, "Accept-Language: "); > + > + for(i = 0; i < num_langs; i++) { > + if (language_tags[i].len == 0) > + continue; The parsing code does not allow empty tags ("") in language_tags[], so this conditional is useless, isn't it? > + > + if (i > 0) > + strbuf_addstr(buf, ", "); > + > + strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL)); This leaks the string detached from 'language_tag[i]' since strbuf_addstr() does not take ownership of it. > + if (i > 0) > + strbuf_addf(buf, q_format, max_q - i); > + > + if (buf->len > MAX_ACCEPT_LANGUAGE_HEADER_SIZE) { > + strbuf_remove(buf, last_buf_len, buf->len - last_buf_len); 'last_buf_len' is (potentially) used here uninitialized the first time through loop. > + break; > + } > + > + last_buf_len = buf->len; > + } > + } > + > + free(language_tags); This _seems_ to be okay since strbuf_detach() was invoked for each strbuf in language_tags[], however, those strings were in fact leaked (as noted above), so it's not actually correct. > +} > + > +/* > + * Get an Accept-Language header which indicates user's preferred languages. > + * > + * This function always return non-NULL string as strbuf_detach() does. Repeating from [1]: It's not good form to describe the published API in terms of an implementation detail (strbuf_detach). Also, it would be more idiomatic in C to return NULL rather than empty string. [1]: http://article.gmane.org/gmane.comp.version-control.git/261810/ > + * > + * Examples: > + * LANGUAGE= -> "" > + * LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" > + * LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1" > + * LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" > + * LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" > + * LANGUAGE= LANG=C -> "" > + */ > +static const char *get_accept_language(void) > +{ > + if (!cached_accept_language) { > + struct strbuf buf = STRBUF_INIT; > + write_accept_language(&buf); > + cached_accept_language = strbuf_detach(&buf, NULL); > + } > + > + return cached_accept_language; > +} > + ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v7 1/1] http: Add Accept-Language header if possible 2015-01-18 12:26 ` [PATCH v7 1/1] " Yi EungJun 2015-01-18 15:14 ` Torsten Bögershausen 2015-01-19 20:21 ` [PATCH v6 0/1] " Eric Sunshine @ 2015-01-22 7:54 ` Junio C Hamano 2015-01-27 15:51 ` [PATCH v8 0/1] " Yi EungJun 2 siblings, 1 reply; 58+ messages in thread From: Junio C Hamano @ 2015-01-22 7:54 UTC (permalink / raw) To: Yi EungJun Cc: Git List, Yi EungJun, Eric Sunshine, Jeff King, Peter Krefting, Michael Blume Yi EungJun <semtlenori@gmail.com> writes: > +static void write_accept_language(struct strbuf *buf) > +{ > + /* > + * MAX_DECIMAL_PLACES must not be larger than 3. If it is larger than > + * that, q-value will be smaller than 0.001, the minimum q-value the > + * HTTP specification allows. See > + * http://tools.ietf.org/html/rfc7231#section-5.3.1 for q-value. > + */ > + const int MAX_DECIMAL_PLACES = 3; > + const int MAX_LANGUAGE_TAGS = 1000; > + const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000; > + struct strbuf *language_tags = NULL; > + int num_langs; No initial value given to this variable, but... > + const char *s = get_preferred_languages(); > + > + /* Don't add Accept-Language header if no language is preferred. */ > + if (!s) > + return; > + > + /* > + * Split the colon-separated string of preferred languages into > + * language_tags array. > + */ > + do { > + /* increase language_tags array to add new language tag */ > + REALLOC_ARRAY(language_tags, num_langs + 1); ... it is nevertheless used. I think it was meant to start at 0? > + /* write Accept-Language header into buf */ > + if (num_langs >= 1) { > + int i; > + int last_buf_len; This is uninitialized... > + int max_q; > + int decimal_places; > + char q_format[32]; > + > +... > + if (buf->len > MAX_ACCEPT_LANGUAGE_HEADER_SIZE) { > + strbuf_remove(buf, last_buf_len, buf->len - last_buf_len); ... and then it is used here. ^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v8 0/1] http: Add Accept-Language header if possible 2015-01-22 7:54 ` [PATCH v7 1/1] " Junio C Hamano @ 2015-01-27 15:51 ` Yi EungJun 2015-01-27 15:51 ` [PATCH] " Yi EungJun 0 siblings, 1 reply; 58+ messages in thread From: Yi EungJun @ 2015-01-27 15:51 UTC (permalink / raw) To: Git List Cc: Yi EungJun, Junio C Hamano, Jeff King, Peter Krefting, Michael Blume, Torsten Bögershausen From: Yi EungJun <eungjun.yi@navercorp.com> Change since v7 From Torsten Bögershausen's review: * remove unnecessary if-statement From Eric Sunshine's review: * fix memory leaks and uninitialized variables * remove unnecessary if-statement From Junio C Hamano's review: * fix uninitialized variables Yi EungJun (1): http: Add Accept-Language header if possible http.c | 151 +++++++++++++++++++++++++++++++++++++++++++++ remote-curl.c | 2 + t/t5550-http-fetch-dumb.sh | 42 +++++++++++++ 3 files changed, 195 insertions(+) -- 2.3.0.rc1.32.ga3df1c7 ^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH] http: Add Accept-Language header if possible 2015-01-27 15:51 ` [PATCH v8 0/1] " Yi EungJun @ 2015-01-27 15:51 ` Yi EungJun 2015-01-27 23:34 ` Junio C Hamano 0 siblings, 1 reply; 58+ messages in thread From: Yi EungJun @ 2015-01-27 15:51 UTC (permalink / raw) To: Git List Cc: Yi EungJun, Junio C Hamano, Jeff King, Peter Krefting, Michael Blume, Torsten Bögershausen From: Yi EungJun <eungjun.yi@navercorp.com> Add an Accept-Language header which indicates the user's preferred languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG. Examples: LANGUAGE= -> "" LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1" LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1" LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1" This gives git servers a chance to display remote error messages in the user's preferred language. Limit the number of languages to 1,000 because q-value must not be smaller than 0.001, and limit the length of Accept-Language header to 4,000 bytes for some HTTP servers which cannot accept such long header. Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com> --- http.c | 151 +++++++++++++++++++++++++++++++++++++++++++++ remote-curl.c | 2 + t/t5550-http-fetch-dumb.sh | 42 +++++++++++++ 3 files changed, 195 insertions(+) diff --git a/http.c b/http.c index 040f362..6111c6a 100644 --- a/http.c +++ b/http.c @@ -68,6 +68,8 @@ static struct curl_slist *no_pragma_header; static struct active_request_slot *active_queue_head; +static char *cached_accept_language; + size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_) { size_t size = eltsize * nmemb; @@ -515,6 +517,9 @@ void http_cleanup(void) cert_auth.password = NULL; } ssl_cert_password_required = 0; + + free(cached_accept_language); + cached_accept_language = NULL; } struct active_request_slot *get_active_slot(void) @@ -986,6 +991,146 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, strbuf_addstr(charset, "ISO-8859-1"); } +/* + * Guess the user's preferred languages from the value in LANGUAGE environment + * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined. + * + * The result can be a colon-separated list like "ko:ja:en". + */ +static const char *get_preferred_languages(void) +{ + const char *retval; + + retval = getenv("LANGUAGE"); + if (retval && *retval) + return retval; + +#ifndef NO_GETTEXT + retval = setlocale(LC_MESSAGES, NULL); + if (retval && *retval && + strcmp(retval, "C") && + strcmp(retval, "POSIX")) + return retval; +#endif + + return NULL; +} + +static void write_accept_language(struct strbuf *buf) +{ + /* + * MAX_DECIMAL_PLACES must not be larger than 3. If it is larger than + * that, q-value will be smaller than 0.001, the minimum q-value the + * HTTP specification allows. See + * http://tools.ietf.org/html/rfc7231#section-5.3.1 for q-value. + */ + const int MAX_DECIMAL_PLACES = 3; + const int MAX_LANGUAGE_TAGS = 1000; + const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000; + struct strbuf *language_tags = NULL; + int num_langs = 0; + const char *s = get_preferred_languages(); + int i; + struct strbuf tag = STRBUF_INIT; + + /* Don't add Accept-Language header if no language is preferred. */ + if (!s) + return; + + /* + * Split the colon-separated string of preferred languages into + * language_tags array. + */ + do { + /* collect language tag */ + for (; *s && (isalnum(*s) || *s == '_'); s++) + strbuf_addch(&tag, *s == '_' ? '-' : *s); + + /* skip .codeset, @modifier and any other unnecessary parts */ + while (*s && *s != ':') + s++; + + if (tag.len) { + num_langs++; + REALLOC_ARRAY(language_tags, num_langs); + strbuf_init(&language_tags[num_langs - 1], 0); + strbuf_swap(&tag, &language_tags[num_langs - 1]); + + if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */ + break; + } + } while (*s++); + + /* write Accept-Language header into buf */ + if (num_langs >= 1) { + int last_buf_len = 0; + int max_q; + int decimal_places; + char q_format[32]; + + /* add '*' */ + REALLOC_ARRAY(language_tags, num_langs + 1); + strbuf_init(&language_tags[num_langs], 0); + strbuf_addstr(&language_tags[num_langs++], "*"); + + /* compute decimal_places */ + for (max_q = 1, decimal_places = 0; + max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES; + decimal_places++, max_q *= 10) + ; + + sprintf(q_format, ";q=0.%%0%dd", decimal_places); + + strbuf_addstr(buf, "Accept-Language: "); + + for(i = 0; i < num_langs; i++) { + if (i > 0) + strbuf_addstr(buf, ", "); + + strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL)); + + if (i > 0) + strbuf_addf(buf, q_format, max_q - i); + + if (buf->len > MAX_ACCEPT_LANGUAGE_HEADER_SIZE) { + strbuf_remove(buf, last_buf_len, buf->len - last_buf_len); + break; + } + + last_buf_len = buf->len; + } + } + + /* free language tags */ + for(i = 0; i < num_langs; i++) { + strbuf_release(&language_tags[i]); + } + free(language_tags); +} + +/* + * Get an Accept-Language header which indicates user's preferred languages. + * + * Examples: + * LANGUAGE= -> "" + * LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" + * LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1" + * LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" + * LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" + * LANGUAGE= LANG=C -> "" + */ +static const char *get_accept_language(void) +{ + if (!cached_accept_language) { + struct strbuf buf = STRBUF_INIT; + write_accept_language(&buf); + if (buf.len > 0) + cached_accept_language = strbuf_detach(&buf, NULL); + } + + return cached_accept_language; +} + /* http_request() targets */ #define HTTP_REQUEST_STRBUF 0 #define HTTP_REQUEST_FILE 1 @@ -998,6 +1143,7 @@ static int http_request(const char *url, struct slot_results results; struct curl_slist *headers = NULL; struct strbuf buf = STRBUF_INIT; + const char *accept_language; int ret; slot = get_active_slot(); @@ -1023,6 +1169,11 @@ static int http_request(const char *url, fwrite_buffer); } + accept_language = get_accept_language(); + + if (accept_language) + headers = curl_slist_append(headers, accept_language); + strbuf_addstr(&buf, "Pragma:"); if (options && options->no_cache) strbuf_addstr(&buf, " no-cache"); diff --git a/remote-curl.c b/remote-curl.c index dd63bc2..04989e5 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -962,6 +962,8 @@ int main(int argc, const char **argv) struct strbuf buf = STRBUF_INIT; int nongit; + git_setup_gettext(); + git_extract_argv0_path(argv[0]); setup_git_directory_gently(&nongit); if (argc < 2) { diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh index ac71418..e1e2938 100755 --- a/t/t5550-http-fetch-dumb.sh +++ b/t/t5550-http-fetch-dumb.sh @@ -196,5 +196,47 @@ test_expect_success 'reencoding is robust to whitespace oddities' ' grep "this is the error message" stderr ' +check_language () { + case "$2" in + '') + >expect + ;; + ?*) + echo "Accept-Language: $1" >expect + ;; + esac && + GIT_CURL_VERBOSE=1 \ + LANGUAGE=$2 \ + git ls-remote "$HTTPD_URL/dumb/repo.git" >output 2>&1 && + tr -d '\015' <output | + sort -u | + sed -ne '/^Accept-Language:/ p' >actual && + test_cmp expect actual +} + +test_expect_success 'git client sends Accept-Language based on LANGUAGE' ' + check_language "ko-KR, *;q=0.9" ko_KR.UTF-8' + +test_expect_success 'git client sends Accept-Language correctly with unordinary LANGUAGE' ' + check_language "ko-KR, *;q=0.9" "ko_KR:" && + check_language "ko-KR, en-US;q=0.9, *;q=0.8" "ko_KR::en_US" && + check_language "ko-KR, *;q=0.9" ":::ko_KR" && + check_language "ko-KR, en-US;q=0.9, *;q=0.8" "ko_KR!!:en_US" && + check_language "ko-KR, ja-JP;q=0.9, *;q=0.8" "ko_KR en_US:ja_JP"' + +test_expect_success 'git client sends Accept-Language with many preferred languages' ' + check_language "ko-KR, en-US;q=0.9, fr-CA;q=0.8, de;q=0.7, sr;q=0.6, \ +ja;q=0.5, zh;q=0.4, sv;q=0.3, pt;q=0.2, *;q=0.1" \ + ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt && + check_language "ko-KR, en-US;q=0.99, fr-CA;q=0.98, de;q=0.97, sr;q=0.96, \ +ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, *;q=0.90" \ + ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb +' + +test_expect_success 'git client does not send an empty Accept-Language' ' + GIT_CURL_VERBOSE=1 LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git" 2>stderr && + ! grep "^Accept-Language:" stderr +' + stop_httpd test_done -- 2.3.0.rc1.32.ga3df1c7 ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [PATCH] http: Add Accept-Language header if possible 2015-01-27 15:51 ` [PATCH] " Yi EungJun @ 2015-01-27 23:34 ` Junio C Hamano 2015-01-28 6:15 ` Junio C Hamano 0 siblings, 1 reply; 58+ messages in thread From: Junio C Hamano @ 2015-01-27 23:34 UTC (permalink / raw) To: Yi EungJun Cc: Git List, Yi EungJun, Jeff King, Peter Krefting, Michael Blume, Torsten Bögershausen Yi EungJun <semtlenori@gmail.com> writes: > +static void write_accept_language(struct strbuf *buf) > +{ > + /* > + * MAX_DECIMAL_PLACES must not be larger than 3. If it is larger than > + * that, q-value will be smaller than 0.001, the minimum q-value the > + * HTTP specification allows. See > + * http://tools.ietf.org/html/rfc7231#section-5.3.1 for q-value. > + */ > + const int MAX_DECIMAL_PLACES = 3; > + const int MAX_LANGUAGE_TAGS = 1000; > + const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000; > + struct strbuf *language_tags = NULL; > + int num_langs = 0; > + const char *s = get_preferred_languages(); > + int i; > + struct strbuf tag = STRBUF_INIT; > + > + /* Don't add Accept-Language header if no language is preferred. */ > + if (!s) > + return; > + > + /* > + * Split the colon-separated string of preferred languages into > + * language_tags array. > + */ > + do { > + /* collect language tag */ > + for (; *s && (isalnum(*s) || *s == '_'); s++) > + strbuf_addch(&tag, *s == '_' ? '-' : *s); > + > + /* skip .codeset, @modifier and any other unnecessary parts */ > + while (*s && *s != ':') > + s++; > + > + if (tag.len) { > + num_langs++; > + REALLOC_ARRAY(language_tags, num_langs); > + strbuf_init(&language_tags[num_langs - 1], 0); > + strbuf_swap(&tag, &language_tags[num_langs - 1]); > + > + if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */ > + break; > + } > + } while (*s++); The structure of this function is much easier to understand than any of the previous rounds. You collect up to the max you are going to support, and then you format up to the max you are going to send. Very straight-forward and simple. > + /* write Accept-Language header into buf */ > + if (num_langs >= 1) { micronit: should be OK to just say "if (num_langs)". > + int last_buf_len = 0; > + int max_q; > + int decimal_places; > + char q_format[32]; > + > + /* add '*' */ > + REALLOC_ARRAY(language_tags, num_langs + 1); > + strbuf_init(&language_tags[num_langs], 0); > + strbuf_addstr(&language_tags[num_langs++], "*"); > + > + /* compute decimal_places */ > + for (max_q = 1, decimal_places = 0; > + max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES; > + decimal_places++, max_q *= 10) > + ; micronit: the second and the third line are indented too deeply and made me wonder if this has an overlong first line (i.e. the set-up part to enter the for-loop) split into multiple lines. > + > + sprintf(q_format, ";q=0.%%0%dd", decimal_places); > + > + strbuf_addstr(buf, "Accept-Language: "); > + > + for(i = 0; i < num_langs; i++) { > + if (i > 0) > + strbuf_addstr(buf, ", "); > + > + strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL)); This is not wrong per-se, but it looks somewhat convoluted to me. You can just peek language_tags[i].buf here without detaching, as you will free the strbufs after you exit the loop anyway. Your version is not wrong and it does not result in double-freeing, because detach clears the strbuf, but at the same time, makes the responsibility to free language_tags[] strbuf split into here (for elements up to the ones that are used to fill buf) and the cleanup loop (for elements that are not used in this loop). > + if (i > 0) > + strbuf_addf(buf, q_format, max_q - i); > + > + if (buf->len > MAX_ACCEPT_LANGUAGE_HEADER_SIZE) { > + strbuf_remove(buf, last_buf_len, buf->len - last_buf_len); > + break; > + } > + > + last_buf_len = buf->len; > + } > + } > + > + /* free language tags */ > + for(i = 0; i < num_langs; i++) { > + strbuf_release(&language_tags[i]); > + } > + free(language_tags); > +} I am wondering if using strbuf for each of the language_tags[] is even necessary. How about doing it this way instead? http.c | 22 +++++++++------------- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/http.c b/http.c index 6111c6a..db591b3 100644 --- a/http.c +++ b/http.c @@ -1027,7 +1027,7 @@ static void write_accept_language(struct strbuf *buf) const int MAX_DECIMAL_PLACES = 3; const int MAX_LANGUAGE_TAGS = 1000; const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000; - struct strbuf *language_tags = NULL; + char **language_tags = NULL; int num_langs = 0; const char *s = get_preferred_languages(); int i; @@ -1053,9 +1053,7 @@ static void write_accept_language(struct strbuf *buf) if (tag.len) { num_langs++; REALLOC_ARRAY(language_tags, num_langs); - strbuf_init(&language_tags[num_langs - 1], 0); - strbuf_swap(&tag, &language_tags[num_langs - 1]); - + language_tags[num_langs - 1] = strbuf_detach(&tag, NULL); if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */ break; } @@ -1070,13 +1068,12 @@ static void write_accept_language(struct strbuf *buf) /* add '*' */ REALLOC_ARRAY(language_tags, num_langs + 1); - strbuf_init(&language_tags[num_langs], 0); - strbuf_addstr(&language_tags[num_langs++], "*"); + language_tags[num_langs++] = "*"; /* it's OK; this won't be freed */ /* compute decimal_places */ for (max_q = 1, decimal_places = 0; - max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES; - decimal_places++, max_q *= 10) + max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES; + decimal_places++, max_q *= 10) ; sprintf(q_format, ";q=0.%%0%dd", decimal_places); @@ -1087,7 +1084,7 @@ static void write_accept_language(struct strbuf *buf) if (i > 0) strbuf_addstr(buf, ", "); - strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL)); + strbuf_addstr(buf, language_tags[i]); if (i > 0) strbuf_addf(buf, q_format, max_q - i); @@ -1101,10 +1098,9 @@ static void write_accept_language(struct strbuf *buf) } } - /* free language tags */ - for(i = 0; i < num_langs; i++) { - strbuf_release(&language_tags[i]); - } + /* free language tags -- last one is a static '*' */ + for(i = 0; i < num_langs - 1; i++) + free(language_tags[i]); free(language_tags); } ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [PATCH] http: Add Accept-Language header if possible 2015-01-27 23:34 ` Junio C Hamano @ 2015-01-28 6:15 ` Junio C Hamano 2015-01-28 11:59 ` Yi, EungJun 2015-01-28 12:04 ` [PATCH v9 0/1] " Yi EungJun 0 siblings, 2 replies; 58+ messages in thread From: Junio C Hamano @ 2015-01-28 6:15 UTC (permalink / raw) To: Yi EungJun Cc: Git List, Yi EungJun, Jeff King, Peter Krefting, Michael Blume, Torsten Bögershausen On Tue, Jan 27, 2015 at 3:34 PM, Junio C Hamano <gitster@pobox.com> wrote: > Yi EungJun <semtlenori@gmail.com> writes: > >> + >> + sprintf(q_format, ";q=0.%%0%dd", decimal_places); >> + >> + strbuf_addstr(buf, "Accept-Language: "); >> + >> + for(i = 0; i < num_langs; i++) { >> + if (i > 0) >> + strbuf_addstr(buf, ", "); >> + >> + strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL)); > > This is not wrong per-se, but it looks somewhat convoluted to me. > ... Actually, this is wrong, isn't it? strbuf_detach() removes the language_tags[i].buf from the strbuf, and the caller now owns that piece of memory. Then strbuf_addstr() appends a copy of that string to buf, and the piece of memory that was originally held by language_tags[i].buf is now lost forever. This is leaking. >> + /* free language tags */ >> + for(i = 0; i < num_langs; i++) { >> + strbuf_release(&language_tags[i]); >> + } ... because this loop does not free memory for earlier parts of language_tags[]. > I am wondering if using strbuf for each of the language_tags[] is > even necessary. How about doing it this way instead? And I think my counter-proposal does not leak (as it does not us strbuf for language_tags[] anymore). > > http.c | 22 +++++++++------------- > 1 file changed, 9 insertions(+), 13 deletions(-) > > diff --git a/http.c b/http.c > index 6111c6a..db591b3 100644 > --- a/http.c > +++ b/http.c > @@ -1027,7 +1027,7 @@ static void write_accept_language(struct strbuf *buf) > const int MAX_DECIMAL_PLACES = 3; > const int MAX_LANGUAGE_TAGS = 1000; > const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000; > - struct strbuf *language_tags = NULL; > + char **language_tags = NULL; > int num_langs = 0; > const char *s = get_preferred_languages(); > int i; > @@ -1053,9 +1053,7 @@ static void write_accept_language(struct strbuf *buf) > if (tag.len) { > num_langs++; > REALLOC_ARRAY(language_tags, num_langs); > - strbuf_init(&language_tags[num_langs - 1], 0); > - strbuf_swap(&tag, &language_tags[num_langs - 1]); > - > + language_tags[num_langs - 1] = strbuf_detach(&tag, NULL); > if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */ > break; > } > @@ -1070,13 +1068,12 @@ static void write_accept_language(struct strbuf *buf) > > /* add '*' */ > REALLOC_ARRAY(language_tags, num_langs + 1); > - strbuf_init(&language_tags[num_langs], 0); > - strbuf_addstr(&language_tags[num_langs++], "*"); > + language_tags[num_langs++] = "*"; /* it's OK; this won't be freed */ > > /* compute decimal_places */ > for (max_q = 1, decimal_places = 0; > - max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES; > - decimal_places++, max_q *= 10) > + max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES; > + decimal_places++, max_q *= 10) > ; > > sprintf(q_format, ";q=0.%%0%dd", decimal_places); > @@ -1087,7 +1084,7 @@ static void write_accept_language(struct strbuf *buf) > if (i > 0) > strbuf_addstr(buf, ", "); > > - strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL)); > + strbuf_addstr(buf, language_tags[i]); > > if (i > 0) > strbuf_addf(buf, q_format, max_q - i); > @@ -1101,10 +1098,9 @@ static void write_accept_language(struct strbuf *buf) > } > } > > - /* free language tags */ > - for(i = 0; i < num_langs; i++) { > - strbuf_release(&language_tags[i]); > - } > + /* free language tags -- last one is a static '*' */ > + for(i = 0; i < num_langs - 1; i++) > + free(language_tags[i]); > free(language_tags); > } > ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] http: Add Accept-Language header if possible 2015-01-28 6:15 ` Junio C Hamano @ 2015-01-28 11:59 ` Yi, EungJun 2015-01-28 12:04 ` [PATCH v9 0/1] " Yi EungJun 1 sibling, 0 replies; 58+ messages in thread From: Yi, EungJun @ 2015-01-28 11:59 UTC (permalink / raw) To: Junio C Hamano Cc: Git List, Yi EungJun, Jeff King, Peter Krefting, Michael Blume, Torsten Bögershausen I agree that a list of char* is enough for language_tags. Thanks for your review and patch. I'll apply your patch and send v9. On Wed, Jan 28, 2015 at 3:15 PM, Junio C Hamano <gitster@pobox.com> wrote: > On Tue, Jan 27, 2015 at 3:34 PM, Junio C Hamano <gitster@pobox.com> wrote: >> Yi EungJun <semtlenori@gmail.com> writes: >> >>> + >>> + sprintf(q_format, ";q=0.%%0%dd", decimal_places); >>> + >>> + strbuf_addstr(buf, "Accept-Language: "); >>> + >>> + for(i = 0; i < num_langs; i++) { >>> + if (i > 0) >>> + strbuf_addstr(buf, ", "); >>> + >>> + strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL)); >> >> This is not wrong per-se, but it looks somewhat convoluted to me. >> ... > > Actually, this is wrong, isn't it? > > strbuf_detach() removes the language_tags[i].buf from the strbuf, > and the caller now owns that piece of memory. Then strbuf_addstr() > appends a copy of that string to buf, and the piece of memory > that was originally held by language_tags[i].buf is now lost forever. > > This is leaking. > >>> + /* free language tags */ >>> + for(i = 0; i < num_langs; i++) { >>> + strbuf_release(&language_tags[i]); >>> + } > > ... because this loop does not free memory for earlier parts of language_tags[]. > >> I am wondering if using strbuf for each of the language_tags[] is >> even necessary. How about doing it this way instead? > > And I think my counter-proposal does not leak (as it does not us strbuf for > language_tags[] anymore). > >> >> http.c | 22 +++++++++------------- >> 1 file changed, 9 insertions(+), 13 deletions(-) >> >> diff --git a/http.c b/http.c >> index 6111c6a..db591b3 100644 >> --- a/http.c >> +++ b/http.c >> @@ -1027,7 +1027,7 @@ static void write_accept_language(struct strbuf *buf) >> const int MAX_DECIMAL_PLACES = 3; >> const int MAX_LANGUAGE_TAGS = 1000; >> const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000; >> - struct strbuf *language_tags = NULL; >> + char **language_tags = NULL; >> int num_langs = 0; >> const char *s = get_preferred_languages(); >> int i; >> @@ -1053,9 +1053,7 @@ static void write_accept_language(struct strbuf *buf) >> if (tag.len) { >> num_langs++; >> REALLOC_ARRAY(language_tags, num_langs); >> - strbuf_init(&language_tags[num_langs - 1], 0); >> - strbuf_swap(&tag, &language_tags[num_langs - 1]); >> - >> + language_tags[num_langs - 1] = strbuf_detach(&tag, NULL); >> if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */ >> break; >> } >> @@ -1070,13 +1068,12 @@ static void write_accept_language(struct strbuf *buf) >> >> /* add '*' */ >> REALLOC_ARRAY(language_tags, num_langs + 1); >> - strbuf_init(&language_tags[num_langs], 0); >> - strbuf_addstr(&language_tags[num_langs++], "*"); >> + language_tags[num_langs++] = "*"; /* it's OK; this won't be freed */ >> >> /* compute decimal_places */ >> for (max_q = 1, decimal_places = 0; >> - max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES; >> - decimal_places++, max_q *= 10) >> + max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES; >> + decimal_places++, max_q *= 10) >> ; >> >> sprintf(q_format, ";q=0.%%0%dd", decimal_places); >> @@ -1087,7 +1084,7 @@ static void write_accept_language(struct strbuf *buf) >> if (i > 0) >> strbuf_addstr(buf, ", "); >> >> - strbuf_addstr(buf, strbuf_detach(&language_tags[i], NULL)); >> + strbuf_addstr(buf, language_tags[i]); >> >> if (i > 0) >> strbuf_addf(buf, q_format, max_q - i); >> @@ -1101,10 +1098,9 @@ static void write_accept_language(struct strbuf *buf) >> } >> } >> >> - /* free language tags */ >> - for(i = 0; i < num_langs; i++) { >> - strbuf_release(&language_tags[i]); >> - } >> + /* free language tags -- last one is a static '*' */ >> + for(i = 0; i < num_langs - 1; i++) >> + free(language_tags[i]); >> free(language_tags); >> } >> ^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v9 0/1] http: Add Accept-Language header if possible 2015-01-28 6:15 ` Junio C Hamano 2015-01-28 11:59 ` Yi, EungJun @ 2015-01-28 12:04 ` Yi EungJun 2015-01-28 12:04 ` [PATCH v9 1/1] " Yi EungJun 2015-01-29 6:19 ` [PATCH v9 0/1] " Junio C Hamano 1 sibling, 2 replies; 58+ messages in thread From: Yi EungJun @ 2015-01-28 12:04 UTC (permalink / raw) To: Git List Cc: Yi EungJun, Junio C Hamano, Jeff King, Peter Krefting, Michael Blume, Torsten Bögershausen From: Yi EungJun <eungjun.yi@navercorp.com> Change since v8 Apply Junio's patch: Use an array of char* instead of strbuf for language_tags. Yi EungJun (1): http: Add Accept-Language header if possible http.c | 147 +++++++++++++++++++++++++++++++++++++++++++++ remote-curl.c | 2 + t/t5550-http-fetch-dumb.sh | 42 +++++++++++++ 3 files changed, 191 insertions(+) -- 2.3.0.rc1.32.g7a36c04 ^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v9 1/1] http: Add Accept-Language header if possible 2015-01-28 12:04 ` [PATCH v9 0/1] " Yi EungJun @ 2015-01-28 12:04 ` Yi EungJun 2015-02-25 22:52 ` Junio C Hamano 2015-01-29 6:19 ` [PATCH v9 0/1] " Junio C Hamano 1 sibling, 1 reply; 58+ messages in thread From: Yi EungJun @ 2015-01-28 12:04 UTC (permalink / raw) To: Git List Cc: Yi EungJun, Junio C Hamano, Jeff King, Peter Krefting, Michael Blume, Torsten Bögershausen From: Yi EungJun <eungjun.yi@navercorp.com> Add an Accept-Language header which indicates the user's preferred languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG. Examples: LANGUAGE= -> "" LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1" LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1" LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1" This gives git servers a chance to display remote error messages in the user's preferred language. Limit the number of languages to 1,000 because q-value must not be smaller than 0.001, and limit the length of Accept-Language header to 4,000 bytes for some HTTP servers which cannot accept such long header. Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com> --- http.c | 147 +++++++++++++++++++++++++++++++++++++++++++++ remote-curl.c | 2 + t/t5550-http-fetch-dumb.sh | 42 +++++++++++++ 3 files changed, 191 insertions(+) diff --git a/http.c b/http.c index 040f362..b2ad2a8 100644 --- a/http.c +++ b/http.c @@ -68,6 +68,8 @@ static struct curl_slist *no_pragma_header; static struct active_request_slot *active_queue_head; +static char *cached_accept_language; + size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_) { size_t size = eltsize * nmemb; @@ -515,6 +517,9 @@ void http_cleanup(void) cert_auth.password = NULL; } ssl_cert_password_required = 0; + + free(cached_accept_language); + cached_accept_language = NULL; } struct active_request_slot *get_active_slot(void) @@ -986,6 +991,142 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, strbuf_addstr(charset, "ISO-8859-1"); } +/* + * Guess the user's preferred languages from the value in LANGUAGE environment + * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined. + * + * The result can be a colon-separated list like "ko:ja:en". + */ +static const char *get_preferred_languages(void) +{ + const char *retval; + + retval = getenv("LANGUAGE"); + if (retval && *retval) + return retval; + +#ifndef NO_GETTEXT + retval = setlocale(LC_MESSAGES, NULL); + if (retval && *retval && + strcmp(retval, "C") && + strcmp(retval, "POSIX")) + return retval; +#endif + + return NULL; +} + +static void write_accept_language(struct strbuf *buf) +{ + /* + * MAX_DECIMAL_PLACES must not be larger than 3. If it is larger than + * that, q-value will be smaller than 0.001, the minimum q-value the + * HTTP specification allows. See + * http://tools.ietf.org/html/rfc7231#section-5.3.1 for q-value. + */ + const int MAX_DECIMAL_PLACES = 3; + const int MAX_LANGUAGE_TAGS = 1000; + const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000; + char **language_tags = NULL; + int num_langs = 0; + const char *s = get_preferred_languages(); + int i; + struct strbuf tag = STRBUF_INIT; + + /* Don't add Accept-Language header if no language is preferred. */ + if (!s) + return; + + /* + * Split the colon-separated string of preferred languages into + * language_tags array. + */ + do { + /* collect language tag */ + for (; *s && (isalnum(*s) || *s == '_'); s++) + strbuf_addch(&tag, *s == '_' ? '-' : *s); + + /* skip .codeset, @modifier and any other unnecessary parts */ + while (*s && *s != ':') + s++; + + if (tag.len) { + num_langs++; + REALLOC_ARRAY(language_tags, num_langs); + language_tags[num_langs - 1] = strbuf_detach(&tag, NULL); + if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */ + break; + } + } while (*s++); + + /* write Accept-Language header into buf */ + if (num_langs) { + int last_buf_len = 0; + int max_q; + int decimal_places; + char q_format[32]; + + /* add '*' */ + REALLOC_ARRAY(language_tags, num_langs + 1); + language_tags[num_langs++] = "*"; /* it's OK; this won't be freed */ + + /* compute decimal_places */ + for (max_q = 1, decimal_places = 0; + max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES; + decimal_places++, max_q *= 10) + ; + + sprintf(q_format, ";q=0.%%0%dd", decimal_places); + + strbuf_addstr(buf, "Accept-Language: "); + + for(i = 0; i < num_langs; i++) { + if (i > 0) + strbuf_addstr(buf, ", "); + + strbuf_addstr(buf, language_tags[i]); + + if (i > 0) + strbuf_addf(buf, q_format, max_q - i); + + if (buf->len > MAX_ACCEPT_LANGUAGE_HEADER_SIZE) { + strbuf_remove(buf, last_buf_len, buf->len - last_buf_len); + break; + } + + last_buf_len = buf->len; + } + } + + /* free language tags -- last one is a static '*' */ + for(i = 0; i < num_langs - 1; i++) + free(language_tags[i]); + free(language_tags); +} + +/* + * Get an Accept-Language header which indicates user's preferred languages. + * + * Examples: + * LANGUAGE= -> "" + * LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1" + * LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1" + * LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1" + * LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1" + * LANGUAGE= LANG=C -> "" + */ +static const char *get_accept_language(void) +{ + if (!cached_accept_language) { + struct strbuf buf = STRBUF_INIT; + write_accept_language(&buf); + if (buf.len > 0) + cached_accept_language = strbuf_detach(&buf, NULL); + } + + return cached_accept_language; +} + /* http_request() targets */ #define HTTP_REQUEST_STRBUF 0 #define HTTP_REQUEST_FILE 1 @@ -998,6 +1139,7 @@ static int http_request(const char *url, struct slot_results results; struct curl_slist *headers = NULL; struct strbuf buf = STRBUF_INIT; + const char *accept_language; int ret; slot = get_active_slot(); @@ -1023,6 +1165,11 @@ static int http_request(const char *url, fwrite_buffer); } + accept_language = get_accept_language(); + + if (accept_language) + headers = curl_slist_append(headers, accept_language); + strbuf_addstr(&buf, "Pragma:"); if (options && options->no_cache) strbuf_addstr(&buf, " no-cache"); diff --git a/remote-curl.c b/remote-curl.c index dd63bc2..04989e5 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -962,6 +962,8 @@ int main(int argc, const char **argv) struct strbuf buf = STRBUF_INIT; int nongit; + git_setup_gettext(); + git_extract_argv0_path(argv[0]); setup_git_directory_gently(&nongit); if (argc < 2) { diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh index ac71418..e1e2938 100755 --- a/t/t5550-http-fetch-dumb.sh +++ b/t/t5550-http-fetch-dumb.sh @@ -196,5 +196,47 @@ test_expect_success 'reencoding is robust to whitespace oddities' ' grep "this is the error message" stderr ' +check_language () { + case "$2" in + '') + >expect + ;; + ?*) + echo "Accept-Language: $1" >expect + ;; + esac && + GIT_CURL_VERBOSE=1 \ + LANGUAGE=$2 \ + git ls-remote "$HTTPD_URL/dumb/repo.git" >output 2>&1 && + tr -d '\015' <output | + sort -u | + sed -ne '/^Accept-Language:/ p' >actual && + test_cmp expect actual +} + +test_expect_success 'git client sends Accept-Language based on LANGUAGE' ' + check_language "ko-KR, *;q=0.9" ko_KR.UTF-8' + +test_expect_success 'git client sends Accept-Language correctly with unordinary LANGUAGE' ' + check_language "ko-KR, *;q=0.9" "ko_KR:" && + check_language "ko-KR, en-US;q=0.9, *;q=0.8" "ko_KR::en_US" && + check_language "ko-KR, *;q=0.9" ":::ko_KR" && + check_language "ko-KR, en-US;q=0.9, *;q=0.8" "ko_KR!!:en_US" && + check_language "ko-KR, ja-JP;q=0.9, *;q=0.8" "ko_KR en_US:ja_JP"' + +test_expect_success 'git client sends Accept-Language with many preferred languages' ' + check_language "ko-KR, en-US;q=0.9, fr-CA;q=0.8, de;q=0.7, sr;q=0.6, \ +ja;q=0.5, zh;q=0.4, sv;q=0.3, pt;q=0.2, *;q=0.1" \ + ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt && + check_language "ko-KR, en-US;q=0.99, fr-CA;q=0.98, de;q=0.97, sr;q=0.96, \ +ja;q=0.95, zh;q=0.94, sv;q=0.93, pt;q=0.92, nb;q=0.91, *;q=0.90" \ + ko_KR.EUC-KR:en_US.UTF-8:fr_CA:de.UTF-8@euro:sr@latin:ja:zh:sv:pt:nb +' + +test_expect_success 'git client does not send an empty Accept-Language' ' + GIT_CURL_VERBOSE=1 LANGUAGE= git ls-remote "$HTTPD_URL/dumb/repo.git" 2>stderr && + ! grep "^Accept-Language:" stderr +' + stop_httpd test_done -- 2.3.0.rc1.32.g7a36c04 ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible 2015-01-28 12:04 ` [PATCH v9 1/1] " Yi EungJun @ 2015-02-25 22:52 ` Junio C Hamano 2015-02-26 3:04 ` Jeff King 0 siblings, 1 reply; 58+ messages in thread From: Junio C Hamano @ 2015-02-25 22:52 UTC (permalink / raw) To: Yi EungJun Cc: Git List, Yi EungJun, Jeff King, Peter Krefting, Michael Blume, Torsten Bögershausen Yi EungJun <semtlenori@gmail.com> writes: > From: Yi EungJun <eungjun.yi@navercorp.com> > > Add an Accept-Language header which indicates the user's preferred > languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG. > ... > Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com> > --- Yikes. This is now in 'master', but I wonder if people are getting compilation errors because of this change. I do. It introduces a call to setlocale() without causing <locale.h> to be included, and runs afoul of -Wimplicit-function-declaration. Other call sites of setlocale() are in gettext.c, which does include the header at the beginning. > diff --git a/http.c b/http.c > index 040f362..b2ad2a8 100644 > --- a/http.c > +++ b/http.c > ... > +#ifndef NO_GETTEXT > + retval = setlocale(LC_MESSAGES, NULL); > + if (retval && *retval && > + strcmp(retval, "C") && > + strcmp(retval, "POSIX")) > + return retval; > +#endif I really do not like a conditional inclusion of system header files inside any *.c file, but here is a minimum emergency fix-up I am running with today. It should go to somewhere in git-compat-util.h. Somebody care to throw a tested fix-up patch at me? Thanks. http.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/http.c b/http.c index efdab09..7e79cbd 100644 --- a/http.c +++ b/http.c @@ -9,6 +9,10 @@ #include "version.h" #include "pkt-line.h" +#ifndef NO_GETTEXT +#include <locale.h> +#endif + int active_requests; int http_is_verbose; size_t http_post_buffer = 16 * LARGE_PACKET_MAX; ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible 2015-02-25 22:52 ` Junio C Hamano @ 2015-02-26 3:04 ` Jeff King 2015-02-26 3:10 ` Jeff King 2015-02-26 20:59 ` Junio C Hamano 0 siblings, 2 replies; 58+ messages in thread From: Jeff King @ 2015-02-26 3:04 UTC (permalink / raw) To: Junio C Hamano Cc: Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume, Torsten Bögershausen On Wed, Feb 25, 2015 at 02:52:26PM -0800, Junio C Hamano wrote: > This is now in 'master', but I wonder if people are getting > compilation errors because of this change. I do. I usually compile with NO_GETTEXT, but if I stop doing so, I see the problem, too. > I really do not like a conditional inclusion of system header files > inside any *.c file, but here is a minimum emergency fix-up I am > running with today. It should go to somewhere in git-compat-util.h. Perhaps it would be less risky to stick get_preferred_languages() into gettext.c, like the patch below. Then we do not have to worry about locale.h introducing other disruptive includes. The function is not technically about gettext, but it seems reasonable to me to stuff all of the i18n code together. Another variant of this would for gettext.c to provide a git_setlocale that just wraps setlocale (and does nothing when NO_GETTEXT is given). diff --git a/gettext.c b/gettext.c index 8b2da46..7378ba2 100644 --- a/gettext.c +++ b/gettext.c @@ -18,6 +18,31 @@ # endif #endif +/* + * Guess the user's preferred languages from the value in LANGUAGE environment + * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined. + * + * The result can be a colon-separated list like "ko:ja:en". + */ +const char *get_preferred_languages(void) +{ + const char *retval; + + retval = getenv("LANGUAGE"); + if (retval && *retval) + return retval; + +#ifndef NO_GETTEXT + retval = setlocale(LC_MESSAGES, NULL); + if (retval && *retval && + strcmp(retval, "C") && + strcmp(retval, "POSIX")) + return retval; +#endif + + return NULL; +} + #ifdef GETTEXT_POISON int use_gettext_poison(void) { diff --git a/gettext.h b/gettext.h index dc1722d..5d8d2df 100644 --- a/gettext.h +++ b/gettext.h @@ -89,4 +89,6 @@ const char *Q_(const char *msgid, const char *plu, unsigned long n) #define N_(msgid) (msgid) #endif +const char *get_preferred_languages(); + #endif diff --git a/http.c b/http.c index 0153fb0..9c825af 100644 --- a/http.c +++ b/http.c @@ -8,6 +8,7 @@ #include "credential.h" #include "version.h" #include "pkt-line.h" +#include "gettext.h" int active_requests; int http_is_verbose; @@ -1002,32 +1003,6 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, strbuf_addstr(charset, "ISO-8859-1"); } - -/* - * Guess the user's preferred languages from the value in LANGUAGE environment - * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined. - * - * The result can be a colon-separated list like "ko:ja:en". - */ -static const char *get_preferred_languages(void) -{ - const char *retval; - - retval = getenv("LANGUAGE"); - if (retval && *retval) - return retval; - -#ifndef NO_GETTEXT - retval = setlocale(LC_MESSAGES, NULL); - if (retval && *retval && - strcmp(retval, "C") && - strcmp(retval, "POSIX")) - return retval; -#endif - - return NULL; -} - static void write_accept_language(struct strbuf *buf) { /* ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible 2015-02-26 3:04 ` Jeff King @ 2015-02-26 3:10 ` Jeff King 2015-02-26 20:59 ` Junio C Hamano 1 sibling, 0 replies; 58+ messages in thread From: Jeff King @ 2015-02-26 3:10 UTC (permalink / raw) To: Junio C Hamano Cc: Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume, Torsten Bögershausen On Wed, Feb 25, 2015 at 10:04:16PM -0500, Jeff King wrote: > Another variant of this would for gettext.c to provide a git_setlocale > that just wraps setlocale (and does nothing when NO_GETTEXT is given). This doesn't _quite_ work. In addition to the function, we have to have LC_MESSAGES defined. So we cannot provide a straight git_setlocale, but it has to be something like "git_message_locale". At which point we might as well just provide get_preferred_languages() as the interface between gettext.c and the rest of the program. -Peff ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible 2015-02-26 3:04 ` Jeff King 2015-02-26 3:10 ` Jeff King @ 2015-02-26 20:59 ` Junio C Hamano 2015-02-26 21:33 ` Jeff King 1 sibling, 1 reply; 58+ messages in thread From: Junio C Hamano @ 2015-02-26 20:59 UTC (permalink / raw) To: Jeff King Cc: Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume, Torsten Bögershausen Jeff King <peff@peff.net> writes: > Perhaps it would be less risky to stick get_preferred_languages() into > gettext.c, like the patch below. Then we do not have to worry about > locale.h introducing other disruptive includes. The function is not > technically about gettext, but it seems reasonable to me to stuff all of > the i18n code together. Yeah, I like that a lot better. Thanks. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible 2015-02-26 20:59 ` Junio C Hamano @ 2015-02-26 21:33 ` Jeff King 2015-02-26 21:42 ` Junio C Hamano 0 siblings, 1 reply; 58+ messages in thread From: Jeff King @ 2015-02-26 21:33 UTC (permalink / raw) To: Junio C Hamano Cc: Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume, Torsten Bögershausen On Thu, Feb 26, 2015 at 12:59:56PM -0800, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > Perhaps it would be less risky to stick get_preferred_languages() into > > gettext.c, like the patch below. Then we do not have to worry about > > locale.h introducing other disruptive includes. The function is not > > technically about gettext, but it seems reasonable to me to stuff all of > > the i18n code together. > > Yeah, I like that a lot better. Thanks. Are you just using it for inspiration, or did you want me to wrap it up with a commit message? -Peff ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible 2015-02-26 21:33 ` Jeff King @ 2015-02-26 21:42 ` Junio C Hamano 2015-02-26 21:47 ` Stefan Beller 0 siblings, 1 reply; 58+ messages in thread From: Junio C Hamano @ 2015-02-26 21:42 UTC (permalink / raw) To: Jeff King Cc: Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume, Torsten Bögershausen Jeff King <peff@peff.net> writes: > On Thu, Feb 26, 2015 at 12:59:56PM -0800, Junio C Hamano wrote: > >> Jeff King <peff@peff.net> writes: >> >> > Perhaps it would be less risky to stick get_preferred_languages() into >> > gettext.c, like the patch below. Then we do not have to worry about >> > locale.h introducing other disruptive includes. The function is not >> > technically about gettext, but it seems reasonable to me to stuff all of >> > the i18n code together. >> >> Yeah, I like that a lot better. Thanks. > > Are you just using it for inspiration, or did you want me to wrap it up > with a commit message? Here is what I queued. Thanks. -- >8 -- From: Jeff King <peff@peff.net> Date: Wed, 25 Feb 2015 22:04:16 -0500 Subject: [PATCH] gettext.c: move get_preferred_languages() from http.c Calling setlocale(LC_MESSAGES, ...) directly from http.c, without including <locale.h>, was causing compilation warnings. Move the helper function to gettext.c that already includes the header and where locale-related issues are handled. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> --- gettext.c | 25 +++++++++++++++++++++++++ gettext.h | 2 ++ http.c | 1 + 3 files changed, 28 insertions(+) diff --git a/gettext.c b/gettext.c index 8b2da46..7378ba2 100644 --- a/gettext.c +++ b/gettext.c @@ -18,6 +18,31 @@ # endif #endif +/* + * Guess the user's preferred languages from the value in LANGUAGE environment + * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined. + * + * The result can be a colon-separated list like "ko:ja:en". + */ +const char *get_preferred_languages(void) +{ + const char *retval; + + retval = getenv("LANGUAGE"); + if (retval && *retval) + return retval; + +#ifndef NO_GETTEXT + retval = setlocale(LC_MESSAGES, NULL); + if (retval && *retval && + strcmp(retval, "C") && + strcmp(retval, "POSIX")) + return retval; +#endif + + return NULL; +} + #ifdef GETTEXT_POISON int use_gettext_poison(void) { diff --git a/gettext.h b/gettext.h index 7671d09..e539482 100644 --- a/gettext.h +++ b/gettext.h @@ -65,4 +65,6 @@ const char *Q_(const char *msgid, const char *plu, unsigned long n) /* Mark msgid for translation but do not translate it. */ #define N_(msgid) msgid +const char *get_preferred_languages(void); + #endif diff --git a/http.c b/http.c index 8b659b6..71ed418 100644 --- a/http.c +++ b/http.c @@ -8,6 +8,7 @@ #include "credential.h" #include "version.h" #include "pkt-line.h" +#include "gettext.h" int active_requests; int http_is_verbose; -- 2.3.1-280-g2531f2d ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible 2015-02-26 21:42 ` Junio C Hamano @ 2015-02-26 21:47 ` Stefan Beller 2015-02-26 22:06 ` Jeff King 0 siblings, 1 reply; 58+ messages in thread From: Stefan Beller @ 2015-02-26 21:47 UTC (permalink / raw) To: Junio C Hamano Cc: Jeff King, Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume, Torsten Bögershausen On Thu, Feb 26, 2015 at 1:42 PM, Junio C Hamano <gitster@pobox.com> wrote:> > Here is what I queued. Thanks. I did not follow the thread if there are any intermediate patches, though it applied cleanly. Applying this on top of f18604bbf2c391c689a41fca14cbaeff5e106255 (http: add Accept-Language header if possible) still doesn't compile for me. http.c:1001:20: error: static declaration of 'get_preferred_languages' follows non-static declaration static const char *get_preferred_languages(void) ^ In file included from cache.h:8:0, from http.h:4, from http.c:2: gettext.h:68:13: note: previous declaration of 'get_preferred_languages' was here const char *get_preferred_languages(void); ^ http.c: In function 'get_preferred_languages': http.c:1010:2: warning: implicit declaration of function 'setlocale' [-Wimplicit-function-declaration] retval = setlocale(LC_MESSAGES, NULL); ^ http.c:1010:21: error: 'LC_MESSAGES' undeclared (first use in this function) retval = setlocale(LC_MESSAGES, NULL); ^ http.c:1010:21: note: each undeclared identifier is reported only once for each function it appears in Rebasing this on top of current master (Post 2.3 cyle (batch #5)) also fails: http.c:1013:20: error: static declaration of 'get_preferred_languages' follows non-static declaration static const char *get_preferred_languages(void) ^ In file included from cache.h:8:0, from http.h:4, from http.c:2: gettext.h:92:13: note: previous declaration of 'get_preferred_languages' was here const char *get_preferred_languages(void); ^ http.c: In function 'get_preferred_languages': http.c:1022:2: warning: implicit declaration of function 'setlocale' [-Wimplicit-function-declaration] retval = setlocale(LC_MESSAGES, NULL); ^ http.c:1022:21: error: 'LC_MESSAGES' undeclared (first use in this function) retval = setlocale(LC_MESSAGES, NULL); ^ http.c:1022:21: note: each undeclared identifier is reported only once for each function it appears in > > -- >8 -- > From: Jeff King <peff@peff.net> > Date: Wed, 25 Feb 2015 22:04:16 -0500 > Subject: [PATCH] gettext.c: move get_preferred_languages() from http.c > > Calling setlocale(LC_MESSAGES, ...) directly from http.c, without > including <locale.h>, was causing compilation warnings. Move the > helper function to gettext.c that already includes the header and > where locale-related issues are handled. > > Signed-off-by: Jeff King <peff@peff.net> > Signed-off-by: Junio C Hamano <gitster@pobox.com> > --- > gettext.c | 25 +++++++++++++++++++++++++ > gettext.h | 2 ++ > http.c | 1 + > 3 files changed, 28 insertions(+) > > diff --git a/gettext.c b/gettext.c > index 8b2da46..7378ba2 100644 > --- a/gettext.c > +++ b/gettext.c > @@ -18,6 +18,31 @@ > # endif > #endif > > +/* > + * Guess the user's preferred languages from the value in LANGUAGE environment > + * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined. > + * > + * The result can be a colon-separated list like "ko:ja:en". > + */ > +const char *get_preferred_languages(void) > +{ > + const char *retval; > + > + retval = getenv("LANGUAGE"); > + if (retval && *retval) > + return retval; > + > +#ifndef NO_GETTEXT > + retval = setlocale(LC_MESSAGES, NULL); > + if (retval && *retval && > + strcmp(retval, "C") && > + strcmp(retval, "POSIX")) > + return retval; > +#endif > + > + return NULL; > +} > + > #ifdef GETTEXT_POISON > int use_gettext_poison(void) > { > diff --git a/gettext.h b/gettext.h > index 7671d09..e539482 100644 > --- a/gettext.h > +++ b/gettext.h > @@ -65,4 +65,6 @@ const char *Q_(const char *msgid, const char *plu, unsigned long n) > /* Mark msgid for translation but do not translate it. */ > #define N_(msgid) msgid > > +const char *get_preferred_languages(void); > + > #endif > diff --git a/http.c b/http.c > index 8b659b6..71ed418 100644 > --- a/http.c > +++ b/http.c > @@ -8,6 +8,7 @@ > #include "credential.h" > #include "version.h" > #include "pkt-line.h" > +#include "gettext.h" > > int active_requests; > int http_is_verbose; > -- > 2.3.1-280-g2531f2d > > -- > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible 2015-02-26 21:47 ` Stefan Beller @ 2015-02-26 22:06 ` Jeff King 2015-02-26 22:07 ` Jeff King 2015-02-26 22:13 ` Junio C Hamano 0 siblings, 2 replies; 58+ messages in thread From: Jeff King @ 2015-02-26 22:06 UTC (permalink / raw) To: Stefan Beller Cc: Junio C Hamano, Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume, Torsten Bögershausen On Thu, Feb 26, 2015 at 01:47:34PM -0800, Stefan Beller wrote: > On Thu, Feb 26, 2015 at 1:42 PM, Junio C Hamano <gitster@pobox.com> wrote:> > > Here is what I queued. Thanks. > > I did not follow the thread if there are any intermediate patches, > though it applied cleanly. What Junio posted is missing the hunk to drop the old static definition of get_preferred_languages from http.c. -Peff ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible 2015-02-26 22:06 ` Jeff King @ 2015-02-26 22:07 ` Jeff King 2015-02-26 22:26 ` Stefan Beller 2015-02-26 22:13 ` Junio C Hamano 1 sibling, 1 reply; 58+ messages in thread From: Jeff King @ 2015-02-26 22:07 UTC (permalink / raw) To: Stefan Beller Cc: Junio C Hamano, Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume, Torsten Bögershausen On Thu, Feb 26, 2015 at 05:06:10PM -0500, Jeff King wrote: > On Thu, Feb 26, 2015 at 01:47:34PM -0800, Stefan Beller wrote: > > > On Thu, Feb 26, 2015 at 1:42 PM, Junio C Hamano <gitster@pobox.com> wrote:> > > > Here is what I queued. Thanks. > > > > I did not follow the thread if there are any intermediate patches, > > though it applied cleanly. > > What Junio posted is missing the hunk to drop the old static definition > of get_preferred_languages from http.c. Here it is, with the commit message and the missing hunk. This works for me both with and without NO_GETTEXT defined. -- >8 -- Subject: [PATCH] gettext.c: move get_preferred_languages() from http.c Calling setlocale(LC_MESSAGES, ...) directly from http.c, without including <locale.h>, was causing compilation warnings. Move the helper function to gettext.c that already includes the header and where locale-related issues are handled. Signed-off-by: Jeff King <peff@peff.net> --- gettext.c | 25 +++++++++++++++++++++++++ gettext.h | 2 ++ http.c | 27 +-------------------------- 3 files changed, 28 insertions(+), 26 deletions(-) diff --git a/gettext.c b/gettext.c index 8b2da46..7378ba2 100644 --- a/gettext.c +++ b/gettext.c @@ -18,6 +18,31 @@ # endif #endif +/* + * Guess the user's preferred languages from the value in LANGUAGE environment + * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined. + * + * The result can be a colon-separated list like "ko:ja:en". + */ +const char *get_preferred_languages(void) +{ + const char *retval; + + retval = getenv("LANGUAGE"); + if (retval && *retval) + return retval; + +#ifndef NO_GETTEXT + retval = setlocale(LC_MESSAGES, NULL); + if (retval && *retval && + strcmp(retval, "C") && + strcmp(retval, "POSIX")) + return retval; +#endif + + return NULL; +} + #ifdef GETTEXT_POISON int use_gettext_poison(void) { diff --git a/gettext.h b/gettext.h index dc1722d..5d8d2df 100644 --- a/gettext.h +++ b/gettext.h @@ -89,4 +89,6 @@ const char *Q_(const char *msgid, const char *plu, unsigned long n) #define N_(msgid) (msgid) #endif +const char *get_preferred_languages(); + #endif diff --git a/http.c b/http.c index 0153fb0..9c825af 100644 --- a/http.c +++ b/http.c @@ -8,6 +8,7 @@ #include "credential.h" #include "version.h" #include "pkt-line.h" +#include "gettext.h" int active_requests; int http_is_verbose; @@ -1002,32 +1003,6 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, strbuf_addstr(charset, "ISO-8859-1"); } - -/* - * Guess the user's preferred languages from the value in LANGUAGE environment - * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined. - * - * The result can be a colon-separated list like "ko:ja:en". - */ -static const char *get_preferred_languages(void) -{ - const char *retval; - - retval = getenv("LANGUAGE"); - if (retval && *retval) - return retval; - -#ifndef NO_GETTEXT - retval = setlocale(LC_MESSAGES, NULL); - if (retval && *retval && - strcmp(retval, "C") && - strcmp(retval, "POSIX")) - return retval; -#endif - - return NULL; -} - static void write_accept_language(struct strbuf *buf) { /* -- 2.3.0.449.g1690e78 ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible 2015-02-26 22:07 ` Jeff King @ 2015-02-26 22:26 ` Stefan Beller 2015-02-26 22:36 ` Jeff King 0 siblings, 1 reply; 58+ messages in thread From: Stefan Beller @ 2015-02-26 22:26 UTC (permalink / raw) To: Jeff King Cc: Junio C Hamano, Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume, Torsten Bögershausen On Thu, Feb 26, 2015 at 2:07 PM, Jeff King <peff@peff.net> wrote: > > Here it is, with the commit message and the missing hunk. This works for > me both with and without NO_GETTEXT defined. This compiles here though a warning is spit: In file included from cache.h:8:0, from userdiff.c:1: gettext.h:92:1: warning: function declaration isn't a prototype [-Wstrict-prototypes] const char *get_preferred_languages(); ^ so I guess I can still add a Tested-by: Stefan Beller <sbeller@google.com> > > -- >8 -- > Subject: [PATCH] gettext.c: move get_preferred_languages() from http.c > > Calling setlocale(LC_MESSAGES, ...) directly from http.c, > without including <locale.h>, was causing compilation > warnings. Move the helper function to gettext.c that > already includes the header and where locale-related issues > are handled. > > Signed-off-by: Jeff King <peff@peff.net> > --- > gettext.c | 25 +++++++++++++++++++++++++ > gettext.h | 2 ++ > http.c | 27 +-------------------------- > 3 files changed, 28 insertions(+), 26 deletions(-) > > diff --git a/gettext.c b/gettext.c > index 8b2da46..7378ba2 100644 > --- a/gettext.c > +++ b/gettext.c > @@ -18,6 +18,31 @@ > # endif > #endif > > +/* > + * Guess the user's preferred languages from the value in LANGUAGE environment > + * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined. > + * > + * The result can be a colon-separated list like "ko:ja:en". > + */ > +const char *get_preferred_languages(void) > +{ > + const char *retval; > + > + retval = getenv("LANGUAGE"); > + if (retval && *retval) > + return retval; > + > +#ifndef NO_GETTEXT > + retval = setlocale(LC_MESSAGES, NULL); > + if (retval && *retval && > + strcmp(retval, "C") && > + strcmp(retval, "POSIX")) > + return retval; > +#endif > + > + return NULL; > +} > + > #ifdef GETTEXT_POISON > int use_gettext_poison(void) > { > diff --git a/gettext.h b/gettext.h > index dc1722d..5d8d2df 100644 > --- a/gettext.h > +++ b/gettext.h > @@ -89,4 +89,6 @@ const char *Q_(const char *msgid, const char *plu, unsigned long n) > #define N_(msgid) (msgid) > #endif > > +const char *get_preferred_languages(); > + > #endif > diff --git a/http.c b/http.c > index 0153fb0..9c825af 100644 > --- a/http.c > +++ b/http.c > @@ -8,6 +8,7 @@ > #include "credential.h" > #include "version.h" > #include "pkt-line.h" > +#include "gettext.h" > > int active_requests; > int http_is_verbose; > @@ -1002,32 +1003,6 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, > strbuf_addstr(charset, "ISO-8859-1"); > } > > - > -/* > - * Guess the user's preferred languages from the value in LANGUAGE environment > - * variable and LC_MESSAGES locale category if NO_GETTEXT is not defined. > - * > - * The result can be a colon-separated list like "ko:ja:en". > - */ > -static const char *get_preferred_languages(void) > -{ > - const char *retval; > - > - retval = getenv("LANGUAGE"); > - if (retval && *retval) > - return retval; > - > -#ifndef NO_GETTEXT > - retval = setlocale(LC_MESSAGES, NULL); > - if (retval && *retval && > - strcmp(retval, "C") && > - strcmp(retval, "POSIX")) > - return retval; > -#endif > - > - return NULL; > -} > - > static void write_accept_language(struct strbuf *buf) > { > /* > -- > 2.3.0.449.g1690e78 > ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible 2015-02-26 22:26 ` Stefan Beller @ 2015-02-26 22:36 ` Jeff King 2015-02-26 22:45 ` Jeff King 0 siblings, 1 reply; 58+ messages in thread From: Jeff King @ 2015-02-26 22:36 UTC (permalink / raw) To: Stefan Beller Cc: Junio C Hamano, Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume, Torsten Bögershausen On Thu, Feb 26, 2015 at 02:26:05PM -0800, Stefan Beller wrote: > On Thu, Feb 26, 2015 at 2:07 PM, Jeff King <peff@peff.net> wrote: > > > > Here it is, with the commit message and the missing hunk. This works for > > me both with and without NO_GETTEXT defined. > > This compiles here though a warning is spit: > In file included from cache.h:8:0, > from userdiff.c:1: > gettext.h:92:1: warning: function declaration isn't a prototype > [-Wstrict-prototypes] > const char *get_preferred_languages(); > ^ Hmph. The compiler is right that it should be: const char *get_preferred_languages(void); but my gcc (4.9.2, with -Wstrict_prototypes) does not seem to notice it! Weird. -Peff ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible 2015-02-26 22:36 ` Jeff King @ 2015-02-26 22:45 ` Jeff King 2015-02-26 23:29 ` Junio C Hamano 0 siblings, 1 reply; 58+ messages in thread From: Jeff King @ 2015-02-26 22:45 UTC (permalink / raw) To: Stefan Beller Cc: Junio C Hamano, Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume, Torsten Bögershausen On Thu, Feb 26, 2015 at 05:36:03PM -0500, Jeff King wrote: > > [-Wstrict-prototypes] > > const char *get_preferred_languages(); > > ^ > > Hmph. The compiler is right that it should be: > > const char *get_preferred_languages(void); > > but my gcc (4.9.2, with -Wstrict_prototypes) does not seem to notice it! > Weird. Ugh. I have a snippet in my config.mak that relaxes the warnings on older versions of git, and it was accidentally triggering due to a typo. :( So that explains that. Junio, do you mind squashing in: diff --git a/gettext.h b/gettext.h index 5d8d2df..33696a4 100644 --- a/gettext.h +++ b/gettext.h @@ -89,6 +89,6 @@ const char *Q_(const char *msgid, const char *plu, unsigned long n) #define N_(msgid) (msgid) #endif -const char *get_preferred_languages(); +const char *get_preferred_languages(void); #endif ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible 2015-02-26 22:45 ` Jeff King @ 2015-02-26 23:29 ` Junio C Hamano 0 siblings, 0 replies; 58+ messages in thread From: Junio C Hamano @ 2015-02-26 23:29 UTC (permalink / raw) To: Jeff King Cc: Stefan Beller, Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume, Torsten Bögershausen Jeff King <peff@peff.net> writes: > On Thu, Feb 26, 2015 at 05:36:03PM -0500, Jeff King wrote: > >> > [-Wstrict-prototypes] >> > const char *get_preferred_languages(); >> > ^ >> >> Hmph. The compiler is right that it should be: >> >> const char *get_preferred_languages(void); >> >> but my gcc (4.9.2, with -Wstrict_prototypes) does not seem to notice it! >> Weird. > > Ugh. I have a snippet in my config.mak that relaxes the warnings on older > versions of git, and it was accidentally triggering due to a typo. :( > > So that explains that. Junio, do you mind squashing in: Yup, I already did when I got the first one. Thanks. > > diff --git a/gettext.h b/gettext.h > index 5d8d2df..33696a4 100644 > --- a/gettext.h > +++ b/gettext.h > @@ -89,6 +89,6 @@ const char *Q_(const char *msgid, const char *plu, unsigned long n) > #define N_(msgid) (msgid) > #endif > > -const char *get_preferred_languages(); > +const char *get_preferred_languages(void); > > #endif ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v9 1/1] http: Add Accept-Language header if possible 2015-02-26 22:06 ` Jeff King 2015-02-26 22:07 ` Jeff King @ 2015-02-26 22:13 ` Junio C Hamano 1 sibling, 0 replies; 58+ messages in thread From: Junio C Hamano @ 2015-02-26 22:13 UTC (permalink / raw) To: Jeff King Cc: Stefan Beller, Yi EungJun, Git List, Yi EungJun, Peter Krefting, Michael Blume, Torsten Bögershausen Jeff King <peff@peff.net> writes: > On Thu, Feb 26, 2015 at 01:47:34PM -0800, Stefan Beller wrote: > >> On Thu, Feb 26, 2015 at 1:42 PM, Junio C Hamano <gitster@pobox.com> wrote:> >> > Here is what I queued. Thanks. >> >> I did not follow the thread if there are any intermediate patches, >> though it applied cleanly. > > What Junio posted is missing the hunk to drop the old static definition > of get_preferred_languages from http.c. I am still scratching my head to see how this happened, but I think when I did $ git checkout ye/http-accept-language $ git apply -3 $gmane/264422 I took the wrong side of the confict in http.c Thanks both for noticing. Now it is fixed up. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v9 0/1] http: Add Accept-Language header if possible 2015-01-28 12:04 ` [PATCH v9 0/1] " Yi EungJun 2015-01-28 12:04 ` [PATCH v9 1/1] " Yi EungJun @ 2015-01-29 6:19 ` Junio C Hamano 2015-01-30 17:23 ` Yi, EungJun 1 sibling, 1 reply; 58+ messages in thread From: Junio C Hamano @ 2015-01-29 6:19 UTC (permalink / raw) To: Yi EungJun Cc: Git List, Yi EungJun, Jeff King, Peter Krefting, Michael Blume, Torsten Bögershausen Thanks; queued. Let's run with this and try to make it graduate early next cycle. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH v9 0/1] http: Add Accept-Language header if possible 2015-01-29 6:19 ` [PATCH v9 0/1] " Junio C Hamano @ 2015-01-30 17:23 ` Yi, EungJun 0 siblings, 0 replies; 58+ messages in thread From: Yi, EungJun @ 2015-01-30 17:23 UTC (permalink / raw) To: Junio C Hamano Cc: Git List, Yi EungJun, Jeff King, Peter Krefting, Michael Blume, Torsten Bögershausen I'm very glad to hear that. Thanks to all reviewers! On Thu, Jan 29, 2015 at 3:19 PM, Junio C Hamano <gitster@pobox.com> wrote: > Thanks; queued. Let's run with this and try to make it graduate > early next cycle. ^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH] http: Include locale.h when using setlocale() 2014-07-19 17:58 [PATCH v4 0/1] http: Add Accept-Language header if possible Yi EungJun 2014-07-19 17:58 ` [PATCH v4 1/1] " Yi EungJun 2014-12-02 12:12 ` [PATCH v5 0/1] " Yi EungJun @ 2015-03-06 16:13 ` Ævar Arnfjörð Bjarmason 2015-03-06 19:01 ` Junio C Hamano 2 siblings, 1 reply; 58+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2015-03-06 16:13 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Yi EungJun, Ævar Arnfjörð Bjarmason Since v2.3.0-rc1-37-gf18604b we've been using setlocale() here without importing locale.h. Oddly enough this only causes issues for me under -O0 on GCC & Clang. I.e. if I do: $ git clean -dxf; make -j 1 V=1 CFLAGS="-g -O0 -Wall" http.o I'll get this on clang 3.5.0-6 & GCC 4.9.1-19 on Debian: http.c: In function ‘get_preferred_languages’: http.c:1021:2: warning: implicit declaration of function ‘setlocale’ [-Wimplicit-function-declaration] retval = setlocale(LC_MESSAGES, NULL); ^ http.c:1021:21: error: ‘LC_MESSAGES’ undeclared (first use in this function) retval = setlocale(LC_MESSAGES, NULL); But changing -O0 to -O1 or another optimization level makes the issue go away. Odd, but in any case we should be including this header if we're going to use the function, so just do that. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- http.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/http.c b/http.c index 0153fb0..0606e6c 100644 --- a/http.c +++ b/http.c @@ -8,6 +8,9 @@ #include "credential.h" #include "version.h" #include "pkt-line.h" +#ifndef NO_GETTEXT +# include <locale.h> +#endif int active_requests; int http_is_verbose; -- 2.1.3 ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [PATCH] http: Include locale.h when using setlocale() 2015-03-06 16:13 ` [PATCH] http: Include locale.h when using setlocale() Ævar Arnfjörð Bjarmason @ 2015-03-06 19:01 ` Junio C Hamano 0 siblings, 0 replies; 58+ messages in thread From: Junio C Hamano @ 2015-03-06 19:01 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: git, Yi EungJun Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > Since v2.3.0-rc1-37-gf18604b we've been using setlocale() here without > importing locale.h. Oddly enough this only causes issues for me under > -O0 on GCC & Clang. Sorry for not making this entry in "What's cooking" report very prominent: * ye/http-accept-language (2015-02-26) 1 commit (merged to 'next' on 2015-03-03 at 58d195e) + gettext.c: move get_preferred_languages() from http.c Compilation fix for a recent topic in 'master'. Will merge to 'master'. This has cooked on 'next' for a few days, and is eligible to graduate to 'master'. Will be in the next update. Thanks. ^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH] http: Add Accept-Language header if possible @ 2014-07-08 15:54 Yi EungJun 2014-07-08 21:52 ` Eric Sunshine ` (2 more replies) 0 siblings, 3 replies; 58+ messages in thread From: Yi EungJun @ 2014-07-08 15:54 UTC (permalink / raw) To: git; +Cc: Yi EungJun From: Yi EungJun <eungjun.yi@navercorp.com> Add an Accept-Language header which indicates the user's preferred languages defined by 'LANGUAGE' environment variable if the variable is not empty. Example: LANGUAGE= -> "" LANGUAGE=ko -> "Accept-Language: ko; q=1.000, *; q=0.001" LANGUAGE=ko:en -> "Accept-Language: ko; q=1.000, en; q=0.999, *; q=0.001" This gives git servers a chance to display remote error messages in the user's preferred language. --- http.c | 43 +++++++++++++++++++++++++++++++++++++++++++ t/t5550-http-fetch-dumb.sh | 10 ++++++++++ 2 files changed, 53 insertions(+) diff --git a/http.c b/http.c index 3a28b21..c345616 100644 --- a/http.c +++ b/http.c @@ -983,6 +983,47 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, strbuf_addstr(charset, "ISO-8859-1"); } +/* + * Add an Accept-Language header which indicates user's preferred languages + * defined by 'LANGUAGE' environment variable if the variable is not empty. + * + * Example: + * LANGUAGE= -> "" + * LANGUAGE=ko -> "Accept-Language: ko; q=1.000, *; q=0.001" + * LANGUAGE=ko:en -> "Accept-Language: ko; q=1.000, en; q=0.999, *; q=0.001" + */ +static void add_accept_language(struct strbuf *buf) +{ + const char *p1, *p2; + float q = 1.000; + + p1 = getenv("LANGUAGE"); + + if (p1 != NULL && p1[0] != '\0') { + strbuf_reset(buf); + strbuf_addstr(buf, "Accept-Language: "); + for (p2 = p1; q > 0.001; p2++) { + if ((*p2 == ':' || *p2 == '\0') && p1 != p2) { + if (q < 1.0) { + strbuf_addstr(buf, ", "); + } + strbuf_add(buf, p1, p2 - p1); + strbuf_addf(buf, "; q=%.3f", q); + q -= 0.001; + p1 = p2 + 1; + + if (*p2 == '\0') { + break; + } + } + } + if (q < 1.0) { + strbuf_addstr(buf, ", "); + } + strbuf_addstr(buf, "*; q=0.001\r\n"); + } +} + /* http_request() targets */ #define HTTP_REQUEST_STRBUF 0 #define HTTP_REQUEST_FILE 1 @@ -1020,6 +1061,8 @@ static int http_request(const char *url, fwrite_buffer); } + add_accept_language(&buf); + strbuf_addstr(&buf, "Pragma:"); if (options && options->no_cache) strbuf_addstr(&buf, " no-cache"); diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh index ac71418..ea15158 100755 --- a/t/t5550-http-fetch-dumb.sh +++ b/t/t5550-http-fetch-dumb.sh @@ -196,5 +196,15 @@ test_expect_success 'reencoding is robust to whitespace oddities' ' grep "this is the error message" stderr ' +test_expect_success 'git client sends Accept-Language' ' + GIT_CURL_VERBOSE=1 LANGUAGE=ko:en git clone "$HTTPD_URL/accept/language" 2>actual + grep "^Accept-Language: ko; q=1.000, en; q=0.999, \*; q=0.001" actual +' + +test_expect_success 'git client does not send Accept-Language' ' + GIT_CURL_VERBOSE=1 LANGUAGE= git clone "$HTTPD_URL/accept/language" 2>actual + test_must_fail grep "^Accept-Language:" actual +' + stop_httpd test_done -- 2.0.1.473.gafdefd9.dirty ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [PATCH] http: Add Accept-Language header if possible 2014-07-08 15:54 [PATCH] http: Add Accept-Language header if possible Yi EungJun @ 2014-07-08 21:52 ` Eric Sunshine [not found] ` <CAFT+Tg-6fR9OJ93TT7ww3x=zYHY=Dh5u-7owgQMBK5o_JKLEaA@mail.gmail.com> 2014-07-09 5:10 ` Jeff King 2014-07-09 10:40 ` Peter Krefting 2 siblings, 1 reply; 58+ messages in thread From: Eric Sunshine @ 2014-07-08 21:52 UTC (permalink / raw) To: Yi EungJun; +Cc: Git List, Yi EungJun On Tue, Jul 8, 2014 at 11:54 AM, Yi EungJun <semtlenori@gmail.com> wrote: > From: Yi EungJun <eungjun.yi@navercorp.com> > > Add an Accept-Language header which indicates the user's preferred > languages defined by 'LANGUAGE' environment variable if the variable is > not empty. > > Example: > LANGUAGE= -> "" > LANGUAGE=ko -> "Accept-Language: ko; q=1.000, *; q=0.001" > LANGUAGE=ko:en -> "Accept-Language: ko; q=1.000, en; q=0.999, *; q=0.001" > > This gives git servers a chance to display remote error messages in > the user's preferred language. > --- > http.c | 43 +++++++++++++++++++++++++++++++++++++++++++ > t/t5550-http-fetch-dumb.sh | 10 ++++++++++ > 2 files changed, 53 insertions(+) > > diff --git a/http.c b/http.c > index 3a28b21..c345616 100644 > --- a/http.c > +++ b/http.c > @@ -983,6 +983,47 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type, > strbuf_addstr(charset, "ISO-8859-1"); > } > > +/* > + * Add an Accept-Language header which indicates user's preferred languages > + * defined by 'LANGUAGE' environment variable if the variable is not empty. > + * > + * Example: > + * LANGUAGE= -> "" > + * LANGUAGE=ko -> "Accept-Language: ko; q=1.000, *; q=0.001" > + * LANGUAGE=ko:en -> "Accept-Language: ko; q=1.000, en; q=0.999, *; q=0.001" > + */ > +static void add_accept_language(struct strbuf *buf) > +{ > + const char *p1, *p2; > + float q = 1.000; > + > + p1 = getenv("LANGUAGE"); > + > + if (p1 != NULL && p1[0] != '\0') { > + strbuf_reset(buf); It seems wrong to clear 'buf' in a function named add_accept_language(). > + strbuf_addstr(buf, "Accept-Language: "); > + for (p2 = p1; q > 0.001; p2++) { > + if ((*p2 == ':' || *p2 == '\0') && p1 != p2) { > + if (q < 1.0) { > + strbuf_addstr(buf, ", "); > + } > + strbuf_add(buf, p1, p2 - p1); > + strbuf_addf(buf, "; q=%.3f", q); > + q -= 0.001; > + p1 = p2 + 1; > + > + if (*p2 == '\0') { > + break; > + } > + } > + } > + if (q < 1.0) { > + strbuf_addstr(buf, ", "); > + } > + strbuf_addstr(buf, "*; q=0.001\r\n"); Manually adding "\r\n" is contraindicated. Headers passed to curl_easy_setopt(c, CURLOPT_HTTPHEADER, headers) must not have "\r\n", since curl will add terminators itself [1]. [1]: http://curl.haxx.se/libcurl/c/CURLOPT_HTTPHEADER.html > + } > +} > + > /* http_request() targets */ > #define HTTP_REQUEST_STRBUF 0 > #define HTTP_REQUEST_FILE 1 > @@ -1020,6 +1061,8 @@ static int http_request(const char *url, > fwrite_buffer); > } > > + add_accept_language(&buf); This is inconsistent with how other headers are handled by this function. The existing idiom is: strbuf_add(&buf, ...); /* construct header */ headers = curl_slist_apend(headers, buf.buf); strbuf_reset(&buf); > + > strbuf_addstr(&buf, "Pragma:"); > if (options && options->no_cache) > strbuf_addstr(&buf, " no-cache"); > diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh > index ac71418..ea15158 100755 > --- a/t/t5550-http-fetch-dumb.sh > +++ b/t/t5550-http-fetch-dumb.sh > @@ -196,5 +196,15 @@ test_expect_success 'reencoding is robust to whitespace oddities' ' > grep "this is the error message" stderr > ' > > +test_expect_success 'git client sends Accept-Language' ' > + GIT_CURL_VERBOSE=1 LANGUAGE=ko:en git clone "$HTTPD_URL/accept/language" 2>actual Broken &&-chain. > + grep "^Accept-Language: ko; q=1.000, en; q=0.999, \*; q=0.001" actual Do you want to \-escape the periods? (Or maybe use 'grep -F'?) > +' > + > +test_expect_success 'git client does not send Accept-Language' ' > + GIT_CURL_VERBOSE=1 LANGUAGE= git clone "$HTTPD_URL/accept/language" 2>actual Broken &&-chain. > + test_must_fail grep "^Accept-Language:" actual > +' > + > stop_httpd > test_done > -- > 2.0.1.473.gafdefd9.dirty ^ permalink raw reply [flat|nested] 58+ messages in thread
[parent not found: <CAFT+Tg-6fR9OJ93TT7ww3x=zYHY=Dh5u-7owgQMBK5o_JKLEaA@mail.gmail.com>]
[parent not found: <CAPig+cQ05pzU9uVBqS8tBHvB8_3qAtgsPYz1sGhpa0W1CVymqA@mail.gmail.com>]
* Re: [PATCH] http: Add Accept-Language header if possible [not found] ` <CAPig+cQ05pzU9uVBqS8tBHvB8_3qAtgsPYz1sGhpa0W1CVymqA@mail.gmail.com> @ 2014-07-11 16:35 ` Yi, EungJun 0 siblings, 0 replies; 58+ messages in thread From: Yi, EungJun @ 2014-07-11 16:35 UTC (permalink / raw) To: Eric Sunshine, git 2014-07-12 1:24 GMT+09:00 Eric Sunshine <sunshine@sunshineco.com>: > On Fri, Jul 11, 2014 at 5:22 AM, Yi, EungJun <semtlenori@gmail.com> wrote: >> 2014-07-09 6:52 GMT+09:00 Eric Sunshine <sunshine@sunshineco.com>: >>>> + grep "^Accept-Language: ko; q=1.000, en; q=0.999, \*; q=0.001" actual >>> >>> Do you want to \-escape the periods? (Or maybe use 'grep -F'?) >> >> I just want to match '*' character. I tried 'grep -F' but it does not help. > > I meant that the periods in your grep pattern are matching any > character. If you want to be very strict, so that they match only > period, then you should \-escape them. Oops, I misunderstood you. You are right. I'll \- escape them. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] http: Add Accept-Language header if possible 2014-07-08 15:54 [PATCH] http: Add Accept-Language header if possible Yi EungJun 2014-07-08 21:52 ` Eric Sunshine @ 2014-07-09 5:10 ` Jeff King 2014-07-09 5:46 ` Yi, EungJun 2014-07-09 10:40 ` Peter Krefting 2 siblings, 1 reply; 58+ messages in thread From: Jeff King @ 2014-07-09 5:10 UTC (permalink / raw) To: Yi EungJun; +Cc: git, Yi EungJun On Wed, Jul 09, 2014 at 12:54:06AM +0900, Yi EungJun wrote: > From: Yi EungJun <eungjun.yi@navercorp.com> > > Add an Accept-Language header which indicates the user's preferred > languages defined by 'LANGUAGE' environment variable if the variable is > not empty. > > Example: > LANGUAGE= -> "" > LANGUAGE=ko -> "Accept-Language: ko; q=1.000, *; q=0.001" > LANGUAGE=ko:en -> "Accept-Language: ko; q=1.000, en; q=0.999, *; q=0.001" > > This gives git servers a chance to display remote error messages in > the user's preferred language. Should this also take into account other language-related variables? I'd think $LC_ALL, $LC_MESSAGES, and $LANG would affect it, too. Are colon-separated values a standard in $LANGUAGE? I have never seen them, but I admit I am not very knowledgeable about localization issues. Also, we do we need to do more parsing? My $LANG is set to en_US.UTF-8. The encoding part is presumably uninteresting to the remote server. I also wonder if there are support functions in libc or as part of gettext that can help us get these values. -Peff ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] http: Add Accept-Language header if possible 2014-07-09 5:10 ` Jeff King @ 2014-07-09 5:46 ` Yi, EungJun 2014-07-09 6:18 ` Jeff King 0 siblings, 1 reply; 58+ messages in thread From: Yi, EungJun @ 2014-07-09 5:46 UTC (permalink / raw) To: Jeff King; +Cc: git 2014-07-09 14:10 GMT+09:00 Jeff King <peff@peff.net>: > On Wed, Jul 09, 2014 at 12:54:06AM +0900, Yi EungJun wrote: > >> From: Yi EungJun <eungjun.yi@navercorp.com> >> >> Add an Accept-Language header which indicates the user's preferred >> languages defined by 'LANGUAGE' environment variable if the variable is >> not empty. >> >> Example: >> LANGUAGE= -> "" >> LANGUAGE=ko -> "Accept-Language: ko; q=1.000, *; q=0.001" >> LANGUAGE=ko:en -> "Accept-Language: ko; q=1.000, en; q=0.999, *; q=0.001" >> >> This gives git servers a chance to display remote error messages in >> the user's preferred language. > > Should this also take into account other language-related variables? I'd > think $LC_ALL, $LC_MESSAGES, and $LANG would affect it, too. Are > colon-separated values a standard in $LANGUAGE? I have never seen them, > but I admit I am not very knowledgeable about localization issues. > > Also, we do we need to do more parsing? My $LANG is set to en_US.UTF-8. > The encoding part is presumably uninteresting to the remote server. I > also wonder if there are support functions in libc or as part of gettext > that can help us get these values. > > -Peff I agree with you. In fact, I tried to get user's preferred language in the same way as gettext. It has guess_category_value() to do that and the function is good enough because it considers $LANGUAGE, $LC_ALL, $LANG, and also system-dependent preferences. But the function does not seem a public API and I don't know how can I use the function in Git. So I chose to use $LANGUAGE only. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] http: Add Accept-Language header if possible 2014-07-09 5:46 ` Yi, EungJun @ 2014-07-09 6:18 ` Jeff King 2014-07-09 10:46 ` Peter Krefting 0 siblings, 1 reply; 58+ messages in thread From: Jeff King @ 2014-07-09 6:18 UTC (permalink / raw) To: Yi, EungJun; +Cc: git On Wed, Jul 09, 2014 at 02:46:35PM +0900, Yi, EungJun wrote: > I agree with you. In fact, I tried to get user's preferred language in > the same way as gettext. It has guess_category_value() to do that and > the function is good enough because it considers $LANGUAGE, $LC_ALL, > $LANG, and also system-dependent preferences. But the function does > not seem a public API and I don't know how can I use the function in > Git. So I chose to use $LANGUAGE only. I did some digging, and I think the public API is setlocale with a NULL parameter, like: printf("%s\n", setlocale(LC_MESSAGES, NULL)); That still will end up like "en_US.UTF-8", though; I couldn't find any standard functions for parsing that. It seems like it would be pretty straightforward to do so, though. >From my brief reading of rfc2616, that should probably become "en-us", and any time we add "x-y", we may want to add "x" as a fallback (that is certainly true for English; I don't know about other languages with dialects). -Peff ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] http: Add Accept-Language header if possible 2014-07-09 6:18 ` Jeff King @ 2014-07-09 10:46 ` Peter Krefting 2014-07-10 20:10 ` Jeff King 0 siblings, 1 reply; 58+ messages in thread From: Peter Krefting @ 2014-07-09 10:46 UTC (permalink / raw) To: Jeff King; +Cc: Yi, EungJun, git Jeff King: > I did some digging, and I think the public API is setlocale with a > NULL parameter, like: > > printf("%s\n", setlocale(LC_MESSAGES, NULL)); > > That still will end up like "en_US.UTF-8", though; And it only yields the highest-priority language, I think. > I couldn't find any standard functions for parsing that. It seems > like it would be pretty straightforward to do so, though. RFC 5646 is the current specification on language tags, btw. > From my brief reading of rfc2616, that should probably become > "en-us", and any time we add "x-y", we may want to add "x" as a > fallback (that is certainly true for English; I don't know about > other languages with dialects). Yes, adding the generic fallback is necessary, as "en-US" on the server matches a client's "en", but not vice versa. So if you request "en-US" and "de" and the server only has "en-GB" and "de", you'd get the "de" version. Debian's website has a nice writeup on the subject: http://www.debian.org/intro/cn#howtoset -- \\// Peter - http://www.softwolves.pp.se/ ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] http: Add Accept-Language header if possible 2014-07-09 10:46 ` Peter Krefting @ 2014-07-10 20:10 ` Jeff King 2014-07-11 9:43 ` Yi, EungJun 2014-07-12 10:11 ` Peter Krefting 0 siblings, 2 replies; 58+ messages in thread From: Jeff King @ 2014-07-10 20:10 UTC (permalink / raw) To: Peter Krefting; +Cc: Yi, EungJun, git On Wed, Jul 09, 2014 at 11:46:14AM +0100, Peter Krefting wrote: > Jeff King: > > >I did some digging, and I think the public API is setlocale with a NULL > >parameter, like: > > > > printf("%s\n", setlocale(LC_MESSAGES, NULL)); > > > >That still will end up like "en_US.UTF-8", though; > > And it only yields the highest-priority language, I think. I wasn't clear on whether POSIX locale variables actually supported multiple languages with priorities. I have never seen that, though the original commit message indicated that LANGUAGE=x:y was a thing (I wasn't sure if that was a made-up thing, or something that libc actually supported). > Debian's website has a nice writeup on the subject: > http://www.debian.org/intro/cn#howtoset That seems to be about language settings in browsers, which are a much richer set of preferences than POSIX locales (I think). It would not be wrong to have that level of configuration for git's http requests, but I do not know if it is worth the effort. Mapping the user's gettext locale into an accept-language header seems like a straightforward way to communicate to the other side what the client is using to show errors (so that errors coming from the server can match). If that is the case, though, I wonder if we should actually be adding it as a git-protocol header so that all transports can benefit (i.e., we could be localizing human-readable error messages in upload-pack, receive-pack, etc). -Peff ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] http: Add Accept-Language header if possible 2014-07-10 20:10 ` Jeff King @ 2014-07-11 9:43 ` Yi, EungJun 2014-07-12 10:11 ` Peter Krefting 1 sibling, 0 replies; 58+ messages in thread From: Yi, EungJun @ 2014-07-11 9:43 UTC (permalink / raw) To: Jeff King; +Cc: Peter Krefting, git 2014-07-11 5:10 GMT+09:00 Jeff King <peff@peff.net>: > On Wed, Jul 09, 2014 at 11:46:14AM +0100, Peter Krefting wrote: > >> Jeff King: >> >> >I did some digging, and I think the public API is setlocale with a NULL >> >parameter, like: >> > >> > printf("%s\n", setlocale(LC_MESSAGES, NULL)); >> > >> >That still will end up like "en_US.UTF-8", though; >> >> And it only yields the highest-priority language, I think. > > I wasn't clear on whether POSIX locale variables actually supported > multiple languages with priorities. I have never seen that, though the > original commit message indicated that LANGUAGE=x:y was a thing (I > wasn't sure if that was a made-up thing, or something that libc actually > supported). > >> Debian's website has a nice writeup on the subject: >> http://www.debian.org/intro/cn#howtoset > > That seems to be about language settings in browsers, which are a much > richer set of preferences than POSIX locales (I think). > > It would not be wrong to have that level of configuration for git's http > requests, but I do not know if it is worth the effort. Mapping the > user's gettext locale into an accept-language header seems like a > straightforward way to communicate to the other side what the client is > using to show errors (so that errors coming from the server can match). Thanks for you advice. I'll write a path to use both of setlocale(LC_MESSAGES, NULL) and getenv("LANGUAGE") to get the user's preferred language. setlocale(LC_MESSAGES, NULL) is quite nice way because it takes LC_ALL, LC_MESSAGES and LANG into account, but not LANGUAGE. I think we should take also LANGUAGE into account as gettext does. [1] [1]: http://www.gnu.org/software/gettext/manual/gettext.html#Locale-Environment-Variables ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] http: Add Accept-Language header if possible 2014-07-10 20:10 ` Jeff King 2014-07-11 9:43 ` Yi, EungJun @ 2014-07-12 10:11 ` Peter Krefting 1 sibling, 0 replies; 58+ messages in thread From: Peter Krefting @ 2014-07-12 10:11 UTC (permalink / raw) To: Jeff King; +Cc: Yi, EungJun, git Jeff King: > If that is the case, though, I wonder if we should actually be adding it > as a git-protocol header so that all transports can benefit (i.e., we > could be localizing human-readable error messages in upload-pack, > receive-pack, etc). That would be very nice, thre is a lot of language mixing going on at the moment. -- \\// Peter - http://www.softwolves.pp.se/ ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] http: Add Accept-Language header if possible 2014-07-08 15:54 [PATCH] http: Add Accept-Language header if possible Yi EungJun 2014-07-08 21:52 ` Eric Sunshine 2014-07-09 5:10 ` Jeff King @ 2014-07-09 10:40 ` Peter Krefting 2014-07-11 9:05 ` Yi, EungJun 2 siblings, 1 reply; 58+ messages in thread From: Peter Krefting @ 2014-07-09 10:40 UTC (permalink / raw) To: Yi EungJun; +Cc: git, Yi EungJun Yi EungJun: > Example: > LANGUAGE= -> "" > LANGUAGE=ko -> "Accept-Language: ko; q=1.000, *; q=0.001" > LANGUAGE=ko:en -> "Accept-Language: ko; q=1.000, en; q=0.999, *; q=0.001" Avoid adding "q=1.000". It is redundant (the default for any unqualified language names is 1.0, and additionally there has historically been some buggy servers that failed if it was included. > + p1 = getenv("LANGUAGE"); You need a fallback mechanism here to parse all the possible language variables. I would use the first one I find of these: 1. LANGUAGE 2. LC_ALL 3. LC_MESSAGES 4. LANG Only "LANGUAGE" holds a colon-separated list, but the same code can parse all of them, just yielding a single entry for the others. > + strbuf_add(buf, p1, p2 - p1); The tokens are on the form language_COUNTRY.encoding@identifier, whereas Accept-Language wants language-COUNTRY, so you need to a) replace "_" with "-", and b) chop off anything following a "." or "@". > + strbuf_addf(buf, "; q=%.3f", q); > + q -= 0.001; Three decimals seems a bit overkill, but some experimentation might be necessary. > + strbuf_addstr(buf, "*; q=0.001\r\n"); You should probably also add an explicit "en" here, if none was already included. I've seen some servers break horribly if "en" isn't included. For reference, I have my LANGUAGE variable set to "sv_SE.utf8:sv:nb_NO.utf8:nb:da_DK.utf8:da:nn_NO.utf8:nn:en_GB.utf8:en_US.utf8:en" -- \\// Peter - http://www.softwolves.pp.se/ ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [PATCH] http: Add Accept-Language header if possible 2014-07-09 10:40 ` Peter Krefting @ 2014-07-11 9:05 ` Yi, EungJun 0 siblings, 0 replies; 58+ messages in thread From: Yi, EungJun @ 2014-07-11 9:05 UTC (permalink / raw) To: Peter Krefting; +Cc: git, Yi EungJun 2014-07-09 19:40 GMT+09:00 Peter Krefting <peter@softwolves.pp.se>: > Yi EungJun: > > >> Example: >> LANGUAGE= -> "" >> LANGUAGE=ko -> "Accept-Language: ko; q=1.000, *; q=0.001" >> LANGUAGE=ko:en -> "Accept-Language: ko; q=1.000, en; q=0.999, *; q=0.001" > > > Avoid adding "q=1.000". It is redundant (the default for any unqualified > language names is 1.0, and additionally there has historically been some > buggy servers that failed if it was included. Ok, I'll fix it. > > >> + p1 = getenv("LANGUAGE"); > > > You need a fallback mechanism here to parse all the possible language > variables. I would use the first one I find of these: > > 1. LANGUAGE > 2. LC_ALL > 3. LC_MESSAGES > 4. LANG > > Only "LANGUAGE" holds a colon-separated list, but the same code can parse > all of them, just yielding a single entry for the others. I'll use setlocale(LC_MESSAGES, NULL) as well as getenv("LANGUAGE"). > > >> + strbuf_add(buf, p1, p2 - p1); > > > The tokens are on the form language_COUNTRY.encoding@identifier, whereas > Accept-Language wants language-COUNTRY, so you need to a) replace "_" with > "-", and b) chop off anything following a "." or "@". > > >> + strbuf_addf(buf, "; q=%.3f", q); >> + q -= 0.001; > > > Three decimals seems a bit overkill, but some experimentation might be > necessary. I'll use three decimals only if there are 100 or more preferred languages. > > >> + strbuf_addstr(buf, "*; q=0.001\r\n"); > > > You should probably also add an explicit "en" here, if none was already > included. I've seen some servers break horribly if "en" isn't included. I'll send Accept-Language only if there is at least one preferred language. Is it enough? Thanks for your review. ^ permalink raw reply [flat|nested] 58+ messages in thread
end of thread, other threads:[~2015-03-06 19:01 UTC | newest] Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-07-19 17:58 [PATCH v4 0/1] http: Add Accept-Language header if possible Yi EungJun 2014-07-19 17:58 ` [PATCH v4 1/1] " Yi EungJun 2014-07-21 19:01 ` Junio C Hamano 2014-08-03 7:35 ` Yi, EungJun 2014-12-02 12:12 ` [PATCH v5 0/1] " Yi EungJun 2014-12-02 12:12 ` [PATCH v5 1/1] " Yi EungJun 2014-12-03 18:22 ` Junio C Hamano 2014-12-03 19:31 ` Eric Sunshine 2014-12-03 21:37 ` Junio C Hamano 2014-12-03 22:00 ` Michael Blume 2014-12-03 22:06 ` Michael Blume 2014-12-22 16:44 ` [PATCH v6 0/1] " Yi EungJun 2014-12-22 16:44 ` [PATCH v6 1/1] " Yi EungJun 2014-12-22 19:34 ` Junio C Hamano 2014-12-24 20:35 ` Eric Sunshine 2014-12-29 16:18 ` Junio C Hamano 2015-01-18 12:23 ` [PATCH v7 0/1] " Yi EungJun 2015-01-18 12:26 ` [PATCH v7 1/1] " Yi EungJun 2015-01-18 15:14 ` Torsten Bögershausen 2015-01-19 20:21 ` [PATCH v6 0/1] " Eric Sunshine 2015-01-22 7:54 ` [PATCH v7 1/1] " Junio C Hamano 2015-01-27 15:51 ` [PATCH v8 0/1] " Yi EungJun 2015-01-27 15:51 ` [PATCH] " Yi EungJun 2015-01-27 23:34 ` Junio C Hamano 2015-01-28 6:15 ` Junio C Hamano 2015-01-28 11:59 ` Yi, EungJun 2015-01-28 12:04 ` [PATCH v9 0/1] " Yi EungJun 2015-01-28 12:04 ` [PATCH v9 1/1] " Yi EungJun 2015-02-25 22:52 ` Junio C Hamano 2015-02-26 3:04 ` Jeff King 2015-02-26 3:10 ` Jeff King 2015-02-26 20:59 ` Junio C Hamano 2015-02-26 21:33 ` Jeff King 2015-02-26 21:42 ` Junio C Hamano 2015-02-26 21:47 ` Stefan Beller 2015-02-26 22:06 ` Jeff King 2015-02-26 22:07 ` Jeff King 2015-02-26 22:26 ` Stefan Beller 2015-02-26 22:36 ` Jeff King 2015-02-26 22:45 ` Jeff King 2015-02-26 23:29 ` Junio C Hamano 2015-02-26 22:13 ` Junio C Hamano 2015-01-29 6:19 ` [PATCH v9 0/1] " Junio C Hamano 2015-01-30 17:23 ` Yi, EungJun 2015-03-06 16:13 ` [PATCH] http: Include locale.h when using setlocale() Ævar Arnfjörð Bjarmason 2015-03-06 19:01 ` Junio C Hamano -- strict thread matches above, loose matches on Subject: below -- 2014-07-08 15:54 [PATCH] http: Add Accept-Language header if possible Yi EungJun 2014-07-08 21:52 ` Eric Sunshine [not found] ` <CAFT+Tg-6fR9OJ93TT7ww3x=zYHY=Dh5u-7owgQMBK5o_JKLEaA@mail.gmail.com> [not found] ` <CAPig+cQ05pzU9uVBqS8tBHvB8_3qAtgsPYz1sGhpa0W1CVymqA@mail.gmail.com> 2014-07-11 16:35 ` Yi, EungJun 2014-07-09 5:10 ` Jeff King 2014-07-09 5:46 ` Yi, EungJun 2014-07-09 6:18 ` Jeff King 2014-07-09 10:46 ` Peter Krefting 2014-07-10 20:10 ` Jeff King 2014-07-11 9:43 ` Yi, EungJun 2014-07-12 10:11 ` Peter Krefting 2014-07-09 10:40 ` Peter Krefting 2014-07-11 9:05 ` Yi, EungJun
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.