* [PATCH v2] travis-ci: retry if Git for Windows CI returns HTTP error 502 or 503
@ 2017-05-03 21:50 Lars Schneider
2017-05-04 9:19 ` Johannes Schindelin
2017-05-09 6:31 ` Junio C Hamano
0 siblings, 2 replies; 5+ messages in thread
From: Lars Schneider @ 2017-05-03 21:50 UTC (permalink / raw)
To: git; +Cc: gitster, Johannes.Schindelin
The Git for Windows CI web app sometimes returns HTTP errors of
"502 bad gateway" or "503 service unavailable" [1]. We also need to
check the HTTP content because the GfW web app seems to pass through
(error) results from other Azure calls with HTTP code 200.
Wait a little and retry the request if this happens.
[1] https://docs.microsoft.com/en-in/azure/app-service-web/app-service-web-troubleshoot-http-502-http-503
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
---
Hi Junio,
I can't really test this as my TravisCI account does not have the
extended timeout and I am unable to reproduce the error.
It would be great if we could test this a little bit in pu.
Thanks,
Lars
Notes:
Base Ref: next
Web-Diff: https://github.com/larsxschneider/git/commit/af0f0f0eb8
Checkout: git fetch https://github.com/larsxschneider/git travisci/win-retry-v2 && git checkout af0f0f0eb8
Interdiff (v1..v2):
diff --git a/ci/run-windows-build.sh b/ci/run-windows-build.sh
index 7a9aa9c6a7..3e5a0abee0 100755
--- a/ci/run-windows-build.sh
+++ b/ci/run-windows-build.sh
@@ -14,26 +14,33 @@ COMMIT=$2
gfwci () {
local CURL_ERROR_CODE HTTP_CODE
- exec 3>&1
+ CONTENT_FILE=$(mktemp -t "git-windows-ci-XXXXXX")
while test -z $HTTP_CODE
do
HTTP_CODE=$(curl \
-H "Authentication: Bearer $GFW_CI_TOKEN" \
--silent --retry 5 --write-out '%{HTTP_CODE}' \
- --output >(sed "$(printf '1s/^\xef\xbb\xbf//')" >cat >&3) \
+ --output >(sed "$(printf '1s/^\xef\xbb\xbf//')" >$CONTENT_FILE) \
"https://git-for-windows-ci.azurewebsites.net/api/TestNow?$1" \
)
CURL_ERROR_CODE=$?
# The GfW CI web app sometimes returns HTTP errors of
# "502 bad gateway" or "503 service unavailable".
- # Wait a little and retry if it happens. More info:
+ # We also need to check the HTTP content because the GfW web
+ # app seems to pass through (error) results from other Azure
+ # calls with HTTP code 200.
+ # Wait a little and retry if we detect this error. More info:
# https://docs.microsoft.com/en-in/azure/app-service-web/app-service-web-troubleshoot-http-502-http-503
- if test $HTTP_CODE -eq 502 || test $HTTP_CODE -eq 503
+ if test $HTTP_CODE -eq 502 ||
+ test $HTTP_CODE -eq 503 ||
+ grep "502 - Web server received an invalid response" $CONTENT_FILE >/dev/null
then
sleep 10
HTTP_CODE=
fi
done
+ cat $CONTENT_FILE
+ rm $CONTENT_FILE
if test $CURL_ERROR_CODE -ne 0
then
return $CURL_ERROR_CODE
ci/run-windows-build.sh | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/ci/run-windows-build.sh b/ci/run-windows-build.sh
index e043440799..3e5a0abee0 100755
--- a/ci/run-windows-build.sh
+++ b/ci/run-windows-build.sh
@@ -14,14 +14,33 @@ COMMIT=$2
gfwci () {
local CURL_ERROR_CODE HTTP_CODE
- exec 3>&1
+ CONTENT_FILE=$(mktemp -t "git-windows-ci-XXXXXX")
+ while test -z $HTTP_CODE
+ do
HTTP_CODE=$(curl \
-H "Authentication: Bearer $GFW_CI_TOKEN" \
--silent --retry 5 --write-out '%{HTTP_CODE}' \
- --output >(sed "$(printf '1s/^\xef\xbb\xbf//')" >cat >&3) \
+ --output >(sed "$(printf '1s/^\xef\xbb\xbf//')" >$CONTENT_FILE) \
"https://git-for-windows-ci.azurewebsites.net/api/TestNow?$1" \
)
CURL_ERROR_CODE=$?
+ # The GfW CI web app sometimes returns HTTP errors of
+ # "502 bad gateway" or "503 service unavailable".
+ # We also need to check the HTTP content because the GfW web
+ # app seems to pass through (error) results from other Azure
+ # calls with HTTP code 200.
+ # Wait a little and retry if we detect this error. More info:
+ # https://docs.microsoft.com/en-in/azure/app-service-web/app-service-web-troubleshoot-http-502-http-503
+ if test $HTTP_CODE -eq 502 ||
+ test $HTTP_CODE -eq 503 ||
+ grep "502 - Web server received an invalid response" $CONTENT_FILE >/dev/null
+ then
+ sleep 10
+ HTTP_CODE=
+ fi
+ done
+ cat $CONTENT_FILE
+ rm $CONTENT_FILE
if test $CURL_ERROR_CODE -ne 0
then
return $CURL_ERROR_CODE
base-commit: 1ea7e62026c5dde4d8be80b2544696fc6aa70121
--
2.12.2
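A minimal sketch of the retry check from the patch above, pulled out into a standalone function so it can be exercised without contacting the web app. The name should_retry and its two arguments are hypothetical and are not part of the patch. Unlike the patch, the sketch quotes its variables and compares the HTTP code as a string, so an empty code does not make "test" complain.

```shell
# Hypothetical helper, not part of the patch: exits 0 when the response
# looks like a transient gateway error and the request should be retried.
#   $1: HTTP status code reported by curl
#   $2: file holding the response body
should_retry () {
	http_code=$1
	content_file=$2
	case "$http_code" in
	502|503)
		return 0
		;;
	esac
	# The web app may relay an upstream error page with HTTP code 200,
	# so the body has to be inspected as well.
	grep -q "502 - Web server received an invalid response" "$content_file"
}
```

With such a helper, the loop body would reduce to something like: if should_retry "$HTTP_CODE" "$CONTENT_FILE"; then sleep 10; HTTP_CODE=; fi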
* Re: [PATCH v2] travis-ci: retry if Git for Windows CI returns HTTP error 502 or 503
2017-05-03 21:50 [PATCH v2] travis-ci: retry if Git for Windows CI returns HTTP error 502 or 503 Lars Schneider
@ 2017-05-04 9:19 ` Johannes Schindelin
2017-05-09 6:31 ` Junio C Hamano
1 sibling, 0 replies; 5+ messages in thread
From: Johannes Schindelin @ 2017-05-04 9:19 UTC (permalink / raw)
To: Lars Schneider; +Cc: git, gitster
Hi Lars,
On Wed, 3 May 2017, Lars Schneider wrote:
> The Git for Windows CI web app sometimes returns HTTP errors of
> "502 bad gateway" or "503 service unavailable" [1]. We also need to
> check the HTTP content because the GfW web app seems to pass through
> (error) results from other Azure calls with HTTP code 200.
> Wait a little and retry the request if this happens.
Thanks. In theory, it would be better to fix the web app to also pass
through the 502 error code; in practice, I have a hard time finding the
time to make it so ;-)
Therefore, I would be very much in favor of the current version of the
patch.
Ciao,
Dscho
* Re: [PATCH v2] travis-ci: retry if Git for Windows CI returns HTTP error 502 or 503
2017-05-03 21:50 [PATCH v2] travis-ci: retry if Git for Windows CI returns HTTP error 502 or 503 Lars Schneider
2017-05-04 9:19 ` Johannes Schindelin
@ 2017-05-09 6:31 ` Junio C Hamano
2017-05-09 17:40 ` Lars Schneider
1 sibling, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2017-05-09 6:31 UTC (permalink / raw)
To: Lars Schneider; +Cc: git, Johannes.Schindelin
Lars Schneider <larsxschneider@gmail.com> writes:
> The Git for Windows CI web app sometimes returns HTTP errors of
> "502 bad gateway" or "503 service unavailable" [1]. We also need to
> check the HTTP content because the GfW web app seems to pass through
> (error) results from other Azure calls with HTTP code 200.
> Wait a little and retry the request if this happens.
>
> [1] https://docs.microsoft.com/en-in/azure/app-service-web/app-service-web-troubleshoot-http-502-http-503
>
> Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
> ---
>
> Hi Junio,
>
> I can't really test this as my TravisCI account does not have the
> extended timeout and I am unable to reproduce the error.
>
> It would be great if we could test this a little bit in pu.
This has been in 'pu' for a while.
As the patch simply discards 502 (and others), it is unclear if the
failing test on 'next' is now gone, or the attempt to run 'pu'
happened to be lucky not to get one, from the output we can see in
https://travis-ci.org/git/git/jobs/229867212
Are you comfortable enough to move this forward? It's not like a
possible breakage in this patch will harm anything (the relaying to
the Windows CI is flaky if the build server cannot deal with the
load anyway), so I would rather have this early in 'next', while we
deal with a few other topics that Windows build is not happy with
that are on 'pu'.
* Re: [PATCH v2] travis-ci: retry if Git for Windows CI returns HTTP error 502 or 503
2017-05-09 6:31 ` Junio C Hamano
@ 2017-05-09 17:40 ` Lars Schneider
2017-05-09 23:50 ` Junio C Hamano
0 siblings, 1 reply; 5+ messages in thread
From: Lars Schneider @ 2017-05-09 17:40 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Johannes.Schindelin
> On 09 May 2017, at 07:31, Junio C Hamano <gitster@pobox.com> wrote:
>
> Lars Schneider <larsxschneider@gmail.com> writes:
>
>> The Git for Windows CI web app sometimes returns HTTP errors of
>> "502 bad gateway" or "503 service unavailable" [1]. We also need to
>> check the HTTP content because the GfW web app seems to pass through
>> (error) results from other Azure calls with HTTP code 200.
>> Wait a little and retry the request if this happens.
>>
>> [1] https://docs.microsoft.com/en-in/azure/app-service-web/app-service-web-troubleshoot-http-502-http-503
>>
>> Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
>> ---
>>
>> Hi Junio,
>>
>> I can't really test this as my TravisCI account does not have the
>> extended timeout and I am unable to reproduce the error.
>>
>> It would be great if we could test this a little bit in pu.
>
> This has been in 'pu' for a while.
>
> As the patch simply discards 502 (and others), it is unclear if the
> failing test on 'next' is now gone, or the attempt to run 'pu'
> happened to be lucky not to get one, from the output we can see in
> https://travis-ci.org/git/git/jobs/229867212
>
> Are you comfortable enough to move this forward?
Yes, please move it forward. I haven't seen a "502 - Web server
received an invalid response" on pu for a while. That means the
patch should work as expected.
Unrelated to this patch I have, however, seen two kinds of timeouts:
(1) Timeout in the "notStarted" state. This job eventually finished
with a failure but it did start only *after* 3h:
https://travis-ci.org/git/git/jobs/230225611
(2) Timeout in the "in progress" state. This job eventually finished
successfully but it took longer than 3h:
https://travis-ci.org/git/git/jobs/229867248
Right now the timeout generates potential false negative results.
I would like to change that and respond with a successful build
*before* we approach the 3h timeout. This means we could generate
false positives. Although this is not ideal, I think that is the better
compromise as a failing Windows build would usually fail quickly
(e.g. in the compile step).
What do you guys think? Would you be OK with that reasoning?
If the Git for Windows builds get more stable over time then
we could reevaluate this compromise.
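To make the trade-off above concrete, here is a rough sketch of a polling loop with a deadline. It is not taken from ci/run-windows-build.sh. The function name wait_for_build, its arguments, and the "completed" and "failed" state names are assumptions made for illustration; only "notStarted" and "in progress" appear in this thread.

```shell
# Hypothetical sketch: poll a status command until the build finishes
# or a deadline passes. None of these names exist in the real script.
#   $1: command that prints the current build state
#   $2: deadline in seconds
#   $3: polling interval in seconds (must be greater than 0)
#   $4: exit status to report once the deadline passes
#       (0 = assume success, non-zero = report a failure)
wait_for_build () {
	status_cmd=$1
	deadline=$2
	interval=$3
	on_timeout=$4
	elapsed=0
	while test "$elapsed" -lt "$deadline"
	do
		state=$($status_cmd)
		case "$state" in
		completed)
			return 0
			;;
		failed)
			return 1
			;;
		esac
		# Any other state ("notStarted", "in progress", ...): keep waiting.
		sleep "$interval"
		elapsed=$((elapsed + interval))
	done
	return "$on_timeout"
}
```

Passing 0 as the last argument models the proposal above (a job that is still running at the deadline counts as a success); passing a non-zero value keeps the current behavior, where a timeout is reported as a failure.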
- Lars
* Re: [PATCH v2] travis-ci: retry if Git for Windows CI returns HTTP error 502 or 503
2017-05-09 17:40 ` Lars Schneider
@ 2017-05-09 23:50 ` Junio C Hamano
0 siblings, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2017-05-09 23:50 UTC (permalink / raw)
To: Lars Schneider; +Cc: git, Johannes.Schindelin
Lars Schneider <larsxschneider@gmail.com> writes:
>>> It would be great if we could test this a little bit in pu.
>>
>> This has been in 'pu' for a while.
>>
>> As the patch simply discards 502 (and others), it is unclear if the
>> failing test on 'next' is now gone, or the attempt to run 'pu'
>> happened to be lucky not to get one, from the output we can see in
>> https://travis-ci.org/git/git/jobs/229867212
>>
>> Are you comfortable enough to move this forward?
>
> Yes, please move it forward. I haven't seen a "502 - Web server
> received an invalid response" on pu for a while. That means the
> patch should work as expected.
Will do, thanks.
> Unrelated to this patch I have, however, seen two kinds of timeouts:
>
> (1) Timeout in the "notStarted" state. This job eventually finished
> with a failure but it did start only *after* 3h:
> https://travis-ci.org/git/git/jobs/230225611
>
> (2) Timeout in the "in progress" state. This job eventually finished
> successfully but it took longer than 3h:
> https://travis-ci.org/git/git/jobs/229867248
>
> Right now the timeout generates potential false negative results.
> I would like to change that and respond with a successful build
> *before* we approach the 3h timeout. This means we could generate
> false positives. Although this is not ideal, I think that is the better
> compromise as a failing Windows build would usually fail quickly
> (e.g. in the compile step).
>
> What do you guys think? Would you be OK with that reasoning?
> If the Git for Windows builds get more stable over time then
> we could reevaluate this compromise.
I'd rather see a false breakage on Windows build (i.e. "this might
have succeeded given enough time, but it didn't finish within the
allotted time") than a false success (i.e. "we successfully launched
and the build is still running, so let's assume the test succeeds").
Because I do not pay attention to what the overall build page [*1*]
says about a particular branch tip, and I instead look at the
summary list of the individual "Build Jobs" (e.g. [*2*]), seeing
errored/failed on [*1*] does not bother me personally, if that is
what you are getting at.
[References]
*1* https://travis-ci.org/git/git/builds/
*2* https://travis-ci.org/git/git/builds/230235081