git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/9] http-fetch fixes
@ 2006-02-01 11:28 Mark Wooding
  2006-02-01 11:44 ` [PATCH 1/9] http-fetch: Mark slots as `watched' to stop them being reused Mark Wooding
                   ` (9 more replies)
  0 siblings, 10 replies; 19+ messages in thread
From: Mark Wooding @ 2006-02-01 11:28 UTC (permalink / raw)
  To: git

This patch series fixes the git-http-fetch bug I reported in
`git-http-fetch failure/segfault -- alas no patch'.  The preliminary
patch I was working on /did/ actually fix the bug I found, but uncovered
a bunch more, which I think I've finally got to the bottom of.

The series applies to commit 1506fc34f7585880aeeb12b5fdfe2de4800f9df5.

Particularly: 

  1. watch-slot -- my fix for the actual bug.  Different from Nick
     Hangeval's separate-status structure, it works by locking slots
     that people still want answers from.  I've turned the in_use slot
     field into a bit mask and added a reference counter.  I think you
     can use Nick's patch instead if you prefer it.

  2, 3. list-corruption, abort-object-request -- fix bugs uncovered by
     watch-slot.  See the patch emails for the details.

  4, 5, 6, 7. multi-fdset, fix-rename-message, curl-verbose,
     par-control-flow-tidy -- various other relatively trivial changes I
     noticed while doing the main work above.  fix-rename-message is
     actually a bug, but if you think some other fix is better, that's
     cool; the others are cosmetic things.

  8, 9.  sanity-check-active-slots, sanity-check-object-queue --
     debugging code I threw in while I was trying to track down the main
     bugs.  I'm not going to cry if these aren't accepted (indeed, I
     separated them out and put them at the end specifically!) but they
     might prove useful to someone, so I thought I'd send 'em along
     anyway.

-- [mdw]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/9] http-fetch: Mark slots as `watched' to stop them being reused.
  2006-02-01 11:28 [PATCH 0/9] http-fetch fixes Mark Wooding
@ 2006-02-01 11:44 ` Mark Wooding
  2006-02-01 11:44 ` [PATCH 2/9] http-fetch: Fix object list corruption in fill_active_slots() Mark Wooding
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Mark Wooding @ 2006-02-01 11:44 UTC (permalink / raw)
  To: git

From: Mark Wooding <mdw@distorted.org.uk>

Previously, it was possible for the slot passed to run_active_slot() to
be reused before that function returned, rendering its completion status
useless.  This is particularly noticeable when fetching a repository
with a wide, flat directory structure (lots of blobs in the top-level
tree), a single unpacked commit which changes just one or two blobs, and
the rest of the history in one pack.  When fetch_indices() gets the
pack-list, the slot is (usually!) reused for some blob object which is
packed, so it fails with 404; fetch_indices() decides that the pack-list
fetch failed, and nothing therefore fetches the pack.

We fix this by adding a feature to lock slots against reuse.

Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
---

 http-fetch.c |    6 +++---
 http.c       |   43 +++++++++++++++++++++++++++++++++----------
 http.h       |    6 +++++-
 3 files changed, 41 insertions(+), 14 deletions(-)

diff --git a/http-fetch.c b/http-fetch.c
index 61b2188..4b1b62d 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -465,13 +465,13 @@ static void process_alternates_response(
 				base);
 			curl_easy_setopt(slot->curl, CURLOPT_URL,
 					 alt_req->url);
-			active_requests++;
-			slot->in_use = 1;
+			if (!slot->in_use)
+				active_requests++;
+			slot->in_use |= SLOTUSE_ACTIVE;
 			if (start_active_slot(slot)) {
 				return;
 			} else {
 				got_alternates = -1;
-				slot->in_use = 0;
 				return;
 			}
 		}
diff --git a/http.c b/http.c
index 75e6717..498b4ae 100644
--- a/http.c
+++ b/http.c
@@ -262,7 +262,7 @@ void http_cleanup(void)
 
 	while (slot != NULL) {
 #ifdef USE_CURL_MULTI
-		if (slot->in_use) {
+		if (slot->in_use & SLOTUSE_ACTIVE) {
 			curl_easy_getinfo(slot->curl,
 					  CURLINFO_EFFECTIVE_URL,
 					  &wait_url);
@@ -311,6 +311,7 @@ struct active_request_slot *get_active_s
 		newslot->curl = NULL;
 		newslot->in_use = 0;
 		newslot->next = NULL;
+		newslot->nrefs = 0;
 
 		slot = active_queue_head;
 		if (slot == NULL) {
@@ -333,7 +334,7 @@ struct active_request_slot *get_active_s
 	}
 
 	active_requests++;
-	slot->in_use = 1;
+	slot->in_use |= SLOTUSE_ACTIVE;
 	slot->local = NULL;
 	slot->callback_data = NULL;
 	slot->callback_func = NULL;
@@ -351,8 +352,9 @@ int start_active_slot(struct active_requ
 
 	if (curlm_result != CURLM_OK &&
 	    curlm_result != CURLM_CALL_MULTI_PERFORM) {
-		active_requests--;
-		slot->in_use = 0;
+		slot->in_use &= ~SLOTUSE_ACTIVE;
+		if (!slot->in_use)
+			active_requests--;
 		return 0;
 	}
 #endif
@@ -375,6 +377,24 @@ void step_active_slots(void)
 }
 #endif
 
+static void watch_active_slot(struct active_request_slot *slot)
+{
+	if (!slot->in_use)
+		active_requests++;
+	slot->in_use |= SLOTUSE_REF;
+	slot->nrefs++;
+}
+
+static void unwatch_active_slot(struct active_request_slot *slot)
+{
+	slot->nrefs--;
+	if (!slot->nrefs) {
+		slot->in_use &= ~SLOTUSE_REF;
+		if (!slot->in_use)
+			active_requests--;
+	}
+}
+
 void run_active_slot(struct active_request_slot *slot)
 {
 #ifdef USE_CURL_MULTI
@@ -386,7 +406,8 @@ void run_active_slot(struct active_reque
 	int max_fd;
 	struct timeval select_timeout;
 
-	while (slot->in_use) {
+	watch_active_slot(slot);
+	while (slot->in_use & SLOTUSE_ACTIVE) {
 		data_received = 0;
 		step_active_slots();
 
@@ -397,7 +418,7 @@ void run_active_slot(struct active_reque
 			last_pos = current_pos;
 		}
 
-		if (slot->in_use && !data_received) {
+		if ((slot->in_use & SLOTUSE_ACTIVE) && !data_received) {
 			max_fd = 0;
 			FD_ZERO(&readfds);
 			FD_ZERO(&writefds);
@@ -408,8 +429,9 @@ void run_active_slot(struct active_reque
 			       &excfds, &select_timeout);
 		}
 	}
+	unwatch_active_slot(slot);
 #else
-	while (slot->in_use) {
+	while (slot->in_use & SLOTUSE_ACTIVE) {
 		slot->curl_result = curl_easy_perform(slot->curl);
 		finish_active_slot(slot);
 	}
@@ -418,8 +440,9 @@ void run_active_slot(struct active_reque
 
 static void finish_active_slot(struct active_request_slot *slot)
 {
-        active_requests--;
-        slot->in_use = 0;
+	slot->in_use &= ~SLOTUSE_ACTIVE;
+	if (!slot->in_use)
+		active_requests--;
         curl_easy_getinfo(slot->curl, CURLINFO_HTTP_CODE, &slot->http_code);
  
         /* Run callback if appropriate */
@@ -433,7 +456,7 @@ void finish_all_active_slots(void)
 	struct active_request_slot *slot = active_queue_head;
 
 	while (slot != NULL)
-		if (slot->in_use) {
+		if (slot->in_use & SLOTUSE_ACTIVE) {
 			run_active_slot(slot);
 			slot = active_queue_head;
 		} else {
diff --git a/http.h b/http.h
index ed4ea33..a66e06a 100644
--- a/http.h
+++ b/http.h
@@ -26,7 +26,8 @@ struct active_request_slot
 {
 	CURL *curl;
 	FILE *local;
-	int in_use;
+	unsigned in_use;
+	int nrefs;
 	CURLcode curl_result;
 	long http_code;
 	void *callback_data;
@@ -34,6 +35,9 @@ struct active_request_slot
 	struct active_request_slot *next;
 };
 
+#define SLOTUSE_ACTIVE 1u
+#define SLOTUSE_REF 2u
+
 struct buffer
 {
         size_t posn;

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/9] http-fetch: Fix object list corruption in fill_active_slots().
  2006-02-01 11:28 [PATCH 0/9] http-fetch fixes Mark Wooding
  2006-02-01 11:44 ` [PATCH 1/9] http-fetch: Mark slots as `watched' to stop them being reused Mark Wooding
@ 2006-02-01 11:44 ` Mark Wooding
  2006-02-01 11:44 ` [PATCH 3/9] http-fetch: Abort requests for objects which arrived in packs Mark Wooding
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Mark Wooding @ 2006-02-01 11:44 UTC (permalink / raw)
  To: git

From: Mark Wooding <mdw@distorted.org.uk>

In fill_active_slots() -- if we find an object which has already arrived,
say as part of a pack, /don't/ remove it from the list.  It's already been
prefetched and someone will ask for it later.  Just label it as done and
carry blithely on.  (As it was, the code would dereference a freed object
to continue through the list anyway.)

Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
---

 http-fetch.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/http-fetch.c b/http-fetch.c
index 4b1b62d..b1ee836 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -312,7 +312,7 @@ void fill_active_slots(void)
 	while (active_requests < max_requests && obj_req != NULL) {
 		if (obj_req->state == WAITING) {
 			if (has_sha1_file(obj_req->sha1))
-				release_object_request(obj_req);
+				obj_req->state = COMPLETE;
 			else
 				start_object_request(obj_req);
 			curl_multi_perform(curlm, &num_transfers);

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/9] http-fetch: Abort requests for objects which arrived in packs
  2006-02-01 11:28 [PATCH 0/9] http-fetch fixes Mark Wooding
  2006-02-01 11:44 ` [PATCH 1/9] http-fetch: Mark slots as `watched' to stop them being reused Mark Wooding
  2006-02-01 11:44 ` [PATCH 2/9] http-fetch: Fix object list corruption in fill_active_slots() Mark Wooding
@ 2006-02-01 11:44 ` Mark Wooding
  2006-02-01 17:12   ` Nick Hengeveld
  2006-02-01 11:44 ` [PATCH 4/9] http-fetch: Actually watch the file descriptors of interest Mark Wooding
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: Mark Wooding @ 2006-02-01 11:44 UTC (permalink / raw)
  To: git

From: Mark Wooding <mdw@distorted.org.uk>

In fetch_object, there's a call to release an object request if the
object mysteriously arrived, say in a pack.  Unfortunately, the fetch
attempt for this object might already be in progress, and we'll leak the
descriptor.  Instead, try to tidy away the request.

Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
---

 http-fetch.c |   16 +++++++++++++++-
 http.c       |   20 ++++++++++++++++++--
 http.h       |    1 +
 3 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/http-fetch.c b/http-fetch.c
index b1ee836..8656070 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -770,6 +770,20 @@ static int fetch_pack(struct alt_base *r
 	return 0;
 }
 
+static void abort_object_request(struct object_request *obj_req)
+{
+	if (obj_req->local >= 0) {
+		close(obj_req->local);
+		obj_req->local = -1;
+	}
+	unlink(obj_req->tmpfile);
+	if (obj_req->slot) {
+ 		release_active_slot(obj_req->slot);
+		obj_req->slot = NULL;
+	}
+	release_object_request(obj_req);
+}
+
 static int fetch_object(struct alt_base *repo, unsigned char *sha1)
 {
 	char *hex = sha1_to_hex(sha1);
@@ -782,7 +796,7 @@ static int fetch_object(struct alt_base 
 		return error("Couldn't find request for %s in the queue", hex);
 
 	if (has_sha1_file(obj_req->sha1)) {
-		release_object_request(obj_req);
+		abort_object_request(obj_req);
 		return 0;
 	}
 
diff --git a/http.c b/http.c
index 498b4ae..0a70e1c 100644
--- a/http.c
+++ b/http.c
@@ -438,11 +438,27 @@ void run_active_slot(struct active_reque
 #endif
 }
 
-static void finish_active_slot(struct active_request_slot *slot)
+static void closedown_active_slot(struct active_request_slot *slot)
 {
-	slot->in_use &= ~SLOTUSE_ACTIVE;
+        slot->in_use &= ~SLOTUSE_ACTIVE;
 	if (!slot->in_use)
 		active_requests--;
+}
+
+void release_active_slot(struct active_request_slot *slot)
+{
+	closedown_active_slot(slot);
+	if (slot->curl) {
+		curl_multi_remove_handle(curlm, slot->curl);
+		curl_easy_cleanup(slot->curl);
+		slot->curl = NULL;
+	}
+	fill_active_slots();
+}
+
+static void finish_active_slot(struct active_request_slot *slot)
+{
+	closedown_active_slot(slot);
         curl_easy_getinfo(slot->curl, CURLINFO_HTTP_CODE, &slot->http_code);
  
         /* Run callback if appropriate */
diff --git a/http.h b/http.h
index a66e06a..d028a5d 100644
--- a/http.h
+++ b/http.h
@@ -58,6 +58,7 @@ extern struct active_request_slot *get_a
 extern int start_active_slot(struct active_request_slot *slot);
 extern void run_active_slot(struct active_request_slot *slot);
 extern void finish_all_active_slots(void);
+extern void release_active_slot(struct active_request_slot *slot);
 
 #ifdef USE_CURL_MULTI
 extern void fill_active_slots(void);

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/9] http-fetch: Actually watch the file descriptors of interest.
  2006-02-01 11:28 [PATCH 0/9] http-fetch fixes Mark Wooding
                   ` (2 preceding siblings ...)
  2006-02-01 11:44 ` [PATCH 3/9] http-fetch: Abort requests for objects which arrived in packs Mark Wooding
@ 2006-02-01 11:44 ` Mark Wooding
  2006-02-01 15:03   ` Nick Hengeveld
  2006-02-01 11:44 ` [PATCH 5/9] http-fetch: Fix message reporting rename of object file Mark Wooding
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: Mark Wooding @ 2006-02-01 11:44 UTC (permalink / raw)
  To: git

From: Mark Wooding <mdw@distorted.org.uk>

Presumably this was just some kind of oversight.

Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
---

 http.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/http.c b/http.c
index 0a70e1c..521323c 100644
--- a/http.c
+++ b/http.c
@@ -425,6 +425,8 @@ void run_active_slot(struct active_reque
 			FD_ZERO(&excfds);
 			select_timeout.tv_sec = 0;
 			select_timeout.tv_usec = 50000;
+			curl_multi_fdset(curlm, &readfds, &writefds,
+					 &excfds, &max_fd);
 			select(max_fd, &readfds, &writefds,
 			       &excfds, &select_timeout);
 		}

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5/9] http-fetch: Fix message reporting rename of object file.
  2006-02-01 11:28 [PATCH 0/9] http-fetch fixes Mark Wooding
                   ` (3 preceding siblings ...)
  2006-02-01 11:44 ` [PATCH 4/9] http-fetch: Actually watch the file descriptors of interest Mark Wooding
@ 2006-02-01 11:44 ` Mark Wooding
  2006-02-01 11:44 ` [PATCH 6/9] http: Turn on verbose Curl messages if GIT_CURL_VERBOSE set in environment Mark Wooding
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Mark Wooding @ 2006-02-01 11:44 UTC (permalink / raw)
  To: git

From: Mark Wooding <mdw@distorted.org.uk>

move_temp_to_file returns 0 or -1.  This is not a good thing to pass to
strerror(3).  Fortunately, someone already reported the error, so don't
worry too much.

Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
---

 http-fetch.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/http-fetch.c b/http-fetch.c
index 8656070..f1aac14 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -831,9 +831,8 @@ static int fetch_object(struct alt_base 
 	} else if (memcmp(obj_req->sha1, obj_req->real_sha1, 20)) {
 		ret = error("File %s has bad hash\n", hex);
 	} else if (obj_req->rename < 0) {
-		ret = error("unable to write sha1 filename %s: %s",
-			    obj_req->filename,
-			    strerror(obj_req->rename));
+		ret = error("unable to write sha1 filename %s",
+			    obj_req->filename);
 	}
 
 	release_object_request(obj_req);

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 6/9] http: Turn on verbose Curl messages if GIT_CURL_VERBOSE set in environment
  2006-02-01 11:28 [PATCH 0/9] http-fetch fixes Mark Wooding
                   ` (4 preceding siblings ...)
  2006-02-01 11:44 ` [PATCH 5/9] http-fetch: Fix message reporting rename of object file Mark Wooding
@ 2006-02-01 11:44 ` Mark Wooding
  2006-02-01 11:44 ` [PATCH 7/9] http-fetch: Tidy control flow in process_alternate_response Mark Wooding
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Mark Wooding @ 2006-02-01 11:44 UTC (permalink / raw)
  To: git

From: Mark Wooding <mdw@distorted.org.uk>

Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
---

 http.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/http.c b/http.c
index 521323c..d0c92dc 100644
--- a/http.c
+++ b/http.c
@@ -192,6 +192,9 @@ static CURL* get_curl_handle(void)
 
 	curl_easy_setopt(result, CURLOPT_FOLLOWLOCATION, 1);
 
+	if (getenv("GIT_CURL_VERBOSE"))
+		curl_easy_setopt(result, CURLOPT_VERBOSE, 1);
+
 	return result;
 }
 

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 7/9] http-fetch: Tidy control flow in process_alternate_response
  2006-02-01 11:28 [PATCH 0/9] http-fetch fixes Mark Wooding
                   ` (5 preceding siblings ...)
  2006-02-01 11:44 ` [PATCH 6/9] http: Turn on verbose Curl messages if GIT_CURL_VERBOSE set in environment Mark Wooding
@ 2006-02-01 11:44 ` Mark Wooding
  2006-02-01 11:44 ` [PATCH 8/9] http: Paranoid sanity checking for active slots Mark Wooding
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Mark Wooding @ 2006-02-01 11:44 UTC (permalink / raw)
  To: git

From: Mark Wooding <mdw@distorted.org.uk>

It's a bit convoluted.  Tidy it up.

Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
---

 http-fetch.c |    7 ++-----
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/http-fetch.c b/http-fetch.c
index f1aac14..4aa5a11 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -468,12 +468,9 @@ static void process_alternates_response(
 			if (!slot->in_use)
 				active_requests++;
 			slot->in_use |= SLOTUSE_ACTIVE;
-			if (start_active_slot(slot)) {
-				return;
-			} else {
+			if (!start_active_slot(slot))
 				got_alternates = -1;
-				return;
-			}
+			return;
 		}
 	} else if (slot->curl_result != CURLE_OK) {
 		if (slot->http_code != 404 &&

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 8/9] http: Paranoid sanity checking for active slots.
  2006-02-01 11:28 [PATCH 0/9] http-fetch fixes Mark Wooding
                   ` (6 preceding siblings ...)
  2006-02-01 11:44 ` [PATCH 7/9] http-fetch: Tidy control flow in process_alternate_response Mark Wooding
@ 2006-02-01 11:44 ` Mark Wooding
  2006-02-01 11:44 ` [PATCH 9/9] http-fetch: Paranoid sanity checking for the object queue Mark Wooding
  2006-02-01 15:30 ` [PATCH 0/9] http-fetch fixes Uwe Zeisberger
  9 siblings, 0 replies; 19+ messages in thread
From: Mark Wooding @ 2006-02-01 11:44 UTC (permalink / raw)
  To: git

From: Mark Wooding <mdw@distorted.org.uk>

Probably not wanted in the mainline, but very useful when debugging.

Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
---

 http-fetch.c |    2 ++
 http.c       |   28 ++++++++++++++++++++++++++++
 http.h       |    1 +
 3 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/http-fetch.c b/http-fetch.c
index 4aa5a11..fa3eb4a 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -465,9 +465,11 @@ static void process_alternates_response(
 				base);
 			curl_easy_setopt(slot->curl, CURLOPT_URL,
 					 alt_req->url);
+			sanity_check_active_slots();
 			if (!slot->in_use)
 				active_requests++;
 			slot->in_use |= SLOTUSE_ACTIVE;
+			sanity_check_active_slots();
 			if (!start_active_slot(slot))
 				got_alternates = -1;
 			return;
diff --git a/http.c b/http.c
index d0c92dc..8087ca0 100644
--- a/http.c
+++ b/http.c
@@ -1,3 +1,4 @@
+#include <assert.h>
 #include "http.h"
 
 int data_received;
@@ -348,19 +349,36 @@ struct active_request_slot *get_active_s
 	return slot;
 }
 
+void sanity_check_active_slots(void)
+{
+	struct active_request_slot *slot;
+	int n = 0;
+
+	for (slot = active_queue_head; slot; slot = slot->next) {
+		assert(!(slot->in_use & SLOTUSE_REF) == !slot->nrefs);
+		if (slot->in_use)
+			n++;
+	}
+	assert(n == active_requests);
+}
+
 int start_active_slot(struct active_request_slot *slot)
 {
+	sanity_check_active_slots();
 #ifdef USE_CURL_MULTI
 	CURLMcode curlm_result = curl_multi_add_handle(curlm, slot->curl);
 
 	if (curlm_result != CURLM_OK &&
 	    curlm_result != CURLM_CALL_MULTI_PERFORM) {
+		assert(slot->in_use & SLOTUSE_ACTIVE);
 		slot->in_use &= ~SLOTUSE_ACTIVE;
 		if (!slot->in_use)
 			active_requests--;
+		assert(active_requests >= 0);
 		return 0;
 	}
 #endif
+	sanity_check_active_slots();
 	return 1;
 }
 
@@ -382,20 +400,26 @@ void step_active_slots(void)
 
 static void watch_active_slot(struct active_request_slot *slot)
 {
+	sanity_check_active_slots();
 	if (!slot->in_use)
 		active_requests++;
 	slot->in_use |= SLOTUSE_REF;
 	slot->nrefs++;
+	sanity_check_active_slots();
 }
 
 static void unwatch_active_slot(struct active_request_slot *slot)
 {
+	sanity_check_active_slots();
+	assert(slot->nrefs);
 	slot->nrefs--;
 	if (!slot->nrefs) {
 		slot->in_use &= ~SLOTUSE_REF;
 		if (!slot->in_use)
 			active_requests--;
+		assert(active_requests >= 0);
 	}
+	sanity_check_active_slots();
 }
 
 void run_active_slot(struct active_request_slot *slot)
@@ -445,9 +469,13 @@ void run_active_slot(struct active_reque
 
 static void closedown_active_slot(struct active_request_slot *slot)
 {
+	sanity_check_active_slots();
+	assert(slot->in_use & SLOTUSE_ACTIVE);
         slot->in_use &= ~SLOTUSE_ACTIVE;
 	if (!slot->in_use)
 		active_requests--;
+	assert(active_requests >= 0);
+	sanity_check_active_slots();
 }
 
 void release_active_slot(struct active_request_slot *slot)
diff --git a/http.h b/http.h
index d028a5d..a2e48e8 100644
--- a/http.h
+++ b/http.h
@@ -59,6 +59,7 @@ extern int start_active_slot(struct acti
 extern void run_active_slot(struct active_request_slot *slot);
 extern void finish_all_active_slots(void);
 extern void release_active_slot(struct active_request_slot *slot);
+extern void sanity_check_active_slots(void);
 
 #ifdef USE_CURL_MULTI
 extern void fill_active_slots(void);

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 9/9] http-fetch: Paranoid sanity checking for the object queue.
  2006-02-01 11:28 [PATCH 0/9] http-fetch fixes Mark Wooding
                   ` (7 preceding siblings ...)
  2006-02-01 11:44 ` [PATCH 8/9] http: Paranoid sanity checking for active slots Mark Wooding
@ 2006-02-01 11:44 ` Mark Wooding
  2006-02-01 15:30 ` [PATCH 0/9] http-fetch fixes Uwe Zeisberger
  9 siblings, 0 replies; 19+ messages in thread
From: Mark Wooding @ 2006-02-01 11:44 UTC (permalink / raw)
  To: git

From: Mark Wooding <mdw@distorted.org.uk>

Probably not wanted in the mainline, but useful for debugging.

Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
---

 http-fetch.c |   70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 70 insertions(+), 0 deletions(-)

diff --git a/http-fetch.c b/http-fetch.c
index fa3eb4a..80eaa2f 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -3,6 +3,7 @@
 #include "pack.h"
 #include "fetch.h"
 #include "http.h"
+#include <assert.h>
 
 #define PREV_BUF_SIZE 4096
 #define RANGE_HEADER_SIZE 30
@@ -28,6 +29,13 @@ enum object_request_state {
 	COMPLETE,
 };
 
+typedef struct stash {
+	struct stash *next;
+	unsigned char sha1[20];
+	int activep;
+	int tickp;
+} stash;
+
 struct object_request
 {
 	unsigned char sha1[20];
@@ -47,6 +55,7 @@ struct object_request
 	int rename;
 	struct active_request_slot *slot;
 	struct object_request *next;
+	stash *tick;
 };
 
 struct alternates_request {
@@ -57,8 +66,60 @@ struct alternates_request {
 	int http_specific;
 };
 
+static stash *seen = 0;
+
+static stash *findstash(unsigned char *sha1)
+{
+	stash *s;
+	for (s = seen; s; s = s->next) {
+		if (memcmp(sha1, s->sha1, 20) == 0)
+			return s;
+	}
+	return NULL;
+}
+
+static stash *enstash(unsigned char *sha1)
+{
+	stash *s;
+	if ((s = findstash(sha1)) != NULL)
+		assert(!s->activep);
+	else {
+		s = xmalloc(sizeof(*s));
+		memcpy(s->sha1, sha1, 20);
+		s->next = seen;
+		seen = s;
+	}
+	s->activep = 1;
+	return s;
+}
+
+static void unstash(unsigned char *sha1)
+{
+	stash *s = findstash(sha1);
+	assert(s && s->activep);
+	s->activep = 0;
+}
+
 static struct object_request *object_queue_head = NULL;
 
+static void sanity_check_stash(void)
+{
+	stash *s;
+	struct object_request *obj_req;
+
+	for (s = seen; s; s = s->next)
+		s->tickp = 0;
+	for (obj_req = object_queue_head; obj_req; obj_req = obj_req->next) {
+		assert(obj_req->tick);
+		s = obj_req->tick;
+		assert(memcmp(s->sha1, obj_req->sha1, 20) == 0);
+		assert(s->activep);
+		s->tickp = 1;
+	}
+	for (s = seen; s; s = s->next)
+		assert(!s->tickp == !s->activep);
+}
+
 static size_t fwrite_sha1_file(void *ptr, size_t eltsize, size_t nmemb,
 			       void *data)
 {
@@ -287,6 +348,7 @@ static void release_object_request(struc
 {
 	struct object_request *entry = object_queue_head;
 
+	sanity_check_stash();
 	if (obj_req->local != -1)
 		error("fd leakage in release: %d", obj_req->local);
 	if (obj_req == object_queue_head) {
@@ -298,6 +360,8 @@ static void release_object_request(struc
 			entry->next = entry->next->next;
 	}
 
+	unstash(obj_req->sha1);
+	sanity_check_stash();
 	free(obj_req->url);
 	free(obj_req);
 }
@@ -309,6 +373,7 @@ void fill_active_slots(void)
 	struct active_request_slot *slot = active_queue_head;
 	int num_transfers;
 
+	sanity_check_stash();
 	while (active_requests < max_requests && obj_req != NULL) {
 		if (obj_req->state == WAITING) {
 			if (has_sha1_file(obj_req->sha1))
@@ -327,6 +392,7 @@ void fill_active_slots(void)
 		}
 		slot = slot->next;
 	}				
+	sanity_check_stash();
 }
 #endif
 
@@ -347,6 +413,7 @@ void prefetch(unsigned char *sha1)
 		 "%s.temp", filename);
 	newreq->next = NULL;
 
+	sanity_check_stash();
 	if (object_queue_head == NULL) {
 		object_queue_head = newreq;
 	} else {
@@ -356,6 +423,8 @@ void prefetch(unsigned char *sha1)
 		}
 		tail->next = newreq;
 	}
+	newreq->tick = enstash(sha1);
+	sanity_check_stash();
 
 #ifdef USE_CURL_MULTI
 	fill_active_slots();
@@ -789,6 +858,7 @@ static int fetch_object(struct alt_base 
 	int ret = 0;
 	struct object_request *obj_req = object_queue_head;
 
+	sanity_check_stash();
 	while (obj_req != NULL && memcmp(obj_req->sha1, sha1, 20))
 		obj_req = obj_req->next;
 	if (obj_req == NULL)

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/9] http-fetch: Actually watch the file descriptors of interest.
  2006-02-01 11:44 ` [PATCH 4/9] http-fetch: Actually watch the file descriptors of interest Mark Wooding
@ 2006-02-01 15:03   ` Nick Hengeveld
  0 siblings, 0 replies; 19+ messages in thread
From: Nick Hengeveld @ 2006-02-01 15:03 UTC (permalink / raw)
  To: Mark Wooding; +Cc: git

On Wed, Feb 01, 2006 at 11:44:33AM +0000, Mark Wooding wrote:

> Presumably this was just some kind of oversight.

I was never able to make this work reliably across multiple versions of
curl.

> Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
> ---
> 
>  http.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/http.c b/http.c
> index 0a70e1c..521323c 100644
> --- a/http.c
> +++ b/http.c
> @@ -425,6 +425,8 @@ void run_active_slot(struct active_reque
>  			FD_ZERO(&excfds);
>  			select_timeout.tv_sec = 0;
>  			select_timeout.tv_usec = 50000;
> +			curl_multi_fdset(curlm, &readfds, &writefds,
> +					 &excfds, &max_fd);
>  			select(max_fd, &readfds, &writefds,
>  			       &excfds, &select_timeout);
>  		}
> 
> -
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
For a successful technology, reality must take precedence over public
relations, for nature cannot be fooled.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/9] http-fetch fixes
  2006-02-01 11:28 [PATCH 0/9] http-fetch fixes Mark Wooding
                   ` (8 preceding siblings ...)
  2006-02-01 11:44 ` [PATCH 9/9] http-fetch: Paranoid sanity checking for the object queue Mark Wooding
@ 2006-02-01 15:30 ` Uwe Zeisberger
  2006-02-01 15:47   ` Mark Wooding
  9 siblings, 1 reply; 19+ messages in thread
From: Uwe Zeisberger @ 2006-02-01 15:30 UTC (permalink / raw)
  To: Mark Wooding; +Cc: git

Hello Mark,

Mark Wooding wrote:
> This patch series fixes the git-http-fetch bug I reported in
> `git-http-fetch failure/segfault -- alas no patch'.  The preliminary
> patch I was working on /did/ actually fix the bug I found, but uncovered
> a bunch more, which I think I've finally got to the bottom of.
> 
> The series applies to commit 1506fc34f7585880aeeb12b5fdfe2de4800f9df5.
after reverting c8568e139ed2149fbfb7ef9a8d819d5b6b7c554f it applies to
8233340ce6eb700eb2cd9c0fef4d1705997c499b (=current master), too.
    
With these patches applied, I get now a Segfault, while cloning u-boot.

        walk a7b9fb9110e3c0be644b3e2c8f397f606138a710
        got 98dce899a97f7998b11f58e8c7897ba9f7e0b95a
        got 0cfa422c45d2e2b6023b74a3784034bb18a7eb73
        got 9341e20e95c5ff5c36a8f9506b4039bbc6aefd43
        /home/uzeisberger/usr/bin/git-clone: line 42: 24002 Segmentation fault
        git-http-fetch -v -a -w "$name" "$name" "$1/"

Actually now I cannot reproduce it anymore.  I added some debugging code
s.t. I can give more details if it reoccurs.

Best regards
Uwe

-- 
Uwe Zeisberger

http://www.google.com/search?q=5+choose+3

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/9] http-fetch fixes
  2006-02-01 15:30 ` [PATCH 0/9] http-fetch fixes Uwe Zeisberger
@ 2006-02-01 15:47   ` Mark Wooding
  2006-02-02  3:02     ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Mark Wooding @ 2006-02-01 15:47 UTC (permalink / raw)
  To: git

Uwe Zeisberger <zeisberg@informatik.uni-freiburg.de> wrote:

>> The series applies to commit 1506fc34f7585880aeeb12b5fdfe2de4800f9df5.
> after reverting c8568e139ed2149fbfb7ef9a8d819d5b6b7c554f it applies to
> 8233340ce6eb700eb2cd9c0fef4d1705997c499b (=current master), too.

Good-oh.  (I did pull the latest tip, and then panicked because Nick's
change had already made it out.)

> With these patches applied, I get now a Segfault, while cloning
> u-boot.

Oh. :-(  

> Actually now I cannot reproduce it anymore.  I added some debugging code
> s.t. I can give more details if it reoccurs.

Please do.  I've just tried, and I can't reproduce the problem either.

The watch-slots patch seems to uncover all manner of brokenness in
http-fetch.c -- or I've violated some unwritten assumptions of a very
deep nature, which is always possible.

Maybe the right thing to do is to go with Nick's separate-status-
structures patch as an actual mainline fix, with my patches for
http-fetch.c's other bugs as a belt-and-braces.

-- [mdw]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/9] http-fetch: Abort requests for objects which arrived in packs
  2006-02-01 11:44 ` [PATCH 3/9] http-fetch: Abort requests for objects which arrived in packs Mark Wooding
@ 2006-02-01 17:12   ` Nick Hengeveld
  2006-02-01 17:23     ` Mark Wooding
  0 siblings, 1 reply; 19+ messages in thread
From: Nick Hengeveld @ 2006-02-01 17:12 UTC (permalink / raw)
  To: Mark Wooding; +Cc: git

On Wed, Feb 01, 2006 at 11:44:31AM +0000, Mark Wooding wrote:

> +void release_active_slot(struct active_request_slot *slot)
> +{
> +	closedown_active_slot(slot);
> +	if (slot->curl) {
> +		curl_multi_remove_handle(curlm, slot->curl);
> +		curl_easy_cleanup(slot->curl);
> +		slot->curl = NULL;
> +	}
> +	fill_active_slots();
> +}

Does it make sense to call curl_easy_cleanup here?  This will close
persistent server connections and cause another connection startup if
the slot is reused.

-- 
For a successful technology, reality must take precedence over public
relations, for nature cannot be fooled.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/9] http-fetch: Abort requests for objects which arrived in packs
  2006-02-01 17:12   ` Nick Hengeveld
@ 2006-02-01 17:23     ` Mark Wooding
  2006-02-06 23:11       ` Nick Hengeveld
  0 siblings, 1 reply; 19+ messages in thread
From: Mark Wooding @ 2006-02-01 17:23 UTC (permalink / raw)
  To: git

Nick Hengeveld <nickh@reactrix.com> wrote:
> Does it make sense to call curl_easy_cleanup here?  This will close
> persistent server connections and cause another connection startup if
> the slot is reused.

I'm by no means a Curl expert.  But I scoured the docs for another way
of interrupting the current fetch and I couldn't find anything.  It
really didn't seem like a good idea to leave the handle as it was,
having (possibly) half-fetched a thing we weren't actually interested in
-- that way lies confusion.

Besides, in practice, what's happened is that the object we asked for is
in a pack which we've just collected.  Unless the upstream repository
hasn't been git-prune-packed, our fetch is going to fail with a 404
anyway, at which point Curl /will/ close the connection and make another
one next time.  So, in real life, it makes no difference.

-- [mdw]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/9] http-fetch fixes
  2006-02-01 15:47   ` Mark Wooding
@ 2006-02-02  3:02     ` Junio C Hamano
  2006-02-03 20:20       ` Mark Wooding
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2006-02-02  3:02 UTC (permalink / raw)
  To: Mark Wooding; +Cc: git, Uwe Zeisberger, Nick Hengeveld

Mark Wooding <mdw@distorted.org.uk> writes:

> Uwe Zeisberger <zeisberg@informatik.uni-freiburg.de> wrote:
>
>> With these patches applied, I get now a Segfault, while cloning
>> u-boot.
> ...
> Maybe the right thing to do is to go with Nick's separate-status-
> structures patch as an actual mainline fix, with my patches for
> http-fetch.c's other bugs as a belt-and-braces.

Thanks.

I briefly looked at the series; the following may be ready to be
applied:

[PATCH 2/9] http-fetch: Fix object list corruption in fill_activ...
[PATCH 5/9] http-fetch: Fix message reporting rename of object file
[PATCH 6/9] http: Turn on verbose Curl messages if GIT_CURL_VERB...
[PATCH 7/9] http-fetch: Tidy control flow in process_alternate_r...

For now however I'll wait and see for a while, in case you can
reproduce the breakage Uwe found and add fixes to your set.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/9] http-fetch fixes
  2006-02-02  3:02     ` Junio C Hamano
@ 2006-02-03 20:20       ` Mark Wooding
  2006-02-03 20:42         ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Mark Wooding @ 2006-02-03 20:20 UTC (permalink / raw)
  To: git

Junio C Hamano <junkio@cox.net> wrote:

> I briefly looked at the series; the following may be ready to be
> applied:
>
> [PATCH 2/9] http-fetch: Fix object list corruption in fill_activ...
> [PATCH 5/9] http-fetch: Fix message reporting rename of object file
> [PATCH 6/9] http: Turn on verbose Curl messages if GIT_CURL_VERB...
> [PATCH 7/9] http-fetch: Tidy control flow in process_alternate_r...
>
> For now however I'll wait and see for a while, in case you can
> reproduce the breakage Uwe found and add fixes to your set.

I've not had any more trouble with this since, but I'm still worried
that the http fetcher is rather fragile. :-(

What was wrong with 3/9, by the way?  (It's the abort_active_slot patch
-- which does fix a real bug.)  Is it just that it doesn't apply cleanly
without the first one?  If so, I can send out a new patch easily enough.

-- [mdw]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/9] http-fetch fixes
  2006-02-03 20:20       ` Mark Wooding
@ 2006-02-03 20:42         ` Junio C Hamano
  0 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2006-02-03 20:42 UTC (permalink / raw)
  To: Mark Wooding; +Cc: git

Mark Wooding <mdw@distorted.org.uk> writes:

> Junio C Hamano <junkio@cox.net> wrote:
>
>> I briefly looked at the series; the following may be ready to be
>> applied:
>>
>> [PATCH 2/9] http-fetch: Fix object list corruption in fill_activ...
>> [PATCH 5/9] http-fetch: Fix message reporting rename of object file
>> [PATCH 6/9] http: Turn on verbose Curl messages if GIT_CURL_VERB...
>> [PATCH 7/9] http-fetch: Tidy control flow in process_alternate_r...
>>
>> For now however I'll wait and see for a while, in case you can
>> reproduce the breakage Uwe found and add fixes to your set.
>
> I've not had any more trouble with this since, but I'm still worried
> that the http fetcher is rather fragile. :-(
>
> What was wrong with 3/9, by the way?  (It's the abort_active_slot patch
> -- which does fix a real bug.)

I saw Nick raised an alternative and you responded but I did not
see Nick agreeing or disagreeing to it, and I haven't spent
enough time on that to convince myself which way to go.  That is
why.

Maybe this weekend.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/9] http-fetch: Abort requests for objects which arrived in packs
  2006-02-01 17:23     ` Mark Wooding
@ 2006-02-06 23:11       ` Nick Hengeveld
  0 siblings, 0 replies; 19+ messages in thread
From: Nick Hengeveld @ 2006-02-06 23:11 UTC (permalink / raw)
  To: Mark Wooding; +Cc: git

On Wed, Feb 01, 2006 at 05:23:21PM +0000, Mark Wooding wrote:

> I'm by no means a Curl expert.  But I scoured the docs for another way
> of interrupting the current fetch and I couldn't find anything.  It
> really didn't seem like a good idea to leave the handle as it was,
> having (possibly) half-fetched a thing we weren't actually interested in
> -- that way lies confusion.

My impression from the curl documentation was that removing the curl
handle from the multi session was enough, but the documentation doesn't
describe what happens to such a request.

> Besides, in practice, what's happened is that the object we asked for is
> in a pack which we've just collected.  Unless the upstream repository
> hasn't been git-prune-packed, our fetch is going to fail with a 404
> anyway, at which point Curl /will/ close the connection and make another
> one next time.  So, in real life, it makes no difference.

Good point, and it does seem safer to fully clean up the connection
rather than leave it in a questionable state.

-- 
For a successful technology, reality must take precedence over public
relations, for nature cannot be fooled.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2006-02-06 23:12 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-02-01 11:28 [PATCH 0/9] http-fetch fixes Mark Wooding
2006-02-01 11:44 ` [PATCH 1/9] http-fetch: Mark slots as `watched' to stop them being reused Mark Wooding
2006-02-01 11:44 ` [PATCH 2/9] http-fetch: Fix object list corruption in fill_active_slots() Mark Wooding
2006-02-01 11:44 ` [PATCH 3/9] http-fetch: Abort requests for objects which arrived in packs Mark Wooding
2006-02-01 17:12   ` Nick Hengeveld
2006-02-01 17:23     ` Mark Wooding
2006-02-06 23:11       ` Nick Hengeveld
2006-02-01 11:44 ` [PATCH 4/9] http-fetch: Actually watch the file descriptors of interest Mark Wooding
2006-02-01 15:03   ` Nick Hengeveld
2006-02-01 11:44 ` [PATCH 5/9] http-fetch: Fix message reporting rename of object file Mark Wooding
2006-02-01 11:44 ` [PATCH 6/9] http: Turn on verbose Curl messages if GIT_CURL_VERBOSE set in environment Mark Wooding
2006-02-01 11:44 ` [PATCH 7/9] http-fetch: Tidy control flow in process_alternate_response Mark Wooding
2006-02-01 11:44 ` [PATCH 8/9] http: Paranoid sanity checking for active slots Mark Wooding
2006-02-01 11:44 ` [PATCH 9/9] http-fetch: Paranoid sanity checking for the object queue Mark Wooding
2006-02-01 15:30 ` [PATCH 0/9] http-fetch fixes Uwe Zeisberger
2006-02-01 15:47   ` Mark Wooding
2006-02-02  3:02     ` Junio C Hamano
2006-02-03 20:20       ` Mark Wooding
2006-02-03 20:42         ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).