git.vger.kernel.org archive mirror
* Git-aware HTTP transport
@ 2008-08-26  1:26 Shawn O. Pearce
  2008-08-26  2:34 ` H. Peter Anvin
  2013-02-13  1:34 ` Git-aware HTTP transport docs H. Peter Anvin
  0 siblings, 2 replies; 57+ messages in thread
From: Shawn O. Pearce @ 2008-08-26  1:26 UTC (permalink / raw)
  To: git, H. Peter Anvin

I spent some time on Friday kicking around how the fetch part of
an HTTP protocol could be implemented.  What I seem to have settled
on at this point in time is a more condensed version of the native
git protocol, batched into 256 commit blocks.

--8<--
Smart HTTP transfer protocols
=============================

Git supports two HTTP based transfer protocols: a "dumb" protocol,
which requires only a standard HTTP server on the server end of the
connection, and a "smart" protocol, which requires a Git aware CGI
(or server module).  This document describes the "smart" protocol.

As a design feature smart clients can automatically translate and
upgrade "dumb" protocol URLs.  This permits all users to have the
same published URL, with the peers automatically choosing to use
the most efficient transport available to them.

Authentication
--------------

Standard HTTP authentication is used if authentication is required
to access a repository, and must be configured and enforced by the
HTTP server software itself.

Stateless
---------

The protocol, much like its underlying HTTP, is stateless, from the
perspective of the HTTP server side.  All state must be retained and
managed by the client.  This permits round-robin load-balancing on
the server side, among many other implementation details.

HTTP/1.1 Preference
-------------------

For performance reasons the HTTP/1.1 chunked transfer encoding
is used whenever possible to transfer variable length objects.
This avoids needing to produce large results in memory to compute
the proper content-length.

Detecting Smart Servers
-----------------------

HTTP clients can detect a smart Git-aware server by HEADing
$repo/backend.git-http and looking for a 302 redirect to the
repository's smart service URL:

	C: HEAD /path/to/repository.git/backend.git-http HTTP/1.1

	S: HTTP/1.1 302 Found
	S: Location: /git/path/to/repository.git

A dumb server would respond with a 404 Not Found (or 200 OK).

Smart servers may send a redirect to any URL that does not
contain query args (e.g. "foo?repo=path.git" is invalid).
The URL must be sufficient to provide the location of the
repository to the smart service code.

A valid redirect can be to yourself, for example:

	C: HEAD /path/to/repository.git/backend.git-http HTTP/1.1

	S: HTTP/1.1 302 Found
	S: Location: /path/to/repository.git/backend.git-http/.

All subsequent communication for this transaction is done through
the smart service URL ($ssurl), not the original URL.

GET $ssurl/refs
---------------

Obtains the available refs from the remote repository.  The response
is a sequence of refs, one per Git packet line.  The final packet
line has a length of 0 to indicate the end.  This is basically
the same protocol that is used by the git-upload-pack service to
advertise the available refs.

	C: GET $ssurl/refs HTTP/1.1

	S: HTTP/1.1 200 OK
	S: Content-Type: application/x-git-refs
	S:
	S: 003295dcfa3633004da0049d3d0fa03f80589cbcaf31 HEAD
	S: 003e95dcfa3633004da0049d3d0fa03f80589cbcaf31 refs/heads/maint
	S: 003fd049f6c27a2244e12041955e262a404c7faba355 refs/heads/master
	S: 003b2cb58b79488a98d2721cea644875a8dd0026b115 refs/heads/pu
	S: 0000
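
As a rough illustration of the packet-line framing used in the listing
above, here is a minimal Python sketch.  The helper names (pkt_line,
ref_line, FLUSH) are invented for this example, and the 65,535 byte
ceiling is an assumption read out of the draft's "64k per pkt-line"
remark rather than taken from any implementation.

```python
# Hypothetical helpers for illustration; not part of any Git implementation.
MAX_PKT_LEN = 0xFFFF  # assumed ceiling: the length field is four hex digits

FLUSH = b"0000"  # a zero-length packet marks the end of the stream


def pkt_line(payload: bytes) -> bytes:
    """Frame payload as one pkt-line: 4 hex digits of total length, then payload."""
    total = len(payload) + 4  # the length field counts its own four bytes
    if total > MAX_PKT_LEN:
        raise ValueError("payload too large for one pkt-line")
    return b"%04x" % total + payload


def ref_line(sha1_hex: str, name: str) -> bytes:
    """Frame one advertised ref as '<sha1> <name>' followed by a newline."""
    return pkt_line(f"{sha1_hex} {name}\n".encode("ascii"))


if __name__ == "__main__":
    advert = ref_line("95dcfa3633004da0049d3d0fa03f80589cbcaf31", "HEAD") + FLUSH
    print(advert)
```

The length prefixes in the example listing (0032, 003e, ...) only add up
if each ref line ends in a newline, so the sketch appends one.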

POST $ssurl/upload-pack
-----------------------

Prepares an estimated minimal pack to transfer new objects to the
client.

The computation to select the minimal pack proceeds as follows
(c = client, s = server):

 init step:
 (c) Use /refs to obtain the advertised refs.
 (c) Place any object seen in /refs into set ADVERTISED.

 (c) Build a set, WANT, of the objects from ADVERTISED the client
     wants to fetch, based on what it saw from /refs.

 (c) Start a queue, C_PENDING, ordered by commit time (popping newest
     first).  Add all client refs.  When a commit is popped from the
     queue its parents should be automatically inserted back.  Commits
     should only enter the queue once.

 one compute step:
 (c) Send a /upload-pack request:

	C: POST $ssurl/upload-pack HTTP/1.1
	C: Content-Type: application/x-git-uploadpack
	C: Content-Length: ...
	C:
	C: 0009want
	C: 0xxx<WANT list>
	C: 000bcommon
	C: 0xxx<COMMON list>
	C: 0009have
	C: 0xxx<HAVE list>
	C: 0000

     The stream is organized into "sections", where each section is
     composed of two git pkt-lines.  The first pkt-line provides the
     name of the section ("want", "have", "common").  The second
     pkt-line has the binary SHA-1 ids which compose that section.

     The "want" section is required.  The other sections ("have",
     "common") are optional.  A missing "want" section should be
     answered with a "400 Bad Request".

     Sections must appear in the following order, if they appear
     at all in the request stream:

       * want
       * common
       * have

     Each section may appear multiple times.  Client implementations
     are encouraged to use as few sections as possible; however, the
     64k limit per pkt-line caps the number of ids at 3,276 per
     section entry.

     The stream is terminated by a pkt-line flush ("0000").

     The HAVE list is created by popping the first 256 commits
     from C_PENDING.  Fewer can be supplied if C_PENDING empties.
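
To make the request layout above concrete, the following Python sketch
builds an upload-pack request body from lists of binary SHA-1 ids.  All
names here (section, build_request, IDS_PER_PKT) are hypothetical, and
the sketch assumes a maximum pkt-line length of 65,535 bytes, which is
what yields the 3,276 figure.

```python
# Illustrative sketch only; names and the 0xFFFF ceiling are assumptions.
from typing import Sequence

SHA1_LEN = 20
MAX_PKT_LEN = 0xFFFF
IDS_PER_PKT = (MAX_PKT_LEN - 4) // SHA1_LEN  # 3276 ids fit in one pkt-line


def pkt_line(payload: bytes) -> bytes:
    """Prefix payload with its total length as four hex digits."""
    return b"%04x" % (len(payload) + 4) + payload


def section(name: str, ids: Sequence[bytes]) -> bytes:
    """Encode ids as one or more (name pkt-line, id pkt-line) pairs."""
    out = bytearray()
    for start in range(0, len(ids), IDS_PER_PKT):
        chunk = ids[start:start + IDS_PER_PKT]
        if any(len(i) != SHA1_LEN for i in chunk):
            raise ValueError("ids must be 20-byte binary SHA-1 values")
        out += pkt_line(name.encode("ascii") + b"\n")
        out += pkt_line(b"".join(chunk))
    return bytes(out)


def build_request(want: Sequence[bytes],
                  common: Sequence[bytes] = (),
                  have: Sequence[bytes] = ()) -> bytes:
    """Emit sections in the required order (want, common, have), then a flush."""
    if not want:
        raise ValueError("the want section is required")
    return (section("want", want) + section("common", common)
            + section("have", have) + b"0000")
```

An empty "common" or "have" list simply produces no section, which
matches the rule that those sections are optional.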

  (s) Parse the /upload-pack request.

      Verify all objects in WANT are reachable from refs.  As
      this may require walking backwards through history to
      the very beginning on invalid requests the server may
      use a reasonable limit of commits (e.g. 1000) walked
      beyond any ref tip before giving up.

      If any WANT object is not reachable, send a 409 error:

	S: HTTP/1.1 409 Conflict
	S: Content-Type: application/x-git-error
	S:
	S: %s not reachable

     Create an empty list, S_COMMON.

     If 'common' was sent:

     Load all objects into S_COMMON.

     If 'have' was sent:

     Loop through the objects in the order supplied by the client.
     For each object, if the server has the object reachable from
     a ref, add it to S_COMMON.  If a commit is added to S_COMMON,
     do not add any ancestors, even if they also appear in HAVE.

  (s) Send the /upload-pack response:

	S: HTTP/1.1 200 OK
	S: Content-Type: application/x-git-uploadpack

	S: 000bcommon
	S: 0xxx<S_COMMON list>
	S: 0000

     The stream formatting rules are the same as the request.

     The section "common" details the contents of S_COMMON,
     that is all objects from HAVE that the server also has.

     If the server has found a closed set of objects to pack,
     it replies with the pack and not an x-git-uploadpack response.

	S: HTTP/1.1 200 OK
	S: Content-Type: application/x-git-pack

	S: 000c.PACK...

     The returned stream is the side-band-64k protocol supported
     by the git-upload-pack service, and the pack is embedded into
     stream 1.  Progress messages from the server side may appear
     in stream 2.

  (c) Parse the /upload-pack response:

      If the Content-Type is application/x-git-uploadpack:

      Reset COMMON to the items in S_COMMON.  The new S_COMMON
      should be a superset of the existing COMMON set.

      Remove all items in S_COMMON, and all of their ancestors,
      from C_PENDING.

      Do another compute step.

      If the Content-Type is application/x-git-pack:

      Process the pack stream and update the local refs.
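
The client-side queue from the init step, and the way each compute step
drains it, might be sketched in Python as follows.  This is only an
illustration: the Pending class and its method names are invented, and
the commit graph is a toy dict mapping a commit id to a
(commit_time, parent_ids) pair rather than a real object database.

```python
# Hypothetical sketch of C_PENDING; the graph format is an assumption.
import heapq
from typing import Dict, List, Set, Tuple

Graph = Dict[str, Tuple[int, List[str]]]  # id -> (commit time, parent ids)


class Pending:
    def __init__(self, graph: Graph) -> None:
        self.graph = graph
        self.heap: List[Tuple[int, str]] = []
        self.seen: Set[str] = set()  # commits enter the queue only once

    def add(self, cid: str) -> None:
        if cid in self.seen or cid not in self.graph:
            return
        self.seen.add(cid)
        # Negate the time so the newest commit is popped first.
        heapq.heappush(self.heap, (-self.graph[cid][0], cid))

    def next_batch(self, limit: int = 256) -> List[str]:
        """Pop up to `limit` commits for HAVE, re-inserting their parents."""
        batch: List[str] = []
        while self.heap and len(batch) < limit:
            _, cid = heapq.heappop(self.heap)
            for parent in self.graph[cid][1]:
                self.add(parent)
            batch.append(cid)
        return batch

    def drop_with_ancestors(self, cid: str) -> None:
        """Remove cid and all of its ancestors, as done for each S_COMMON item."""
        dropped: Set[str] = set()
        stack = [cid]
        while stack:
            c = stack.pop()
            if c in dropped or c not in self.graph:
                continue
            dropped.add(c)
            stack.extend(self.graph[c][1])
        self.seen |= dropped  # never queue these again
        self.heap = [e for e in self.heap if e[1] not in dropped]
        heapq.heapify(self.heap)
```

A client would call add() for each of its own ref tips, then alternate
between next_batch() and drop_with_ancestors() until the server sends
a pack or the queue empties.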


POST $ssurl/receive-pack
------------------------

TBD: Still a work in progress.

Uploads a pack and updates refs.  The start of the stream is the
commands to update the refs and the remainder of the stream is the
pack file itself.  See git-receive-pack and its network protocol
in pack-protocol.txt, as this is essentially the same.

	C: POST /path/to/repository.git/receive-pack HTTP/1.1
	C: Content-Type: application/x-git-receivepack
	C: Transfer-Encoding: chunked
	C:
	C: 67
	C: 006795dcfa3633004da0049d3d0fa03f80589cbcaf31 d049f6c27a2244e12041955e262a404c7faba355 refs/heads/maint
	C: 4
	C: 0000
	C: c
	C: PACK
	...
	C: 0

	S: HTTP/1.1 200 OK
	S: Content-type: application/x-git-receive-pack-status
	S: Transfer-Encoding: chunked
	S:
	S: ...<output of receive-pack>...


-- 
Shawn.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: Git-aware HTTP transport
  2008-08-26  1:26 Git-aware HTTP transport Shawn O. Pearce
@ 2008-08-26  2:34 ` H. Peter Anvin
  2008-08-26  3:45   ` Shawn O. Pearce
  2008-08-26 14:58   ` Shawn O. Pearce
  2013-02-13  1:34 ` Git-aware HTTP transport docs H. Peter Anvin
  1 sibling, 2 replies; 57+ messages in thread
From: H. Peter Anvin @ 2008-08-26  2:34 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git

This is a bit more detailed review than I have done in the past (I'm 
actually in town...) so please pardon me for commenting on things that
have been dealt with in the past.

Overall, I really recommend keeping in mind that you're doing a layer on
top of HTTP, and resisting the temptation to delve into details of how
things are to be implemented at the HTTP level.

The HTTP layer, furthermore, has a couple of important properties:

- GET requests may be cached, even if you tell it not to.
   (Some proxies, transparent or not, ignore caching directives.)
- POST requests generally will not be.

So don't implement things as GET requests unless you genuinely can deal 
with the request being cached.  Using POST requests throughout seems 
like a safer bet to me; on the other hand, since the only use of GET is 
obtaining a list of refs the worst thing that can happen, I presume, is 
additional latency for the user behind the proxy.

Again, please don't take this as anything other than technical review 
type criticism.  I'm obviously really happy about the project and want 
to see it happen.

I do have one, very specific question: would the load on the server be 
lower if it was using a stateful protocol (like the standard git 
protocol)?  If there is value in the server maintaining state, then I 
would like to suggest a slightly different protocol.

	-hpa


Shawn O. Pearce wrote:
> 
> HTTP/1.1 Preference
> -------------------
> 
> For performance reasons the HTTP/1.1 chunked transfer encoding
> is used whenever possible to transfer variable length objects.
> This avoids needing to produce large results in memory to compute
> the proper content-length.
> 

This piece is unnecessary; it's a detail of the underlying HTTP layer.

> Detecting Smart Servers
> -----------------------
> 
> HTTP clients can detect a smart Git-aware server by HEADing
> $repo/backend.git-http and looking for a 302 redirect to the
> repository's smart service URL:
> 
> 	C: HEAD /path/to/repository.git/backend.git-http HTTP/1.1
> 
> 	S: HTTP/1.1 302 Found
> 	S: Location: /git/path/to/repository.git
> 
> A dumb server would respond with a 304 Not Found (or 200 OK).
> 
> Smart servers may send a redirect to any URL that does not
> contain query args (e.g. "foo?repo=path.git" is invalid).
> The URL must be sufficient to provide the location of the
> repository to the smart service code.
> 
> A valid redirect can be to yourself, for example:
> 
> 	C: HEAD /path/to/repository.git/backend.git-http HTTP/1.1
> 
> 	S: HTTP/1.1 302 Found
> 	S: Location: /path/to/repository.git/backend.git-http/.
> 
> All subsequent communcation for this transaction is done through
> the smart service URL ($ssurl), not the original URL.

I actually suggest embedding the forwarding URL into an ordinary
payload.  Instead of a HEAD request here, do a GET (or, even
better, a POST) and get the redirected URL in return.

Why?  Because it's common enough to redirect entire trees, and use of 
HTTP-layer redirections here is an unnecessary layering violation.

If you insist on using an HTTP status code, I would claim that 303 is a
better status code.

> GET $ssurl/refs
> ---------------
> 
> Obtains the available refs from the remote repository.  The response
> is a sequence of refs, one per Git packet line.  The final packet
> line has a length of 0 to indicate the end.  This is basically
> the same protocol that is used by the git-upload-pack service to
> advertise the available refs.
> 
> 	C: GET $ssurl/refs HTTP/1.1
> 
> 	S: HTTP/1.1 200 OK
> 	S: Content-Type: application/x-git-refs
> 	S:
> 	S: 003295dcfa3633004da0049d3d0fa03f80589cbcaf31 HEAD
> 	S: 003e95dcfa3633004da0049d3d0fa03f80589cbcaf31 refs/heads/maint
> 	S: 003fd049f6c27a2244e12041955e262a404c7faba355 refs/heads/master
> 	S: 003b2cb58b79488a98d2721cea644875a8dd0026b115 refs/heads/pu
> 	S: 0000
> 
> POST $ssurl/upload-pack
> -----------------------
> 
> Prepares an estimated minimal pack to transfer new objects to the
> client.
> 
> The computation to select the minimal pack proceeds as follows
> (c = client, s = server):
> 
>  init step:
>  (c) Use /refs to obtain the advertised refs.
>  (c) Place any object seen in /refs into set ADVERTISED.
> 
>  (c) Build a set, WANT, of the objects from ADVERTISED the client
>      wants to fetch, based on what it saw from /refs.
> 
>  (c) Start a queue, C_PENDING, ordered by commit time (popping newest
>      first).  Add all client refs.  When a commit is popped from the
>      queue its parents should be automatically inserted back.  Commits
>      should only enter the queue once.
> 
>  one compute step:
>  (c) Send a /upload-pack request:
> 
> 	C: POST $ssurl/upload-pack HTTP/1.1
> 	C: Content-Type: application/x-git-uploadpack

Instead of "application/x-git-blah" I would suggest using 
"application/x-git; action=blah"; that way we can probably even register 
application/git with IANA.

> 	C: Content-Length: ...
> 	C:
> 	C: 0009want
> 	C: 0xxx<WANT list>
> 	C: 000bcommon
> 	C: 0xxx<COMMON list>
> 	C: 0009have
> 	C: 0xxx<HAVE list>
> 	C: 0000
> 
>      The stream is organized into "sections", where each section is
>      composed of two git pkt-lines.  The first pkt-line provides the
>      name of the section ("want", "have", "common").  The second
>      pkt-line has the binary SHA-1 ids which compose that section.
> 
>      The "want" section is required.  The other sections ("have",
>      "common") are optional.  A missing "want" section should be
>      answered with a "400 Bad Request".
> 
>      Sections must appear in the following order, if they appear
>      at all in the request stream:
> 
>        * want
>        * common
>        * have
> 
>      Each section may appear multiple times.  Client implementions
>      are encouraged to use as few sections as possible, however the
>      limit of 64k per pkt-line limits the number of ids to 3,276 per
>      section entry.
> 
>      The stream is terminated by a pkt-line flush ("0000").
> 
>      The HAVE list is created by popping the first 256 commits
>      from C_PENDING.  Less can be supplied if C_PENDING empties.
> 
>   (s) Parse the /upload-pack request.
> 
>       Verify all objects in WANT are reachable from refs.  As
>       this may require walking backwards through history to
>       the very beginning on invalid requests the server may
>       use a reasonable limit of commits (e.g. 1000) walked
>       beyond any ref tip before giving up.
> 
>       If any WANT object is not reachable, send a 409 error:

Again, I think the 409 error code here is an unnecessary layering 
violation.  It's simply Yet Another Thing that an HTTP proxy can screw 
up.  Having the HTTP server return a normal 200 reply (meaning that the 
*transport* succeeded) and have the error embedded in a lower layer 
should avoid that class of problems.

> 	S: HTTP/1.1 409 Conflict
> 	S: Content-Type: application/x-git-error
> 	S:
> 	S: %s not reachable
> 
>      Create an empty list, S_COMMON.
> 
>      If 'common' was sent:
> 
>      Load all objects into S_COMMON.
> 
>      If 'have' was sent:
> 
>      Loop through the objects in the order supplied by the client.
>      For each object, if the server has the object reachable from
>      a ref, add it to S_COMMON.  If a commit is added to S_COMMON,
>      do not add any ancestors, even if they also appear in HAVE.
> 
>   (s) Send the /upload-pack response:
> 
> 	S: HTTP/1.1 200 OK
> 	S: Content-Type: application/x-git-uploadpack
> 
> 	S: 000bcommon
> 	S: 0xxx<S_COMMON list>
> 	S: 0000
> 
>      The stream formatting rules are the same as the request.
> 
>      The section "common" details the contents of S_COMMON,
>      that is all objects from HAVE that the server also has.
> 
>      If the server has found a closed set of objects to pack,
>      it replies with the pack and not x-git-uploadpack response.
> 
> 	S: HTTP/1.1 200 OK
> 	S: Content-Type: application/x-git-pack
> 
> 	S: 000c.PACK...
> 
>      The returned stream is the side-band-64k protocol supported
>      by the git-upload-pack service, and the pack is embedded into
>      stream 1.  Progress messages from the server side may appear
>      in stream 2.
> 
>   (c) Parse the /upload-pack response:
> 
>       If the Content-Type is application/x-git-uploadpack:
> 
>       Reset COMMON to the items in S_COMMON.  The new S_COMMON
>       should be a superset of the existing COMMON set.
> 
>       Remove all items in S_COMMON, and all of their ancestors,
>       from PENDING.
> 
>       Do another /compute-common step.
> 
>       If the Content-Type is application/x-git-pack:
> 
>       Process the pack stream and update the local refs.
> 
> 
> POST $ssurl/receive-pack
> ------------------------
> 
> TBD: Still a work in progress.
> 
> Uploads a pack and updates refs.  The start of the stream is the
> commands to update the refs and the remainder of the stream is the
> pack file itself.  See git-receive-pack and its network protocol
> in pack-protocol.txt, as this is essentially the same.
> 
> 	C: POST /path/to/repository.git/receive-pack HTTP/1.0
> 	C: Content-Type: application/x-git-receivepack
> 	C: Transfer-Encoding: chunked
> 	C:
> 	C: 103
> 	C: 006395dcfa3633004da0049d3d0fa03f80589cbcaf31 d049f6c27a2244e12041955e262a404c7faba355 refs/heads/maint
> 	C: 4
> 	C: 0000
> 	C: 12
> 	C: PACK
> 	...
> 	C: 0
> 
> 	S: HTTP/1.0 200 OK
> 	S: Content-type: application/x-git-receive-pack-status
> 	S: Transfer-Encoding: chunked
> 	S:
> 	S: ...<output of receive-pack>...
> 
> 


* Re: Git-aware HTTP transport
  2008-08-26  2:34 ` H. Peter Anvin
@ 2008-08-26  3:45   ` Shawn O. Pearce
  2008-08-26  3:59     ` david
  2008-08-26  4:14     ` H. Peter Anvin
  2008-08-26 14:58   ` Shawn O. Pearce
  1 sibling, 2 replies; 57+ messages in thread
From: Shawn O. Pearce @ 2008-08-26  3:45 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: git

"H. Peter Anvin" <hpa@zytor.com> wrote:
> So don't implement things as GET requests unless you genuinely can deal  
> with the request being cached.  Using POST requests throughout seems  
> like a safer bet to me; on the other hand, since the only use of GET is  
> obtaining a list of refs the worst thing that can happen, I presume, is  
> additional latency for the user behind the proxy.

This is a good point.  There is probably not any reason to cache the
refs content if we don't also support caching the pack files.  So in
this latest draft I have moved the ref listing to also be a POST.

> I do have one, very specific question: would the load on the server be  
> lower if it was using a stateful protocol (like the standard git  
> protocol)?  If there is value in the server maintaining state, then I  
> would like to suggest a slightly different protocol.

It's possible the load would be lower, but it would complicate the
server implementation considerably.  Looking at the algorithm used
to compute upload-pack the server has relatively little to do in
any request.

Validation of WANT should be just matching the requested objects
against the refs; most clients will be asking for the current tips.
A client that started the process just before a fast-forward push
may incur a walk of at most a few hundred commits during its last
couple of computation round-trips.

Marking commits COMMON is just a matter of looking them up in the
database and setting their flags.

Evaluation of the HAVE list avoids duplicates in a well-behaved
client.  So the server sees each candidate commit from a client
only once, even if it spans multiple upload-pack requests.

So I think the cost may actually break even with a stateful protocol
if we imagine that the server is actually a farm of systems and
simple round-robin load-balancing is being done in front of the
Git-aware server.

I'd really like to keep the protocol stateless on the server side, as
this makes it easier to embed into certain commercial server farms.

> Shawn O. Pearce wrote:
>>
>> HTTP/1.1 Preference
>
> This piece is unnecessary; it's a detail of the underlying HTTP layer.

Gone from the latest draft.

>> Detecting Smart Servers
...
> I actually suggest embedding the forwarding URL into an ordinary  
> payload.  Instead of a HEAD request here, then do a GET (or, even  
> better, POST) and get the redirected URL in return.
>
> Why?  Because it's common enough to redirect entire trees, and use of  
> HTTP-layer redirections here is an unnecessary layering violation.

This has been completely rewritten to not use URL redirection at all.

--8<--
Smart HTTP transfer protocols
=============================

Git supports two HTTP based transfer protocols: a "dumb" protocol,
which requires only a standard HTTP server on the server end of the
connection, and a "smart" protocol, which requires a Git aware CGI
(or server module).  This document describes the "smart" protocol.

As a design feature smart clients can automatically translate and
upgrade "dumb" protocol URLs.  This permits all users to have the
same published URL, with the peers automatically choosing to use
the most efficient transport available to them.

HTTP Transport
--------------

All requests are encoded as HTTP POST requests to the smart service
URL, "$url/backend.git-http/$service".

All responses are encoded as 200 OK responses, even if the server
side has "failed" the request.  Service specific success/failure
codes are embedded in the content.

Authentication
--------------

Standard HTTP authentication is used if authentication is required
to access a repository, and must be configured and enforced by the
HTTP server software itself.

Stateless
---------

The protocol, much like its underlying HTTP, is stateless, from the
perspective of the HTTP server side.  All state must be retained and
managed by the client.  This permits round-robin load-balancing on
the server side, among many other implementation details.

Content Type
------------

All requests/responses use "application/x-git" as the content type.
Action specific subtypes are specified by the parameter "service",
e.g. "application/x-git; service=upload-pack".

Detecting Smart Servers
-----------------------

HTTP clients can detect a smart Git-aware server by sending
a request to service "show-ref".

A Git-aware server will respond with a valid response (see below).
A dumb server should respond with an error message. 

Service show-ref
----------------

Obtains the available refs from the remote repository.

URL: $url/backend.git-http/show-ref
Content-Type: application/x-git; service=show-ref

The request is an empty body.

The response is a sequence of refs, one per Git packet line.
The final packet line has a length of 0 to indicate the end.

	S: 003295dcfa3633004da0049d3d0fa03f80589cbcaf31 HEAD
	S: 003e95dcfa3633004da0049d3d0fa03f80589cbcaf31 refs/heads/maint
	S: 003fd049f6c27a2244e12041955e262a404c7faba355 refs/heads/master
	S: 003b2cb58b79488a98d2721cea644875a8dd0026b115 refs/heads/pu
	S: 0000
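
For illustration, a client might decode a show-ref response along these
lines.  This is a minimal Python sketch under the assumption that each
ref line ends in a newline (which is what the length prefixes in the
example imply); read_pkt_lines and parse_refs are hypothetical names.

```python
# Illustrative decoder; function names are made up for this sketch.
from typing import Dict, Iterator


def read_pkt_lines(data: bytes) -> Iterator[bytes]:
    """Yield each pkt-line payload until the zero-length flush packet."""
    pos = 0
    while True:
        header = data[pos:pos + 4]
        if len(header) < 4:
            raise ValueError("stream ended before the flush packet")
        length = int(header.decode("ascii"), 16)
        if length == 0:
            return  # flush packet: end of the listing
        if length < 4 or pos + length > len(data):
            raise ValueError("bad pkt-line length")
        yield data[pos + 4:pos + length]
        pos += length


def parse_refs(data: bytes) -> Dict[str, str]:
    """Map each advertised ref name to its hex SHA-1."""
    refs: Dict[str, str] = {}
    for payload in read_pkt_lines(data):
        sha1_hex, name = payload.rstrip(b"\n").decode("ascii").split(" ", 1)
        refs[name] = sha1_hex
    return refs
```

A dumb server's error page would fail the length check or the hex
parse, which gives the client a simple way to fall back.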

Service upload-pack
-------------------

Prepares an estimated minimal pack to transfer new objects to the
client.

URL: $url/backend.git-http/upload-pack
Content-Type: application/x-git; service=upload-pack

The computation to select the minimal pack proceeds as follows
(c = client, s = server):

 init step:
 (c) Use show-ref to obtain the advertised refs.
 (c) Place any object seen in show-ref into set ADVERTISED.

 (c) Build a set, WANT, of the objects from ADVERTISED the client
     wants to fetch, based on what it saw from show-ref.

 (c) Start a queue, C_PENDING, ordered by commit time (popping newest
     first).  Add all client refs.  When a commit is popped from the
     queue its parents should be automatically inserted back.  Commits
     should only enter the queue once.

 one compute step:
 (c) Send an upload-pack request:

	C: 0009want
	C: 0xxx<WANT list>
	C: 000bcommon
	C: 0xxx<COMMON list>
	C: 0009have
	C: 0xxx<HAVE list>
	C: 0000

     The stream is organized into "sections", where each section is
     composed of two git pkt-lines.  The first pkt-line provides the
     name of the section ("want", "have", "common").  The second
     pkt-line has the binary SHA-1 ids which compose that section.

     The "want" section is required.  The other sections ("have",
     "common") are optional.  A missing "want" section should be
     answered with an error.

     Sections must appear in the following order, if they appear
     at all in the request stream:

       * want
       * common
       * have

     Each section may appear multiple times.  Client implementations
     are encouraged to use as few sections as possible; however, the
     64k limit per pkt-line caps the number of ids at 3,276 per
     section entry.

     The stream is terminated by a pkt-line flush ("0000").

     The HAVE list is created by popping the first 256 commits
     from C_PENDING.  Fewer can be supplied if C_PENDING empties.

  (s) Parse the upload-pack request:

      Verify all objects in WANT are reachable from refs.  As
      this may require walking backwards through history to
      the very beginning on invalid requests the server may
      use a reasonable limit of commits (e.g. 1000) walked
      beyond any ref tip before giving up.

      If no WANT objects are received, send an error:

	S: 0019status error no want

      If any WANT object is not reachable, send an error:

	S: 001estatus error invalid want

     Create an empty list, S_COMMON.

     If 'common' was sent:

     Load all objects into S_COMMON.

     If 'have' was sent:

     Loop through the objects in the order supplied by the client.
     For each object, if the server has the object reachable from
     a ref, add it to S_COMMON.  If a commit is added to S_COMMON,
     do not add any ancestors, even if they also appear in HAVE.
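
The server-side handling of 'common' and 'have' described above could
look roughly like this in Python.  It is a sketch, not the actual
algorithm of any server: the graph is a toy dict of commit id to parent
ids, the function names are invented, and treating ancestors of the
client's COMMON entries as already covered is one interpretation of
the text.

```python
# Hypothetical sketch of building S_COMMON on the server.
from typing import Dict, Iterable, List, Set

Graph = Dict[str, List[str]]  # commit id -> parent ids


def ancestors(graph: Graph, start: str) -> Set[str]:
    """All commits reachable from start through parent links, excluding start."""
    seen: Set[str] = set()
    stack = list(graph.get(start, ()))
    while stack:
        c = stack.pop()
        if c in seen:
            continue
        seen.add(c)
        stack.extend(graph.get(c, ()))
    return seen


def compute_s_common(graph: Graph, ref_tips: Iterable[str],
                     common: Iterable[str], have: Iterable[str]) -> List[str]:
    reachable: Set[str] = set()
    for tip in ref_tips:
        if tip in graph:
            reachable.add(tip)
            reachable |= ancestors(graph, tip)

    s_common = list(common)  # 'common' entries are loaded as-is
    covered: Set[str] = set()
    for c in s_common:
        covered |= ancestors(graph, c)

    for obj in have:  # keep the order supplied by the client
        if obj not in reachable or obj in covered or obj in s_common:
            continue
        s_common.append(obj)
        covered |= ancestors(graph, obj)  # skip this commit's ancestors later
    return s_common
```

Because the client sends HAVE newest first, the first match on a line
of history usually suppresses everything older on that line.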

  (s) Send the upload-pack response:

     If the server has found a closed set of objects to pack,
     it replies with the pack.

	S: 0010status pack
	S: 000c.PACK...

     The returned stream is the side-band-64k protocol supported
     by the git-upload-pack service, and the pack is embedded into
     stream 1.  Progress messages from the server side may appear
     in stream 2.

     If the server wants more information, it replies with a
     status continue response:

	S: 0014status continue
	S: 000bcommon
	S: 0xxx<S_COMMON list>
	S: 0000

     The stream formatting rules are the same as the request.

     The section "common" details the contents of S_COMMON,
     that is all objects from HAVE that the server also has.

  (c) Parse the upload-pack response:

      If the status pkt-line is "status pack":

      Process the pack stream and update the local refs.

      If the status pkt-line is "status continue":

      Reset COMMON to the items in S_COMMON.  The new S_COMMON
      should be a superset of the existing COMMON set.

      Remove all items in S_COMMON, and all of their ancestors,
      from C_PENDING.

      Do another compute step.
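
A client's dispatch on the status pkt-line might be sketched as below.
This is illustrative Python only: the function names and the returned
tuples are invented, the side-band pack stream is handed back unparsed,
and the status lines are assumed to end in a newline, as their length
prefixes suggest.

```python
# Hypothetical response parser for the "status ..." pkt-line.
from typing import List, Optional, Tuple, Union

SHA1_LEN = 20


def split_first_pkt(data: bytes) -> Tuple[Optional[bytes], bytes]:
    """Return (payload, rest); payload is None for a flush packet."""
    if len(data) < 4:
        raise ValueError("truncated pkt-line header")
    length = int(data[:4].decode("ascii"), 16)
    if length == 0:
        return None, data[4:]
    if length < 4 or length > len(data):
        raise ValueError("bad pkt-line length")
    return data[4:length], data[length:]


def parse_upload_pack_response(
        data: bytes) -> Tuple[str, Union[bytes, str, List[bytes]]]:
    payload, rest = split_first_pkt(data)
    if payload is None:
        raise ValueError("missing status pkt-line")
    words = payload.rstrip(b"\n").decode("ascii").split(" ", 2)
    if len(words) < 2 or words[0] != "status":
        raise ValueError("missing status pkt-line")
    kind = words[1]
    if kind == "pack":
        return "pack", rest  # side-band-64k stream, left for the caller
    if kind == "error":
        return "error", words[2] if len(words) > 2 else ""
    if kind == "continue":
        ids: List[bytes] = []
        while True:
            name, rest = split_first_pkt(rest)
            if name is None:
                break  # flush ends the section list
            if name.rstrip(b"\n") != b"common":
                raise ValueError("unexpected section")
            blob, rest = split_first_pkt(rest)
            if blob is None or len(blob) % SHA1_LEN:
                raise ValueError("malformed id list")
            ids.extend(blob[i:i + SHA1_LEN]
                       for i in range(0, len(blob), SHA1_LEN))
        return "continue", ids  # the new S_COMMON
    raise ValueError("unknown status")
```

On "continue" the caller would replace COMMON with the returned ids and
run another compute step; on "pack" it would feed the remaining bytes
to its side-band reader.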


Service receive-pack
--------------------

Uploads a pack and updates refs.

URL: $url/backend.git-http/receive-pack
Content-Type: application/x-git; service=receive-pack

The start of the stream is the commands to update the refs and
the remainder of the stream is the pack file itself.  See
git-receive-pack and its network protocol in pack-protocol.txt,
as this is essentially the same.

	C: 006795dcfa3633004da0049d3d0fa03f80589cbcaf31 d049f6c27a2244e12041955e262a404c7faba355 refs/heads/maint
	C: 0000
	C: PACK...

	S: ...<output of receive-pack>...


-- 
Shawn.


* Re: Git-aware HTTP transport
  2008-08-26  3:45   ` Shawn O. Pearce
@ 2008-08-26  3:59     ` david
  2008-08-26  4:15       ` H. Peter Anvin
  2008-08-26 17:01       ` Nicolas Pitre
  2008-08-26  4:14     ` H. Peter Anvin
  1 sibling, 2 replies; 57+ messages in thread
From: david @ 2008-08-26  3:59 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: H. Peter Anvin, git

On Mon, 25 Aug 2008, Shawn O. Pearce wrote:

> "H. Peter Anvin" <hpa@zytor.com> wrote:
>> So don't implement things as GET requests unless you genuinely can deal
>> with the request being cached.  Using POST requests throughout seems
>> like a safer bet to me; on the other hand, since the only use of GET is
>> obtaining a list of refs the worst thing that can happen, I presume, is
>> additional latency for the user behind the proxy.
>
> This is a good point.  There is probably not any reason to cache the
> refs content if we don't also support caching the pack files.  So in
> this latest draft I have moved the ref listing to also be a POST.

on the other hand, it would be a good thing if pack files could be cached.

in a peer-peer git environment the cache would not be used very much, but 
when you have a large number of people tracking a central repository (or 
even a pseudo-central one like the kernel) you have a lot of people 
upgrading from one point to the next point.

and for cloning (and especially things like linux-next where you
essentially re-clone daily) letting the pack get cached is probably a very
good thing.

I know it would be another round-trip, but how painful would it be to
compute what the contents of a pack would be (what objects would be in it,
not calculating the deltas necessary for a full pack file), and return
that to the client so that the client could do a GET for the pack itself?

if that exact pack happens to be in the cache, great, if not the server 
takes the data from the client and creates a pack file with those objects 
in it.

David Lang


* Re: Git-aware HTTP transport
  2008-08-26  3:45   ` Shawn O. Pearce
  2008-08-26  3:59     ` david
@ 2008-08-26  4:14     ` H. Peter Anvin
  1 sibling, 0 replies; 57+ messages in thread
From: H. Peter Anvin @ 2008-08-26  4:14 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git

Shawn O. Pearce wrote:
> 
> So I think the cost may actually break even with a stateful protocol
> if we imagine that the server is actually a farm of systems and
> simple round-robin load-balancing is being done in front of the
> Git-aware server.
> 
> I'd really like to keep the protocol stateless on the server side, as
> this makes it easier to embed into certain commerical server farms.
> 

Indeed.  It was a question, not a statement of any sort.  I was curious 
about the answer.

I really like the new draft, with the one consideration below.

> 
> HTTP Transport
> --------------
> 
> All requests are encoded as HTTP POST requests to the smart service
> URL, "$url/backend.git-http/$service".
> 
> All responses are encoded as 200 Ok responses, even if the server
> side has "failed" the request.  Service specific success/failure
> codes are embedded in the content.
> 

I still would like to have an indirection step at the start, in order to
keep a single client on a server in the case of skew.  I suggest simply
doing it as an HTTP POST to $url/backend.git-http with an empty body,
which returns a URL prefix to use for the remainder of the session.  That
way a server that wants a stateful setup can return a URL which contains
a session cookie; others can return a URL containing a target server, and
finally others can simply return the requesting URL.

	-hpa

* Re: Git-aware HTTP transport
  2008-08-26  3:59     ` david
@ 2008-08-26  4:15       ` H. Peter Anvin
  2008-08-26  4:25         ` david
  2008-08-26 17:01       ` Nicolas Pitre
  1 sibling, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2008-08-26  4:15 UTC (permalink / raw)
  To: david; +Cc: Shawn O. Pearce, git

david@lang.hm wrote:
> 
> on the other hand, it would be a good thing if pack files could be cached.
> 
> in a peer-peer git environment the cache would not be used very much, 
> but when you have a large number of people tracking a central repository 
> (or even a pseudo-central one like the kernel) you have a lot of people 
> upgrading from one point to the next point.
> 

Worth noting that this also applies to the raw git protocol.

	-hpa

* Re: Git-aware HTTP transport
  2008-08-26  4:15       ` H. Peter Anvin
@ 2008-08-26  4:25         ` david
  2008-08-26  4:42           ` H. Peter Anvin
  2008-08-26  4:45           ` Imran M Yousuf
  0 siblings, 2 replies; 57+ messages in thread
From: david @ 2008-08-26  4:25 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Shawn O. Pearce, git

On Mon, 25 Aug 2008, H. Peter Anvin wrote:

> david@lang.hm wrote:
>> 
>> on the other hand, it would be a good thing if pack files could be cached.
>> 
>> in a peer-peer git environment the cache would not be used very much, but 
>> when you have a large number of people tracking a central repository (or 
>> even a pseudo-central one like the kernel) you have a lot of people 
>> upgrading from one point to the next point.
>> 
>
> Worth noting that this also applies to the raw git protocol.

IIRC the native git server will use existing packs when it can.

it would be interesting to modify git to record what packs it generates 
and then see how much a big server (like kernel.org) would re-use a pack 
under different caching strategies.

David Lang

* Re: Git-aware HTTP transport
  2008-08-26  4:25         ` david
@ 2008-08-26  4:42           ` H. Peter Anvin
  2008-08-26  4:45           ` Imran M Yousuf
  1 sibling, 0 replies; 57+ messages in thread
From: H. Peter Anvin @ 2008-08-26  4:42 UTC (permalink / raw)
  To: david; +Cc: Shawn O. Pearce, git

david@lang.hm wrote:
> On Mon, 25 Aug 2008, H. Peter Anvin wrote:
> 
>> david@lang.hm wrote:
>>>
>>> on the other hand, it would be a good thing if pack files could be 
>>> cached.
>>>
>>> in a peer-peer git environment the cache would not be used very much, 
>>> but when you have a large number of people tracking a central 
>>> repository (or even a pseudo-central one like the kernel) you have a 
>>> lot of people upgrading from one point to the next point.
>>>
>>
>> Worth noting that this also applies to the raw git protocol.
> 
> IIRC the native git server will use existing packs when it can.
> 

Yes (and the smart http server should, too).  However, neither of them 
can currently generate new packfiles and save them for future use in a 
separate directory from the repository tree.

	-hpa

* Re: Git-aware HTTP transport
  2008-08-26  4:25         ` david
  2008-08-26  4:42           ` H. Peter Anvin
@ 2008-08-26  4:45           ` Imran M Yousuf
  1 sibling, 0 replies; 57+ messages in thread
From: Imran M Yousuf @ 2008-08-26  4:45 UTC (permalink / raw)
  To: david; +Cc: H. Peter Anvin, Shawn O. Pearce, git

On Tue, Aug 26, 2008 at 10:25 AM,  <david@lang.hm> wrote:
> On Mon, 25 Aug 2008, H. Peter Anvin wrote:
>
>> david@lang.hm wrote:
>>>
>>> on the other hand, it would be a good thing if pack files could be
>>> cached.
>>>
>>> in a peer-peer git environment the cache would not be used very much, but
>>> when you have a large number of people tracking a central repository (or
>>> even a pseudo-central one like the kernel) you have a lot of people
>>> upgrading from one point to the next point.
>>>
>>
>> Worth noting that this also applies to the raw git protocol.
>
> IIRC the native git server will use existing packs when it can.
>
> it would be interesting to modify git to record what packs it generates and
> then see how much a big server (like kernel.org) would re-use a pack under
> different caching strategies.

I fully agree with the caching logic as well.  In this regard I was
wondering whether the protocol could be modified a bit to accommodate
it.  GET was dropped from the initial proposal because there will be
caching, which I also agree with :), but we need GET in order to make
use of a cache.  So I would have done something like this: the initial
request would be a POST, and if there is no change and the cache can be
used I would redirect it to an equivalent GET URL; if the cache is
invalid (which the server can track by pinging the GET URL) I would
serve directly through the POST method until either the GET is out of
the cache or is updated.

- Imran

>
> David Lang
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



-- 
Imran M Yousuf
Email: imran@smartitengineering.com
Blog: http://imyousuf-tech.blogs.smartitengineering.com/
Mobile: +880-1711402557

* Re: Git-aware HTTP transport
  2008-08-26  2:34 ` H. Peter Anvin
  2008-08-26  3:45   ` Shawn O. Pearce
@ 2008-08-26 14:58   ` Shawn O. Pearce
  2008-08-26 16:14     ` Shawn O. Pearce
  2008-08-26 16:33     ` H. Peter Anvin
  1 sibling, 2 replies; 57+ messages in thread
From: Shawn O. Pearce @ 2008-08-26 14:58 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: git

"H. Peter Anvin" <hpa@zytor.com> wrote:
>> Detecting Smart Servers
>> -----------------------
>>
>> HTTP clients can detect a smart Git-aware server by HEADing
>> $repo/backend.git-http and looking for a 302 redirect to the
>> repository's smart service URL:
...
>> All subsequent communication for this transaction is done through
>> the smart service URL ($ssurl), not the original URL.
>
> I actually suggest embedding the forwarding URL into an ordinary  
> payload.  Instead of a HEAD request here, then do a GET (or, even  
> better, POST) and get the redirected URL in return.
>
> Why?  Because it's common enough to redirect entire trees, and use of  
> HTTP-layer redirections here is an unnecessary layering violation.

Hmm.  I'm actually thinking the exact opposite here.  My rationale
for putting the response as a standard HTTP 302/303 style redirect
is to permit hardware load balancers or Apache mod_rewrite rules
to implement simple load balancing with a HTTP redirect.

If we embed the redirect URL into the payload then configuring that
will become a lot more complex.  At the minimum you may have to
make up a dummy file for each server (holding the response payload)
then let mod_rewrite rewrite the request internally to make
Apache serve that file.  Ugly.

> If you insist on using a HTTP status code, I would claim that 303 is a  
> better status code.

Ok.

-- 
Shawn.

* Re: Git-aware HTTP transport
  2008-08-26 14:58   ` Shawn O. Pearce
@ 2008-08-26 16:14     ` Shawn O. Pearce
  2008-08-26 16:33     ` H. Peter Anvin
  1 sibling, 0 replies; 57+ messages in thread
From: Shawn O. Pearce @ 2008-08-26 16:14 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: git

"Shawn O. Pearce" <spearce@spearce.org> wrote:
> "H. Peter Anvin" <hpa@zytor.com> wrote:
> >
> > I actually suggest embedding the forwarding URL into an ordinary  
> > payload.  Instead of a HEAD request here, then do a GET (or, even  
> > better, POST) and get the redirected URL in return.
> 
> Hmm.  I'm actually thinking the exact opposite here.

Here's the delta from the last draft I emailed.  It's basically just
about this redirect stuff.

diff --git a/Documentation/technical/http-protocol.txt b/Documentation/technical/http-protocol.txt
index 99d7623..a3f7379 100644
--- a/Documentation/technical/http-protocol.txt
+++ b/Documentation/technical/http-protocol.txt
@@ -43,14 +43,40 @@ All requests/responses use "application/x-git" as the content type.
 Action specific subtypes are specified by the parameter "service",
 e.g. "application/x-git; service=upload-pack".
 
+Redirects
+---------
+
+If a POST request results in an HTTP 302 or 303 redirect response
+clients should retry the request by updating the URL and POSTing
+the request to the new location.
+
+If the new request is successful clients should trim off the
+trailing "/backend.git/$service" portion of the new location
+and use the remainder as the base URL for future requests in
+the same transaction.
+
+This redirection permits Apache's mod_rewrite (and many other
+servers) to implement a form of round-robin load balancing by
+redirecting all requests to a generic host to a specific host.
+
 Detecting Smart Servers
 -----------------------
 
 HTTP clients can detect a smart Git-aware server by sending
 a request to service "show-ref".
 
-A Git-aware server will respond with a valid response (see below).
-A dumb server should respond with an error message. 
+A Git-aware server will respond with a valid response.  Clients
+must check the following properties to prevent being fooled by
+misconfigured servers:
+
+  * HTTP status code is 200.
+  * Content-Type is "application/x-git; service=show-ref"
+  * The body can be parsed without errors.  The length of
+    each pkt-line must be 4 valid hex digits.
+
+A dumb server will respond with a non-200 HTTP status code.
+A misconfigured server may respond with a normal 200 status
+code, but an incorrect content type.
 
 Service show-ref
 ----------------

-- 
Shawn.

* Re: Git-aware HTTP transport
  2008-08-26 14:58   ` Shawn O. Pearce
  2008-08-26 16:14     ` Shawn O. Pearce
@ 2008-08-26 16:33     ` H. Peter Anvin
  2008-08-26 17:26       ` Shawn O. Pearce
  1 sibling, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2008-08-26 16:33 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git

Shawn O. Pearce wrote:
> 
> Hmm.  I'm actually thinking the exact opposite here.  My rationale
> for putting the response as a standard HTTP 302/303 style redirect
> is to permit hardware load balancers or Apache mod_rewrite rules
> to implement simple load balancing with a HTTP redirect.
> 
> If we embed the redirect URL into the payload then configuring that
> will become a lot more complex.  At the minimum you may have to
> make up a dummy file for each server (holding the response payload)
> then let mod_rewrite rewrite the request internally to make
> Apache serve that file.  Ugly.
> 

No, you're thinking backwards.  What you want is the standard HTTP 
redirect load balancing to take effect *before* the initial request is 
serviced.  The front-end load balancer will take effect on the initial 
request, and then redirect the request to a node (via a 302 reply.)  The 
target node then sends a self-referencing URL to keep the service local, 
if that is desired -- otherwise it doesn't.

Again, the 300-class redirect is treated as a part of the HTTP transport 
in this case; it doesn't have to be visible to the RPC layer.  However, 
in order to maintain the integrity of an interchange, we do need an 
additional level of redirection visible to the RPC layer.

> If we embed the redirect URL into the payload then configuring that
> will become a lot more complex.  At the minimum you may have to
> make up a dummy file for each server (holding the response payload)
> then let mod_rewrite rewrite the request internally to make
> Apache serve that file.  Ugly.

A very simple CGI/PHP script will do this, and it's really very very 
trivial to set up.

Please keep in mind I'm not talking hypotheticals at all.  What you have 
proposed is actually a lot uglier for kernel.org to implement, simply 
because we try to stay with strict IP-based vhosting.

	-hpa

* Re: Git-aware HTTP transport
  2008-08-26  3:59     ` david
  2008-08-26  4:15       ` H. Peter Anvin
@ 2008-08-26 17:01       ` Nicolas Pitre
  2008-08-26 17:03         ` Shawn O. Pearce
  1 sibling, 1 reply; 57+ messages in thread
From: Nicolas Pitre @ 2008-08-26 17:01 UTC (permalink / raw)
  To: david; +Cc: Shawn O. Pearce, H. Peter Anvin, git

On Mon, 25 Aug 2008, david@lang.hm wrote:

> and for cloning (and especially things like linux-next where you essentially
> re-clone daily) letting the pack get cached is probably a very good thing.

I hope that people recloning linux-next daily are very few.  This is an 
incredible waste of bandwidth, regardless of the protocol used, dumb or 
not.  A standard fetch with a remote tracking branch (with -f or with a 
plus sign on the "fetch" line in your config file) should be all that's 
needed to significantly reduce the amount of data needed to transfer.


Nicolas

* Re: Git-aware HTTP transport
  2008-08-26 17:01       ` Nicolas Pitre
@ 2008-08-26 17:03         ` Shawn O. Pearce
  0 siblings, 0 replies; 57+ messages in thread
From: Shawn O. Pearce @ 2008-08-26 17:03 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: david, H. Peter Anvin, git

Nicolas Pitre <nico@cam.org> wrote:
> On Mon, 25 Aug 2008, david@lang.hm wrote:
> 
> > and for cloning (and especially things like linux-next where you essentially
> > re-clone daily) letting the pack get cached is probably a very good thing.
> 
> I hope that people recloning linux-next daily are very few.  This is an 
> incredible waste of bandwidth, regardless of the protocol used, dumb or 
> not.  A standard fetch with a remote tracking branch (with -f or with a 
> plus sign on the "fetch" line in your config file) should be all that's 
> needed to significantly reduce the amount of data needed to transfer.

Or at least clone with --reference.  You get about the same benefit if
your local reference repository is fairly current, say with a stable
upstream like Linus' own tree.

-- 
Shawn.

* Re: Git-aware HTTP transport
  2008-08-26 16:33     ` H. Peter Anvin
@ 2008-08-26 17:26       ` Shawn O. Pearce
  2008-08-26 22:38         ` H. Peter Anvin
  0 siblings, 1 reply; 57+ messages in thread
From: Shawn O. Pearce @ 2008-08-26 17:26 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: git

"H. Peter Anvin" <hpa@zytor.com> wrote:
> Shawn O. Pearce wrote:
>>
>> Hmm.  I'm actually thinking the exact opposite here.  My rationale
>> for putting the response as a standard HTTP 302/303 style redirect
>> is to permit hardware load balancers [...]
>> to implement simple load balancing with a HTTP redirect.
>
> No, you're thinking backwards.  What you want is the standard HTTP  
> redirect load balancing to take effect *before* the initial request is  
> serviced.
...
> Please keep in mind I'm not talking hypotheticals at all.  What you have  
> proposed is actually a lot uglier for kernel.org to implement, simply  
> because we try to stay with strict IP-based vhosting

Discard my prior patch from today.

This is a patch to last night's full document edition
(http://article.gmane.org/gmane.comp.version-control.git/93704)
and addresses only the issue of redirects.

--8<--
diff --git a/Documentation/technical/http-protocol.txt b/Documentation/technical/http-protocol.txt
index 99d7623..99dc88d 100644
--- a/Documentation/technical/http-protocol.txt
+++ b/Documentation/technical/http-protocol.txt
@@ -43,14 +43,34 @@ All requests/responses use "application/x-git" as the content type.
 Action specific subtypes are specified by the parameter "service",
 e.g. "application/x-git; service=upload-pack".
 
+HTTP Redirects
+--------------
+
+If a POST request results in an HTTP 302 or 303 redirect response
+clients should retry the request by updating the URL and POSTing
+the same request to the new location.  Subsequent requests should
+still be sent to the original URL.
+
 Detecting Smart Servers
 -----------------------
 
 HTTP clients can detect a smart Git-aware server by sending
 a request to service "show-ref".
 
-A Git-aware server will respond with a valid response (see below).
-A dumb server should respond with an error message. 
+A Git-aware server will respond with a valid response.  Clients
+must check the following properties to prevent being fooled by
+misconfigured servers:
+
+  * HTTP status code is 200.
+  * Content-Type is "application/x-git; service=show-ref"
+  * The body can be parsed without errors.  The length of
+    each pkt-line must be 4 valid hex digits.
+
+A dumb server will respond with a non-200 HTTP status code.
+A misconfigured server may respond with a normal 200 status
+code, but an incorrect content type, or an invalid leading
+4 byte sequence for a pkt-line (e.g. "<htm" or "<!DO" are
+not valid lengths).
 
 Service show-ref
 ----------------
@@ -62,15 +82,46 @@ Content-Type: application/x-git; service=show-ref
 
 The request is an empty body.
 
-The response is a sequence of refs, one per Git packet line.
-The final packet line has a length of 0 to indicate the end.
+The response is a pkt-line with "refs", followed by zero
+or more ref pkt-lines ("$id $name"), and a final pkt-line
+with a length of 0:
 
+	S: 0009refs
 	S: 003295dcfa3633004da0049d3d0fa03f80589cbcaf31 HEAD
 	S: 003e95dcfa3633004da0049d3d0fa03f80589cbcaf31 refs/heads/maint
 	S: 003fd049f6c27a2244e12041955e262a404c7faba355 refs/heads/master
 	S: 003b2cb58b79488a98d2721cea644875a8dd0026b115 refs/heads/pu
 	S: 0000
 
+The response may begin with an optional redirect to a new service
+URL for the repository:
+
+	S: 0028redirect http://s1.example.com/git/
+	S: 0009refs
+	S: 003295dcfa3633004da0049d3d0fa03f80589cbcaf31 HEAD
+	S: 003fd049f6c27a2244e12041955e262a404c7faba355 refs/heads/master
+	S: 0000
+
+or be composed of only a redirect:
+
+	S: 0028redirect http://s1.example.com/git/
+	S: 0000
+
+If a redirect is returned the client should update itself
+to use the new URL as the location for future requests.
+A server may use the redirect to request that the client
+"pin" itself to a particular server for the remainder of
+the current transaction.
+
+The URL listed in any redirect should be the base URL
+without any query args.  The client will automatically
+append "/backend.git-http/$service" as it makes each
+future request.
+
+If no "refs" line was received in the response, but
+a "redirect" was received, the client should retry
+its request at the new location before giving up.
+
 Service upload-pack
 -------------------
 
-- 
Shawn.

* Re: Git-aware HTTP transport
  2008-08-26 17:26       ` Shawn O. Pearce
@ 2008-08-26 22:38         ` H. Peter Anvin
  2008-08-27  2:51           ` Imran M Yousuf
  2008-08-28  3:50           ` Shawn O. Pearce
  0 siblings, 2 replies; 57+ messages in thread
From: H. Peter Anvin @ 2008-08-26 22:38 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git

Shawn O. Pearce wrote:
> 
> Discard my prior patch from today.
> 
> This is a patch to last night's full document edition
> (http://article.gmane.org/gmane.comp.version-control.git/93704)
> and addresses only the issue of redirects.
> 

Looks great to me.

	-hpa

* Re: Git-aware HTTP transport
  2008-08-26 22:38         ` H. Peter Anvin
@ 2008-08-27  2:51           ` Imran M Yousuf
  2008-08-28  3:50           ` Shawn O. Pearce
  1 sibling, 0 replies; 57+ messages in thread
From: Imran M Yousuf @ 2008-08-27  2:51 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Shawn O. Pearce, git

On Wed, Aug 27, 2008 at 4:38 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> Shawn O. Pearce wrote:
>>
>> Discard my prior patch from today.
>>
>> This is a patch to last night's full document edition
>> (http://article.gmane.org/gmane.comp.version-control.git/93704)
>> and addresses only the issue of redirects.
>>
>
> Looks great to me.

This looks really good! The redirect idea just seems cool!

- Imran

>
>        -hpa

-- 
Imran M Yousuf
Email: imran@smartitengineering.com
Blog: http://imyousuf-tech.blogs.smartitengineering.com/
Mobile: +880-1711402557

* Re: Git-aware HTTP transport
  2008-08-26 22:38         ` H. Peter Anvin
  2008-08-27  2:51           ` Imran M Yousuf
@ 2008-08-28  3:50           ` Shawn O. Pearce
  2008-08-28  4:37             ` H. Peter Anvin
  2008-08-28  4:42             ` Junio C Hamano
  1 sibling, 2 replies; 57+ messages in thread
From: Shawn O. Pearce @ 2008-08-28  3:50 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: git

"H. Peter Anvin" <hpa@zytor.com> wrote:
>
> Looks great to me.

So this is what may be the final draft of the HTTP protocol.
I've added stuff about capability selection between the peers for
future expansion support.  The upload-pack service has a better
use of it than receive-pack.  Otherwise it is what I think you are
agreeing to above.  ;-)

I'm hoping to start implementing a prototype of this on Friday.
I may do it in JGit first; the transport infrastructure there is
a lot more modular so experimentation should be quicker.  I would
obviously also implement it in C Git, unless someone else comes
along and beats me to it.  This project is only a fraction of my
total Git time in any given week.  :-|

--8<--
Smart HTTP transfer protocols
=============================

Git supports two HTTP based transfer protocols.  A "dumb" protocol
which requires only a standard HTTP server on the server end of the
connection, and a "smart" protocol which requires a Git aware CGI
(or server module).  This document describes the "smart" protocol.

As a design feature smart clients can automatically translate and
upgrade "dumb" protocol URLs.  This permits all users to have the
same published URL, with the peers automatically choosing to use
the most efficient transport available to them.

HTTP Transport
--------------

All requests are encoded as HTTP POST requests to the smart service
URL, "$url/backend.git-http/$service".

All responses are encoded as 200 Ok responses, even if the server
side has "failed" the request.  Service specific success/failure
codes are embedded in the content.

Authentication
--------------

Standard HTTP authentication is used if authentication is required
to access a repository, and must be configured and enforced by the
HTTP server software itself.

Stateless
---------

The protocol, much like its underlying HTTP, is stateless, from the
perspective of the HTTP server side.  All state must be retained and
managed by the client.  This permits round-robin load-balancing on
the server side, among many other implementation details.

Content Type
------------

All requests/responses use "application/x-git" as the content type.
Action specific subtypes are specified by the parameter "service",
e.g. "application/x-git; service=upload-pack".

HTTP Redirects
--------------

If a POST request results in an HTTP 302 or 303 redirect response
clients should retry the request by updating the URL and POSTing
the same request to the new location.  Subsequent requests should
still be sent to the original URL.

Detecting Smart Servers
-----------------------

HTTP clients can detect a smart Git-aware server by sending
a request to service "show-ref".

A Git-aware server will respond with a valid response.  Clients
must check the following properties to prevent being fooled by
misconfigured servers:

  * HTTP status code is 200.
  * Content-Type is "application/x-git; service=show-ref"
  * The body can be parsed without errors.  The length of
    each pkt-line must be 4 valid hex digits.

A dumb server will respond with a non-200 HTTP status code.
A misconfigured server may respond with a normal 200 status
code, but an incorrect content type, or an invalid leading
4 byte sequence for a pkt-line (e.g. "<htm" or "<!DO" are
not valid lengths).
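
The three checks above can be sketched in Python roughly as follows.
This is only an illustration, not part of the draft: the function
names are hypothetical, the content type is compared as a plain
string, and the pkt-line length is assumed to count its own 4 byte
header, as it does in the examples in this document.

```python
import string

HEX_DIGITS = set(string.hexdigits)
EXPECTED_TYPE = "application/x-git; service=show-ref"


def pkt_lines_parse_cleanly(body: bytes) -> bool:
    """Return True if body is a run of well-formed pkt-lines ending in a flush."""
    pos = 0
    while pos < len(body):
        header = body[pos:pos + 4]
        # A length such as b"<htm" or b"<!DO" fails here.
        if len(header) != 4 or any(chr(b) not in HEX_DIGITS for b in header):
            return False
        length = int(header.decode("ascii"), 16)
        if length == 0:
            # "0000" is the flush; nothing may follow it in show-ref.
            return pos + 4 == len(body)
        if length < 4 or pos + length > len(body):
            return False  # impossible length or truncated body
        pos += length
    return False  # ran out of data before seeing the flush


def looks_like_smart_server(status: int, content_type: str, body: bytes) -> bool:
    """Apply the status, Content-Type and body checks in order."""
    if status != 200:
        return False
    if content_type.strip().lower() != EXPECTED_TYPE:
        return False
    return pkt_lines_parse_cleanly(body)
```

A client that gets False back would fall back to the "dumb" protocol
on the same URL.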

Service show-ref
----------------

Obtains the available refs from the remote repository.

URL: $url/backend.git-http/show-ref
Content-Type: application/x-git; service=show-ref

The request is an empty body.

The response is a pkt-line with "refs", followed by zero
or more ref pkt-lines ("$id $name"), and a final pkt-line
with a length of 0:

	S: 0009refs
	S: 003295dcfa3633004da0049d3d0fa03f80589cbcaf31 HEAD
	S: 003e95dcfa3633004da0049d3d0fa03f80589cbcaf31 refs/heads/maint
	S: 003fd049f6c27a2244e12041955e262a404c7faba355 refs/heads/master
	S: 003b2cb58b79488a98d2721cea644875a8dd0026b115 refs/heads/pu
	S: 0000

The response may begin with an optional redirect to a new service
URL for the repository:

	S: 0028redirect http://s1.example.com/git/
	S: 0009refs
	S: 003295dcfa3633004da0049d3d0fa03f80589cbcaf31 HEAD
	S: 003fd049f6c27a2244e12041955e262a404c7faba355 refs/heads/master
	S: 0000

or be composed of only a redirect:

	S: 0028redirect http://s1.example.com/git/
	S: 0000

If a redirect is returned the client should update itself
to use the new URL as the location for future requests.
A server may use the redirect to request that the client
"pin" itself to a particular server for the remainder of
the current transaction.

The URL listed in any redirect should be the base URL
without any query args.  The client will automatically
append "/backend.git-http/$service" as it makes each
future request.

If no "refs" line was received in the response, but
a "redirect" was received, the client should retry
its request at the new location before giving up.
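
As a rough sketch of the client side of show-ref, the Python below
parses a response into an optional redirect and an optional ref map,
and retries at the redirected location when no "refs" line came back.
The names `parse_show_ref`, `fetch_refs` and the `post` callable are
hypothetical, the cap on the number of hops is an assumption that is
not in the draft, and the trailing "/" of the redirect URL is dropped
before the service path is appended.

```python
from typing import Callable, Dict, Iterator, Optional, Tuple

SERVICE_PATH = "/backend.git-http/show-ref"


def iter_pkt_lines(body: bytes) -> Iterator[str]:
    """Yield each pkt-line payload (trailing LF removed) up to the flush."""
    pos = 0
    while True:
        length = int(body[pos:pos + 4].decode("ascii"), 16)
        if length == 0:  # "0000" flush ends the response
            return
        if length < 4 or pos + length > len(body):
            raise ValueError("malformed pkt-line")
        yield body[pos + 4:pos + length].decode("utf-8").rstrip("\n")
        pos += length


def parse_show_ref(body: bytes) -> Tuple[Optional[str], Optional[Dict[str, str]]]:
    """Return (redirect URL or None, {name: id} or None if no "refs" line)."""
    redirect: Optional[str] = None
    refs: Optional[Dict[str, str]] = None
    for line in iter_pkt_lines(body):
        if line.startswith("redirect "):
            redirect = line[len("redirect "):]
        elif line == "refs":
            refs = {}
        elif refs is not None:
            obj_id, name = line.split(" ", 1)
            refs[name] = obj_id
        else:
            raise ValueError("unexpected pkt-line: %r" % line)
    return redirect, refs


def fetch_refs(base_url: str, post: Callable[[str], bytes],
               max_hops: int = 5) -> Tuple[str, Dict[str, str]]:
    """POST show-ref, following payload redirects; returns (base URL, refs).

    `post` is a caller-supplied function taking a URL and returning the
    response body; `max_hops` is a safety limit chosen for this sketch.
    """
    for _ in range(max_hops):
        redirect, refs = parse_show_ref(post(base_url.rstrip("/") + SERVICE_PATH))
        if redirect is not None:
            base_url = redirect  # use the new location from now on
        if refs is not None:
            return base_url, refs
        if redirect is None:
            raise ValueError("response had neither refs nor a redirect")
    raise RuntimeError("too many redirects")
```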

Service upload-pack
-------------------

Prepares an estimated minimal pack to transfer new objects to the
client.

URL: $url/backend.git-http/upload-pack
Content-Type: application/x-git; service=upload-pack

The computation to select the minimal pack proceeds as follows
(c = client, s = server):

 init step:
 (c) Use show-ref to obtain the advertised refs.
 (c) Place any object seen in show-ref into set ADVERTISED.

 (c) Build a set, WANT, of the objects from ADVERTISED the client
     wants to fetch, based on what it saw from show-ref.

 (c) Start a queue, C_PENDING, ordered by commit time (popping newest
     first).  Add all client refs.  When a commit is popped from the
     queue its parents should be automatically inserted back.  Commits
     should only enter the queue once.

 one compute step:
 (c) Send an upload-pack request:

	C: 0011capabilities
	C: 0024thin-pack include-tag ofs-delta
	C: 0009want
	C: 0xxx<WANT list>
	C: 000bcommon
	C: 0xxx<COMMON list>
	C: 0009have
	C: 0xxx<HAVE list>
	C: 0000

     The stream is organized into "sections", where each section is
     composed of two git pkt-lines.  The first pkt-line provides the
     name of the section ("capabilities", "want", "have", "common").
     The second pkt-line has the binary SHA-1 ids which compose that
     section.

     The "want" section is required.  The other sections ("have",
     "common") are optional.  A missing "want" section should be
     answered with an error.

     Sections must appear in the following order, if they appear
     at all in the request stream:

       * capabilities
       * want
       * common
       * have

     Each section may appear multiple times.  Client implementations
     are encouraged to use as few sections as possible; however, the
     limit of 64k per pkt-line limits the number of ids to 3,276 per
     section entry.

     The stream is terminated by a pkt-line flush ("0000").

     The HAVE list is created by popping the first 256 commits
     from C_PENDING.  Less can be supplied if C_PENDING empties.
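
To make the framing concrete, here is a small Python sketch of how a
client might serialize such a request.  It is an illustration under
assumptions: the helper names are hypothetical, text pkt-lines are
assumed to end in LF (which matches the lengths shown above), and the
id pkt-lines are assumed to carry raw 20 byte SHA-1s with no LF.  The
3,276 figure falls out of (65535 - 4) / 20, rounded down.

```python
from typing import Iterable, List

ID_LEN = 20                              # binary SHA-1
MAX_PKT = 0xFFFF                         # largest length 4 hex digits can hold
IDS_PER_PKT = (MAX_PKT - 4) // ID_LEN    # 3276 ids fit in one pkt-line


def pkt_line(payload: bytes) -> bytes:
    """Prefix payload with its 4 hex digit length (header included)."""
    length = len(payload) + 4
    if length > MAX_PKT:
        raise ValueError("payload too large for one pkt-line")
    return ("%04x" % length).encode("ascii") + payload


def section(name: str, ids: List[bytes]) -> bytes:
    """Emit name/ids pkt-line pairs, splitting ids across section entries."""
    out = b""
    for start in range(0, len(ids), IDS_PER_PKT):
        chunk = ids[start:start + IDS_PER_PKT]
        out += pkt_line(name.encode("ascii") + b"\n")
        out += pkt_line(b"".join(chunk))
    return out


def upload_pack_request(capabilities: Iterable[str], want: Iterable[bytes],
                        common: Iterable[bytes] = (),
                        have: Iterable[bytes] = ()) -> bytes:
    """Build the request body in the required section order."""
    want = list(want)
    if not want:
        raise ValueError('the "want" section is required')
    body = b""
    caps = " ".join(capabilities)
    if caps:
        body += pkt_line(b"capabilities\n") + pkt_line(caps.encode("ascii") + b"\n")
    body += section("want", want)
    body += section("common", list(common))   # empty lists emit nothing
    body += section("have", list(have))
    return body + b"0000"                      # flush terminates the stream
```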

  (s) Parse the upload-pack request:

      Verify all objects in WANT are reachable from refs.  As
      this may require walking backwards through history to
      the very beginning on invalid requests the server may
      use a reasonable limit of commits (e.g. 1000) walked
      beyond any ref tip before giving up.

      If no WANT objects are received, send an error:

	S: 0019status error no want

      If any WANT object is not reachable, send an error:

	S: 001estatus error invalid want

     Create an empty list, S_COMMON.

     If 'common' was sent:

     Load all objects into S_COMMON.

     If 'have' was sent:

     Loop through the objects in the order supplied by the client.
     For each object, if the server has the object reachable from
     a ref, add it to S_COMMON.  If a commit is added to S_COMMON,
     do not add any ancestors, even if they also appear in HAVE.
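
The S_COMMON computation can be sketched in Python as below.  This is
one reading of the rules above, not a definitive implementation: the
commit graph is modelled as a plain dict from commit id to parent ids,
`reachable` stands in for "reachable from a ref", and ancestors of the
entries loaded from 'common' are treated the same way as ancestors of
entries added from 'have'.  All names are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def ancestors(commit: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return every ancestor of commit (not including commit itself)."""
    seen: Set[str] = set()
    stack = list(parents.get(commit, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def compute_s_common(common: Iterable[str], have: Iterable[str],
                     parents: Dict[str, List[str]],
                     reachable: Set[str]) -> List[str]:
    """Build S_COMMON from the client's 'common' and 'have' sections."""
    s_common = list(common)            # 'common' is loaded as-is
    members = set(s_common)
    covered: Set[str] = set()          # ancestors of anything in S_COMMON
    for commit in s_common:
        covered |= ancestors(commit, parents)
    for obj in have:                   # keep the client's order
        if obj not in reachable:
            continue                   # server does not have it
        if obj in members or obj in covered:
            continue                   # already implied by a newer commit
        s_common.append(obj)
        members.add(obj)
        covered |= ancestors(obj, parents)
    return s_common
```

Because the client sends HAVE newest first, a shared tip is normally
seen before its ancestors, which is what keeps S_COMMON small.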

  (s) Send the upload-pack response:

     If the server has found a closed set of objects to pack, it
     replies with the pack and the enabled capabilities.  The set
     of enabled capabilities is limited to the intersection of
     what the client requested and what the server supports.

	S: 0010status pack
	S: 0011capabilities
	S: 0024thin-pack include-tag ofs-delta
	S: 000c.PACK...

     The returned stream is the side-band-64k protocol supported
     by the git-upload-pack service, and the pack is embedded into
     stream 1.  Progress messages from the server side may appear
     in stream 2.

     If the server wants more information, it replies with a
     status continue response:

	S: 0014status continue
	S: 000bcommon
	S: 0xxx<S_COMMON list>
	S: 0000

     The stream formatting rules are the same as the request.

     The section "common" details the contents of S_COMMON,
     that is all objects from HAVE that the server also has.

  (c) Parse the upload-pack response:

      If the status pkt-line is "status pack":

      Process the pack stream and update the local refs.

      If the status pkt-line is "status continue":

      Reset COMMON to the items in S_COMMON.  The new S_COMMON
      should be a superset of the existing COMMON set.

      Remove all items in S_COMMON, and all of their ancestors,
      from C_PENDING.

      Do another compute step.
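
The client-side bookkeeping for C_PENDING might look like the Python
sketch below.  It is illustrative only: the class and method names are
hypothetical, and the local history is modelled as two dicts (commit
id to commit time, commit id to parent ids) rather than a real object
database.

```python
import heapq
from typing import Dict, Iterable, List, Set, Tuple


class PendingQueue:
    """Commits ordered newest first; each commit enters at most once."""

    def __init__(self, commit_time: Dict[str, int],
                 parents: Dict[str, List[str]]) -> None:
        self._time = commit_time
        self._parents = parents
        self._heap: List[Tuple[int, str]] = []
        self._seen: Set[str] = set()

    def add(self, commit: str) -> None:
        if commit not in self._seen:
            self._seen.add(commit)
            # Negate the time so the newest commit pops first.
            heapq.heappush(self._heap, (-self._time[commit], commit))

    def pop_batch(self, limit: int = 256) -> List[str]:
        """Pop up to `limit` commits for HAVE, queueing parents as we go."""
        batch: List[str] = []
        while self._heap and len(batch) < limit:
            _, commit = heapq.heappop(self._heap)
            batch.append(commit)
            for parent in self._parents.get(commit, []):
                self.add(parent)
        return batch

    def drop_with_ancestors(self, commits: Iterable[str]) -> None:
        """Handle "status continue": forget S_COMMON and its ancestors."""
        doomed: Set[str] = set()
        stack = list(commits)
        while stack:
            current = stack.pop()
            if current not in doomed:
                doomed.add(current)
                stack.extend(self._parents.get(current, []))
        self._seen |= doomed  # make sure they are never queued again
        self._heap = [entry for entry in self._heap if entry[1] not in doomed]
        heapq.heapify(self._heap)
```

Each compute step would then call pop_batch() to build HAVE, and on
"status continue" call drop_with_ancestors() with the returned
S_COMMON before starting the next step.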


Service receive-pack
--------------------

Uploads a pack and updates refs.

URL: $url/backend.git-http/receive-pack
Content-Type: application/x-git; service=receive-pack

The start of the stream is the commands to update the refs and
the remainder of the stream is the pack file itself.  See
git-receive-pack and its network protocol in pack-protocol.txt,
as this is essentially the same.

	C: 0011capabilities
	C: 0005
	C: 006795dcfa3633004da0049d3d0fa03f80589cbcaf31 d049f6c27a2244e12041955e262a404c7faba355 refs/heads/maint
	C: 0000
	C: PACK...

	S: 0011capabilities
	S: 0005
	S: ...<output of receive-pack>...

The capabilities are handled exactly as in the fetch protocol,
however the server may reject a pack and its associated commands
if an invalid capability request is made by the client, or the
client has assumed a pack capability that the server does not
have support for.  In the latter case the server must still send
the capabilities key in the response so the client can correct
itself and try again.

-- 
Shawn.

* Re: Git-aware HTTP transport
  2008-08-28  3:50           ` Shawn O. Pearce
@ 2008-08-28  4:37             ` H. Peter Anvin
  2008-08-28  4:42               ` Shawn O. Pearce
  2008-08-28  6:40               ` Imran M Yousuf
  2008-08-28  4:42             ` Junio C Hamano
  1 sibling, 2 replies; 57+ messages in thread
From: H. Peter Anvin @ 2008-08-28  4:37 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git

Shawn O. Pearce wrote:
> 
> So this is what may be the final draft of the HTTP protocol.
> I've added stuff about capability selection between the peers for
> future expansion support.  The upload-pack service has a better
> use of it than receive-pack.  Otherwise it is what I think you are
> agreeing to above.  ;-)
> 

It looks good to me.  I *really* like the option of combining a redirect 
with a refs list in one reply; this will make things substantially 
easier to deploy on kernel.org, and saves a round trip to boot.

Just an implementation detail for the server, however: for an *empty* 
repository (one which has no refs at all), the server needs to *not* 
transmit the redirect, or there will be a loop :)  It is unnecessary, 
anyway, since there is inherently nothing to do.

	-hpa


* Re: Git-aware HTTP transport
  2008-08-28  4:37             ` H. Peter Anvin
@ 2008-08-28  4:42               ` Shawn O. Pearce
  2008-08-28  4:58                 ` H. Peter Anvin
  2008-08-28  6:40               ` Imran M Yousuf
  1 sibling, 1 reply; 57+ messages in thread
From: Shawn O. Pearce @ 2008-08-28  4:42 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: git

"H. Peter Anvin" <hpa@zytor.com> wrote:
> Shawn O. Pearce wrote:
>>
>> So this is what may be the final draft of the HTTP protocol.
>
> It looks good to me.  I *really* like the option of combining a redirect  
> with a refs list in one reply; this will make things substantially  
> easier do deploy on kernel.org, and saves a round trip to boot.

Yea, I had a draft that didn't combine these and I realized how
stupid that was.  So I allowed them to appear together if the
server operator wants to do that.

> Just an implementation detail for the server, however: for an *empty*  
> repository (one which has no refs at all), the server needs to *not*  
> transmit the redirect, or there will be a loop :)  It is unnecessary,  
> anyway, since there is inherently nothing to do.

Actually that's not true.  A correct client won't loop.

An empty repository is required to send a "refs" section header.
So the client will see the "refs" header and know that the complete
set of refs is following.  Only nothing follows, so it knows the
complete set is the empty set.

A redirect with no ref data won't have the "refs" section header.
So the client knows that it cannot conclude anything from that
exchange and must follow the redirect.

An empty repository sending a redirect will send both "redirect"
and "refs", but no refs follow the "refs" section header.  So the
client knows that it is empty and it does not need to follow the
redirect it received.

Now if the server is stupid and keeps sending a redirect with no
refs header, yea, the client can loop.  So the clients should have
a maximum recursion limit configured into them, just like a good
browser would, so you can't get stuck in an A->B->C->A loop.
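
A minimal Python sketch of that client logic, under stated assumptions:
fetch(url) is a hypothetical callable returning a dict that may contain
a "redirect" key (a URL) and/or a "refs" key (a possibly empty list),
and the limit of 5 is an arbitrary choice.

```python
def discover_refs(url, fetch, max_redirects=5):
    """Return the complete ref list, following redirects when needed."""
    for _ in range(max_redirects + 1):
        reply = fetch(url)
        if "refs" in reply:
            # The "refs" header means the list is complete, even if it
            # is empty, so there is no need to follow any redirect.
            return reply["refs"]
        if "redirect" not in reply:
            raise ValueError("reply has neither refs nor redirect")
        # No "refs" header: nothing can be concluded, follow the redirect.
        url = reply["redirect"]
    raise RuntimeError("too many redirects")
```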

-- 
Shawn.


* Re: Git-aware HTTP transport
  2008-08-28  3:50           ` Shawn O. Pearce
  2008-08-28  4:37             ` H. Peter Anvin
@ 2008-08-28  4:42             ` Junio C Hamano
  2008-08-28 14:57               ` Shawn O. Pearce
  2008-08-28 17:05               ` H. Peter Anvin
  1 sibling, 2 replies; 57+ messages in thread
From: Junio C Hamano @ 2008-08-28  4:42 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: H. Peter Anvin, git

"Shawn O. Pearce" <spearce@spearce.org> writes:

> HTTP Redirects
> --------------
>
> If a POST request results in an HTTP 302 or 303 redirect response
> clients should retry the request by updating the URL and POSTing
> the same request to the new location.  Subsequent requests should
> still be sent to the original URL.

At the first reading I was confused because this seemed to contradict
the server pinning that is done by the payload level redirect.

> Service upload-pack
> -------------------
>
> Prepares an estimated minimal pack to transfer new objects to the
> client.
>
> URL: $url/backend.git-http/upload-pack
> Content-Type: application/x-git; service=upload-pack
>
> The computation to select the minimal pack proceeds as follows
> (c = client, s = server):
>
>  init step:
>  (c) Use show-ref to obtain the advertised refs.
>  (c) Place any object seen in show-ref into set ADVERTISED.
>
>  (c) Build a set, WANT, of the objects from ADVERTISED the client
>      wants to fetch, based on what it saw from show-ref.
>
>  (c) Start a queue, C_PENDING, ordered by commit time (popping newest
>      first).  Add all client refs.  When a commit is popped from the
>      queue its parents should be automatically inserted back.  Commits
>      should only enter the queue once.
>
>  one compute step:
>  (c) Send an upload-pack request:
>
> 	C: 0011capabilities
> 	C: 0024thin-pack include-tag ofs-delta
> 	C: 0009want
> 	C: 0xxx<WANT list>
> 	C: 000bcommon
> 	C: 0xxx<COMMON list>
> 	C: 0009have
> 	C: 0xxx<HAVE list>
> 	C: 0000
>
>      The stream is organized into "sections", where each section is
>      composed of two git pkt-lines.  The first pkt-line provides the
>      name of the section ("capabilities", "want", "have", "common").
>      The second pkt-line has the binary SHA-1 ids which compose that
>      section.

It appears that you really meant "Binary", as opposed to the "Hexadecimal"
that the show-ref example illustrates, judging from the later 3,276 number.
I'd prefer hexadecimal here.

As a protocol specification, you'd eventually need to describe the
pkt-line format, namely, (1) four hexadecimal digits that represent the
length of the line (including those four bytes), followed by the line's
payload, which fills the rest of that length, or (2) "0000" which is
"flush".  Also typically the text based line payload is LF terminated
(hence the four-hexdigit length counts the terminating LF).  Also
"capabilities" needs to be defined.
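
The pkt-line framing just described can be sketched in a few lines of
Python.  The names pkt_line, FLUSH and read_pkt_lines are made up for
this illustration and are not taken from any Git source.

```python
FLUSH = b"0000"  # the special "flush" packet


def pkt_line(payload: bytes) -> bytes:
    """Frame a payload: 4 hex digits of total length, then the payload."""
    length = len(payload) + 4  # the length counts its own four bytes
    if length > 0xFFFF:
        raise ValueError("payload too long for one pkt-line")
    return b"%04x" % length + payload


def read_pkt_lines(data: bytes):
    """Yield each payload in turn; yield None for a flush packet."""
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 4].decode("ascii"), 16)
        if length == 0:
            yield None
            pos += 4
        elif length < 4:
            raise ValueError("invalid pkt-line length")
        else:
            yield data[pos + 4:pos + length]
            pos += length
```

With LF-terminated payloads this reproduces the section headers used in
the draft, for example pkt_line(b"want\n") gives b"0009want\n".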

>   (s) Parse the upload-pack request:
>
>       Verify all objects in WANT are reachable from refs.  As
>       this may require walking backwards through history to
>       the very beginning on invalid requests the server may
>       use a reasonable limit of commits (e.g. 1000) walked
>       beyond any ref tip before giving up.

I suspect moving as much work to the client side by erroring out and
having the client restart from show-ref might be a better tradeoff (also
this has been advertised as a security feature on the native protocol
side).

>       If any WANT object is not reachable, send an error:
>
> 	S: 001estatus error invalid want
>
>      Create an empty list, S_COMMON.
>
>      If 'common' was sent:
>
>      Load all objects into S_COMMON.

Security?  Error out if some of them do not exist on the server end, at
least.

>   (s) Send the upload-pack response:
>
>      If the server has found a closed set of objects to pack, it
>      replies with the pack and the enabled capabilities.  The set
>      of enabled capabilities is limited to the intersection of
>      what the client requested and what the server supports.

Define "closed set".

>      The stream formatting rules are the same as the request.
>
>      The section "common" details the contents of S_COMMON,
>      that is all objects from HAVE that the server also has.

An object in HAVE that exists on the server end can be a descendant of
many other HAVEs. Answering with that youngest one alone is enough,
without the other HAVEs the server end also has as its ancestors, as they
are redundant information.
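
As an illustration of that reduction, here is a small Python sketch.
The function reduce_to_tips and the parents map (commit id to parent
ids) are hypothetical; a real server would walk its own object store.

```python
def reduce_to_tips(commits, parents):
    """Drop every commit that is an ancestor of another one in the set."""
    commits = set(commits)
    redundant = set()
    for tip in commits:
        # Walk the strict ancestors of this commit.
        stack = list(parents.get(tip, []))
        seen = set()
        while stack:
            commit = stack.pop()
            if commit in seen:
                continue
            seen.add(commit)
            if commit in commits:
                redundant.add(commit)
            stack.extend(parents.get(commit, []))
    return commits - redundant
```

This walks history once per candidate, which is fine for a sketch but
would want a smarter traversal on a large repository.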

>   (c) Parse the upload-pack response:
>
>       If the status pkt-line is "status pack:"
>
>       Process the pack stream and update the local refs.
>
>       If the status pkt-line is "status continue":
>
>       Reset COMMON to the items in S_COMMON.  The new S_COMMON
>       should be a superset of the existing COMMON set.

Is there a way to detect bad clients that do not obey this rule without
server side state?


* Re: Git-aware HTTP transport
  2008-08-28  4:42               ` Shawn O. Pearce
@ 2008-08-28  4:58                 ` H. Peter Anvin
  0 siblings, 0 replies; 57+ messages in thread
From: H. Peter Anvin @ 2008-08-28  4:58 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git

Shawn O. Pearce wrote:
> 
>> Just an implementation detail for the server, however: for an *empty*  
>> repository (one which has no refs at all), the server needs to *not*  
>> transmit the redirect, or there will be a loop :)  It is unnecessary,  
>> anyway, since there is inherently nothing to do.
> 
> Actually that's not true.  A correct client won't loop.
> 
> An empty repository is required to send "refs" section header.
> So the client will see the "refs" header and know that the complete
> set of refs is following.  Only nothing follows, so it knows the
> complete set is the empty set.
> 
> A redirect with no ref data won't have the "refs" section header.
> So the client knows that it cannot conclude anything from that
> exchange and must follow the redirect.
> 

Ah, good point.

	-hpa


* Re: Git-aware HTTP transport
  2008-08-28  4:37             ` H. Peter Anvin
  2008-08-28  4:42               ` Shawn O. Pearce
@ 2008-08-28  6:40               ` Imran M Yousuf
  1 sibling, 0 replies; 57+ messages in thread
From: Imran M Yousuf @ 2008-08-28  6:40 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Shawn O. Pearce, git

On Thu, Aug 28, 2008 at 10:37 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> Shawn O. Pearce wrote:
>>
>> So this is what may be the final draft of the HTTP protocol.
>> I've added stuff about capability selection between the peers for
>> future expansion support.  The upload-pack service has a better
>> use of it than receive-pack.  Otherwise it is what I think you are
>> agreeing to above.  ;-)
>>
>
> It looks good to me.  I *really* like the option of combining a redirect
> with a refs list in one reply; this will make things substantially easier do
> deploy on kernel.org, and saves a round trip to boot.

I agree, this is a very cool feature of the protocol...

- Imran

>
> Just an implementation detail for the server, however: for an *empty*
> repository (one which has no refs at all), the server needs to *not*
> transmit the redirect, or there will be a loop :)  It is unnecessary,
> anyway, since there is inherently nothing to do.
>
>        -hpa



-- 
Imran M Yousuf
Entrepreneur & Software Engineer
Smart IT Engineering
Dhaka, Bangladesh
Email: imran@smartitengineering.com
Blog: http://imyousuf-tech.blogs.smartitengineering.com/
Mobile: +880-1711402557


* Re: Git-aware HTTP transport
  2008-08-28  4:42             ` Junio C Hamano
@ 2008-08-28 14:57               ` Shawn O. Pearce
  2008-08-28 17:26                 ` david
  2008-08-29  4:02                 ` Junio C Hamano
  2008-08-28 17:05               ` H. Peter Anvin
  1 sibling, 2 replies; 57+ messages in thread
From: Shawn O. Pearce @ 2008-08-28 14:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: H. Peter Anvin, git

Junio C Hamano <gitster@pobox.com> wrote:
> "Shawn O. Pearce" <spearce@spearce.org> writes:
> 
> > HTTP Redirects
> > --------------
> >
> > If a POST request results in an HTTP 302 or 303 redirect response
> > clients should retry the request by updating the URL and POSTing
> > the same request to the new location.  Subsequent requests should
> > still be sent to the original URL.
> 
> At the first reading I was confused because this seemed to contradict with
> the server pinning that is done by the payload level redirect.

This is meant to help load balancing initially direct a client to a
server.  I think it's also reasonable to honor a transport level redirect,
much as we honor whatever route IP gives us (not that we have a
lot of choice - or even want one at that level).
 
> > Service upload-pack
> > -------------------
> >  one compute step:
> >  (c) Send an upload-pack request:
> >
> > 	C: 0011capabilities
> > 	C: 0024thin-pack include-tag ofs-delta
> > 	C: 0009want
> > 	C: 0xxx<WANT list>
> > 	C: 000bcommon
> > 	C: 0xxx<COMMON list>
> > 	C: 0009have
> > 	C: 0xxx<HAVE list>
> > 	C: 0000
> >
> >      The stream is organized into "sections", where each section is
> >      composed of two git pkt-lines.  The first pkt-line provides the
> >      name of the section ("capabilities", "want", "have", "common").
> >      The second pkt-line has the binary SHA-1 ids which compose that
> >      section.
> 
> It appears that you really meant "Binary", as opposed to "Hexadecimal"
> that show-ref example illustrate, judging from the later 3,276 number.
> I'd prefer hexadecimal here.

Yes, I really did mean for this part of the protocol to be in binary.
We have to exchange a bunch of commits to figure out what is common.
The binary form is 1/2 the size of the hexadecimal form, resulting
in fewer TCP packets for the same request.

Reading/writing the SHA-1s in binary is usually faster than doing
it in hex; you don't have to go through the formatting routines.
So there's a few less CPU cycles on the server end.

But the rest of the protocol is in hex and ASCII, so I guess it
does make sense to make this part be in hex too. I can change it
in the next draft.
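
The size trade-off can be checked with a short Python sketch.  The
constants and the helper max_ids_per_pkt_line are illustrative only, and
the calculation assumes a pkt-line holding nothing but concatenated ids
with no trailing LF.

```python
SHA1_BINARY_LEN = 20   # bytes per SHA-1 in binary form
SHA1_HEX_LEN = 40      # bytes per SHA-1 in hexadecimal form
MAX_PKT_LEN = 0xFFFF   # largest length four hex digits can express
HEADER_LEN = 4         # the length header counts itself


def max_ids_per_pkt_line(id_len):
    """How many ids of id_len bytes fit into a single pkt-line."""
    return (MAX_PKT_LEN - HEADER_LEN) // id_len


hex_ids = [
    "95dcfa3633004da0049d3d0fa03f80589cbcaf31",
    "d049f6c27a2244e12041955e262a404c7faba355",
]
binary_ids = b"".join(bytes.fromhex(h) for h in hex_ids)
hex_bytes = "".join(hex_ids).encode("ascii")
```

Under these assumptions max_ids_per_pkt_line(SHA1_BINARY_LEN) is 3276,
which is presumably where the 3,276 figure comes from; the hexadecimal
form fits half as many, 1638.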
 
> As a protocol specification, you'd eventually need to describe the
> pkt-line format, namely, (1) four hexadecimal digits that represents the
> length of the line (including that four bytes), followed by that many
> number of bytes as the line's payload, or (2) "0000" which is "flush".
> Also typically the text based line payload is LF terminated (hence the
> four-hexdigit length counts the terminating LF).

Yes.  I'll add that into the next draft.

> Also "capabilities" need
> to be defined.

Well, currently it's just room for expansion.  But I'll try to define
it out better.  My initial thought is to do something like we have
in the native protocol where there are capability names "hidden"
on the end of the first pkt-line.  Only I'm making it explicit.
 
> >   (s) Parse the upload-pack request:
> >
> >       Verify all objects in WANT are reachable from refs.  As
> >       this may require walking backwards through history to
> >       the very beginning on invalid requests the server may
> >       use a reasonable limit of commits (e.g. 1000) walked
> >       beyond any ref tip before giving up.
> 
> I suspect moving as much work to the client side by erroring out and
> having the client restart from show-ref might be a better tradeoff (also
> this has been advertised as a security feature on the native protocol
> side).

I'm concerned about livelock.  If the client sees something in
show-ref, starts upload-pack and gets 2 round-trips into the common
exchange and then someone updates a ref the client wants, the client
has to go back to the beginning and start all over.

But if the object they want is still reachable (within a reasonable
distance) from the current refs, what is the harm in letting the
client see the stale view?  Especially since grabbing the most
current refs would still make that object available to the client?

Remember that is how the native protocol behaves.  You get a single
upload-pack process which has grabbed a snapshot of the refs.
If they change during the want-have-ack-nack exchange the client
doesn't get kicked out and asked to start all over again.  Same idea.
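
A rough Python sketch of such a bounded reachability check follows.
The names are hypothetical, parents maps a commit id to its parent ids,
and the limit is simplified to a total number of commits walked rather
than a per-ref distance.

```python
from collections import deque


def want_is_reachable(want, ref_tips, parents, limit=1000):
    """Walk back from the current ref tips, giving up after limit commits."""
    seen = set()
    queue = deque(ref_tips)
    walked = 0
    while queue and walked < limit:
        commit = queue.popleft()
        if commit in seen:
            continue
        seen.add(commit)
        walked += 1
        if commit == want:
            return True
        queue.extend(parents.get(commit, []))
    # Either history ran out or the limit was hit: treat as invalid want.
    return False
```

A want that was a ref tip a moment ago is usually only a few commits
behind the new tip, so it is found long before the limit applies.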
 
> >       If any WANT object is not reachable, send an error:
> >
> > 	S: 001estatus error invalid want
> >
> >      Create an empty list, S_COMMON.
> >
> >      If 'common' was sent:
> >
> >      Load all objects into S_COMMON.
> 
> Security?  Error out if some of them do not exist on the server end, at
> least.

I think I can add something saying it's a protocol error if that
happens.  It's not a security risk; remember that the S_COMMON set
eventually turns into the --not arguments of

  git rev-list --objects-edge $WANT --not $S_COMMON \
  | git pack-objects --stdout

Sending an S_COMMON the server doesn't have just causes it to fail.

If the server silently prunes out the ones it doesn't know, it's not a
concern.  If the server has the object, but it isn't advertised in a ref,
it's also not a security risk.  No data from those objects is sent
back to the client.
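
To illustrate why S_COMMON only ever shrinks what is sent, here is a
simplified Python model of that rev-list selection.  It handles commits
only (the real command also lists trees and blobs), and reachable,
commits_to_send and the parents map are hypothetical names.

```python
def reachable(tips, parents):
    """All commits reachable from the given tips, tips included."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


def commits_to_send(want, s_common, parents):
    """Commits reachable from WANT but not from anything in S_COMMON."""
    return reachable(want, parents) - reachable(s_common, parents)
```

Every entry in S_COMMON can only remove commits from the result, so
naming an unadvertised object there never causes its data to be sent.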
 
> >   (s) Send the upload-pack response:
> >
> >      If the server has found a closed set of objects to pack, it
> >      replies with the pack and the enabled capabilities.  The set
> >      of enabled capabilities is limited to the intersection of
> >      what the client requested and what the server supports.
> 
> Define "closed set".

Yea.  On top of that, I don't describe how the client can give up and
just ask for everything that is left, say on an initial clone.
 
> >      The stream formatting rules are the same as the request.
> >
> >      The section "common" details the contents of S_COMMON,
> >      that is all objects from HAVE that the server also has.
> 
> An object in HAVE that exists on the server end can be a descendant of
> many other HAVEs. Answering with that youngest one alone is enough,
> without the other HAVEs the server end also has as its ancestors, as they
> are redundant information.

Yes, obviously.  I must not have made that clear here.  I'll try
to improve the language.
 
> >   (c) Parse the upload-pack response:
> >
> >       If the status pkt-line is "status pack:"
> >
> >       Process the pack stream and update the local refs.
> >
> >       If the status pkt-line is "status continue":
> >
> >       Reset COMMON to the items in S_COMMON.  The new S_COMMON
> >       should be a superset of the existing COMMON set.
> 
> Is there a way to detect bad clients that does not obey this rule without
> server side states?

No.  Is that really a concern though?

The worst a bad client can do here is cause itself to receive
more data than it wants by refusing to put things into COMMON.
Eventually it gives up and just clones the entire repository.  How is
that any different from a well behaved client doing an initial clone?

A bad client could also stick random things into COMMON.  If the
server doesn't have the object we error out (as you suggest above)
during the next call.  So the client has only DoS'd the server.
It can DoS the server more easily in other ways.

A bad client could stick only part of S_COMMON into COMMON.
That may cause it to get a bigger pack file than it asked for as the
rev-list call won't be as limited.  How is that any different from
a well behaved client that is really behind and has a lot to fetch?

-- 
Shawn.


* Re: Git-aware HTTP transport
  2008-08-28  4:42             ` Junio C Hamano
  2008-08-28 14:57               ` Shawn O. Pearce
@ 2008-08-28 17:05               ` H. Peter Anvin
  2008-08-28 17:10                 ` Shawn O. Pearce
  1 sibling, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2008-08-28 17:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, git

Junio C Hamano wrote:

> 
> It appears that you really meant "Binary", as opposed to "Hexadecimal"
> that show-ref example illustrate, judging from the later 3,276 number.
> I'd prefer hexadecimal here.
> 

I *think* the "native" git protocol uses binary here.  It makes sense to 
be consistent, to allow them to share code?

	-hpa


* Re: Git-aware HTTP transport
  2008-08-28 17:05               ` H. Peter Anvin
@ 2008-08-28 17:10                 ` Shawn O. Pearce
  2008-08-28 17:20                   ` H. Peter Anvin
  0 siblings, 1 reply; 57+ messages in thread
From: Shawn O. Pearce @ 2008-08-28 17:10 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Junio C Hamano, git

"H. Peter Anvin" <hpa@zytor.com> wrote:
> Junio C Hamano wrote:
>
>>
>> It appears that you really meant "Binary", as opposed to "Hexadecimal"
>> that show-ref example illustrate, judging from the later 3,276 number.
>> I'd prefer hexadecimal here.
>>
>
> I *think* the "native" git protocol uses binary here.  It makes sense to  
> be consistent, to allow them to share code?

No, the native protocol is horribly verbose here:

	0032want ac3abe10ed54d512fbbaeb7cef19972eedd8e4a8
	0032want 404c3bbec34f5c65c5024c856eed4dbbfc27831e
	0032want 9bcc7aff6095549c1425aef6ca0034c47189705d
	0032have 471287a3c311e486206d3c6ff94faf3dfffc736c
	0032have 48f27055a4fa5f4da8234f44808f0b0c70629218
	0032have d4cc612f218b3dd3b831e3b976bf85165cd4f3d4
	...

so it's doing it in hex, and it's using 10 bytes of "framing" for
every SHA-1 it sends as each is sent in its own pkt-line with the
have/want header.
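
The 10 bytes of framing can be confirmed with a short Python sketch;
native_want_line is a name invented for this illustration.

```python
def native_want_line(sha_hex):
    """Build one native-protocol "want" pkt-line for a hex SHA-1."""
    payload = ("want %s\n" % sha_hex).encode("ascii")
    return b"%04x" % (len(payload) + 4) + payload


line = native_want_line("ac3abe10ed54d512fbbaeb7cef19972eedd8e4a8")

# Framing = 4 length digits + "want " (5 bytes) + trailing LF (1 byte).
overhead = len(line) - 40
```

Here line starts with b"0032" (50 bytes in total) and overhead is 10,
so each SHA-1 costs 50 bytes on the wire against 20 in binary form.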

-- 
Shawn.


* Re: Git-aware HTTP transport
  2008-08-28 17:10                 ` Shawn O. Pearce
@ 2008-08-28 17:20                   ` H. Peter Anvin
  2008-08-28 17:26                     ` Shawn O. Pearce
  0 siblings, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2008-08-28 17:20 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, git

Shawn O. Pearce wrote:
> "H. Peter Anvin" <hpa@zytor.com> wrote:
>> Junio C Hamano wrote:
>>
>>> It appears that you really meant "Binary", as opposed to "Hexadecimal"
>>> that show-ref example illustrate, judging from the later 3,276 number.
>>> I'd prefer hexadecimal here.
>>>
>> I *think* the "native" git protocol uses binary here.  It makes sense to  
>> be consistent, to allow them to share code?
> 
> No, the native protocol is horribly verbose here:
> 
> 	0032want ac3abe10ed54d512fbbaeb7cef19972eedd8e4a8
> 	0032want 404c3bbec34f5c65c5024c856eed4dbbfc27831e
> 	0032want 9bcc7aff6095549c1425aef6ca0034c47189705d
> 	0032have 471287a3c311e486206d3c6ff94faf3dfffc736c
> 	0032have 48f27055a4fa5f4da8234f44808f0b0c70629218
> 	0032have d4cc612f218b3dd3b831e3b976bf85165cd4f3d4
> 	...
> 
> so its doing it in hex, and its using 10 bytes of "framing" for
> every SHA-1 it sends as each is sent in its own pkt-line with the
> have/want header.
> 

Hm.  It's probably not enough data to worry significantly about.

	-hpa


* Re: Git-aware HTTP transport
  2008-08-28 14:57               ` Shawn O. Pearce
@ 2008-08-28 17:26                 ` david
  2008-08-28 17:28                   ` Shawn O. Pearce
  2008-08-28 17:43                   ` H. Peter Anvin
  2008-08-29  4:02                 ` Junio C Hamano
  1 sibling, 2 replies; 57+ messages in thread
From: david @ 2008-08-28 17:26 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, H. Peter Anvin, git

On Thu, 28 Aug 2008, Shawn O. Pearce wrote:

> Junio C Hamano <gitster@pobox.com> wrote:
>> "Shawn O. Pearce" <spearce@spearce.org> writes:
>>
>> It appears that you really meant "Binary", as opposed to "Hexadecimal"
>> that show-ref example illustrate, judging from the later 3,276 number.
>> I'd prefer hexadecimal here.
>
> Yes, I really did mean for this part of the protocol to be in binary.
> We have to exchange a bunch of commits to figure out what is common.
> The binary form is 1/2 the size of the hexadecimal form, resulting
> in fewer TCP packets for the same request.

except that HTTP cannot transport binary data, if you feed it binary data 
it then encodes it into 7-bit safe forms for transport.

while it's true that it can pack it more efficiently than hex, it's not 
double the density.

> Reading/writing the SHA-1s in binary is usually faster than doing
> it in hex; you don't have to go through the formatting routines.
> So there's a few less CPU cycles on the server end.

except you then need to go through the formatting routines to send it via 
HTTP.

David Lang


* Re: Git-aware HTTP transport
  2008-08-28 17:20                   ` H. Peter Anvin
@ 2008-08-28 17:26                     ` Shawn O. Pearce
  2008-08-28 17:44                       ` H. Peter Anvin
  2008-08-28 18:40                       ` Nicolas Pitre
  0 siblings, 2 replies; 57+ messages in thread
From: Shawn O. Pearce @ 2008-08-28 17:26 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Junio C Hamano, git

"H. Peter Anvin" <hpa@zytor.com> wrote:
> Shawn O. Pearce wrote:
>> "H. Peter Anvin" <hpa@zytor.com> wrote:
>>>
>>> I *think* the "native" git protocol uses binary here.  It makes sense 
>>> to  be consistent, to allow them to share code?
>>
>> No, the native protocol is horribly verbose here:
>>
>> 	0032want ac3abe10ed54d512fbbaeb7cef19972eedd8e4a8
>> 	...
>>
>> so its doing it in hex, and its using 10 bytes of "framing" for
>> every SHA-1 it sends as each is sent in its own pkt-line with the
>> have/want header.
>
> Hm.  It's probably not enough data to worry significantly about.

Should I change the HTTP protocol then to use the same format,
so they have a better chance at sharing code between them?

-- 
Shawn.


* Re: Git-aware HTTP transport
  2008-08-28 17:26                 ` david
@ 2008-08-28 17:28                   ` Shawn O. Pearce
  2008-08-28 17:37                     ` david
  2008-08-28 17:43                   ` H. Peter Anvin
  1 sibling, 1 reply; 57+ messages in thread
From: Shawn O. Pearce @ 2008-08-28 17:28 UTC (permalink / raw)
  To: david; +Cc: Junio C Hamano, H. Peter Anvin, git

david@lang.hm wrote:
> On Thu, 28 Aug 2008, Shawn O. Pearce wrote:
>>> "Shawn O. Pearce" <spearce@spearce.org> writes:
>>
>> Yes, I really did mean for this part of the protocol to be in binary.
>
> except that HTTP cannot transport binary data, if you feed it binary data 
> it then encodes it into 7-bit safe forms for transport.

So then how does it transport a GIF file to my browser?  uuencoded?
Last time I read the RFCs I was pretty certain HTTP is 8-bit clean
in both directions.

Of course this may all be moot.  I think we're moving in a direction
of matching the git native protocol more exactly.

-- 
Shawn.


* Re: Git-aware HTTP transport
  2008-08-28 17:28                   ` Shawn O. Pearce
@ 2008-08-28 17:37                     ` david
  2008-08-28 17:38                       ` Daniel Stenberg
                                         ` (2 more replies)
  0 siblings, 3 replies; 57+ messages in thread
From: david @ 2008-08-28 17:37 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, H. Peter Anvin, git

On Thu, 28 Aug 2008, Shawn O. Pearce wrote:

> david@lang.hm wrote:
>> On Thu, 28 Aug 2008, Shawn O. Pearce wrote:
>>>> "Shawn O. Pearce" <spearce@spearce.org> writes:
>>>
>>> Yes, I really did mean for this part of the protocol to be in binary.
>>
>> except that HTTP cannot transport binary data, if you feed it binary data
>> it then encodes it into 7-bit safe forms for transport.
>
> So then how does it transport a GIF file to my browser?  uuencoded?

something like that. it uses the mimetype mechanisms to identify the 
various pieces and encodes each piece (if nothing else it needs to make 
sure that the mimetype separators don't appear in the data) uuencode is 
one of the available mechanisms.

> Last time I read the RFCs I was pretty certain HTTP is 8-bit clean
> in both directions.

I could be wrong, but I'm pretty sure I'm not. to test this yourself find 
a webserver with an image file and retrieve it via telnet (telnet hostname 
80<enter>GET /path/to/file HTTP/1.0<enter><enter>) and what will come back 
will be text.

> Of course this may all be moot.  I think we're moving in a direction
> of matching the git native protocol more exactly.

true, but it's never a waste of time to learn something (whichever one of 
us is right :-)

David Lang


* Re: Git-aware HTTP transport
  2008-08-28 17:37                     ` david
@ 2008-08-28 17:38                       ` Daniel Stenberg
  2008-08-28 17:43                       ` Shawn O. Pearce
  2008-08-28 18:04                       ` Mike Hommey
  2 siblings, 0 replies; 57+ messages in thread
From: Daniel Stenberg @ 2008-08-28 17:38 UTC (permalink / raw)
  To: david; +Cc: git

On Thu, 28 Aug 2008, david@lang.hm wrote:

>>> except that HTTP cannot transport binary data, if you feed it binary data
>>> it then encodes it into 7-bit safe forms for transport.
>> 
>> So then how does it transport a GIF file to my browser?  uuencoded?
>
> something like that. it uses the mimetype mechanisms to identify the various 
> pieces and encodes each piece (if nothing else it needs to make sure that 
> the mimetype seperators don't appear in the data) uuencode is one of the 
> available mechanisms.

No. HTTP is 8bit clean and sends and receives binary just fine. You seem to 
be thinking of SMTP or something.

-- 

  / daniel.haxx.se


* Re: Git-aware HTTP transport
  2008-08-28 17:26                 ` david
  2008-08-28 17:28                   ` Shawn O. Pearce
@ 2008-08-28 17:43                   ` H. Peter Anvin
  2008-08-28 18:12                     ` david
  1 sibling, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2008-08-28 17:43 UTC (permalink / raw)
  To: david; +Cc: Shawn O. Pearce, Junio C Hamano, git

david@lang.hm wrote:
> except that HTTP cannot transport binary data, if you feed it binary 
> data it then encodes it into 7-bit safe forms for transport.

Total utter bunk.  You're thinking of email and news, which had to deal 
with broken legacy code.

HTTP has *always* been binary clean.  It does not encode anything into 
7-bit safe anything.  The only "encoding" that it ever does is 
HTTP/1.1's chunked encoding, which is a way to deal with the fact that 
it might not always know the total length of the data before it starts 
the transfer; it sends the data in arbitrary-sized "chunks" prefixed by 
a byte count.  It does this to support connection caching in HTTP/1.1; 
HTTP/1.0 would simply close the connection to indicate end of (binary) data.
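
For illustration, the chunk framing looks like this in a minimal Python
sketch (the generator name chunked is made up; trailers and chunk
extensions are left out):

```python
def chunked(pieces):
    """Encode byte strings using HTTP/1.1 chunked transfer encoding."""
    for piece in pieces:
        if piece:  # a zero-length chunk would signal end of body
            # Hex byte count, CRLF, the raw bytes untouched, CRLF.
            yield b"%x\r\n" % len(piece) + piece + b"\r\n"
    # A zero-sized chunk followed by an empty line ends the body.
    yield b"0\r\n\r\n"


body = b"".join(chunked([b"PACK", b"\x00\xff"]))
```

The data bytes pass through unchanged, including 0x00 and 0xff; only the
byte counts and CRLF separators are added around them.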

	-hpa


* Re: Git-aware HTTP transport
  2008-08-28 17:37                     ` david
  2008-08-28 17:38                       ` Daniel Stenberg
@ 2008-08-28 17:43                       ` Shawn O. Pearce
  2008-08-28 17:47                         ` H. Peter Anvin
  2008-08-28 18:04                       ` Mike Hommey
  2 siblings, 1 reply; 57+ messages in thread
From: Shawn O. Pearce @ 2008-08-28 17:43 UTC (permalink / raw)
  To: david; +Cc: Junio C Hamano, H. Peter Anvin, git

david@lang.hm wrote:
> On Thu, 28 Aug 2008, Shawn O. Pearce wrote:
>> david@lang.hm wrote:
>>> On Thu, 28 Aug 2008, Shawn O. Pearce wrote:
>>>>> "Shawn O. Pearce" <spearce@spearce.org> writes:
>>>>
>>>> Yes, I really did mean for this part of the protocol to be in binary.
>>>
>>> except that HTTP cannot transport binary data, if you feed it binary data
>>> it then encodes it into 7-bit safe forms for transport.
>>
>> So then how does it transport a GIF file to my browser?  uuencoded?
...
> I could be wrong, but I'm pretty sure I'm not. to test this yourself find 
> a webserver with an image file and retrieve it via telnet (telnet 
> hostname 80<enter>GET /path/to/file HTTP/1.0<enter><enter>) and what will 
> come back will be text.

  $ telnet www.google.com 80
  Trying 74.125.19.104...
  Connected to www.google.com (74.125.19.104).
  Escape character is '^]'.
  GET /intl/en_ALL/images/logo.gif HTTP/1.0
  
  HTTP/1.0 200 OK
  Content-Type: image/gif
  Last-Modified: Wed, 07 Jun 2006 19:38:24 GMT
  Expires: Sun, 17 Jan 2038 19:14:07 GMT
  Cache-Control: public
  Date: Thu, 28 Aug 2008 17:40:44 GMT
  Server: gws
  Content-Length: 8558
  X-Google-Backends: /bns/pq/borg/pq/bns/gws-prod/staticweb.staticfrontend.gws/16:9836,dauf30:80
  X-Google-Service: static
  X-Google-GFE-Request-Trace: dauf30:80,/bns/pq/borg/pq/bns/gws-prod/staticweb.staticfrontend.gws/16:9836,dauf30:80
  Connection: Close
  
  GIF89a[... raw binary image data, not printable ...]

Very funny.  It trashed my tty.  Even reset won't restore the
settings.  Anyway.

I chose the Google logo on the Google homepage because I know we
try really hard to conform to standards, so we can have the biggest
possible user base.  Micro$oft or Yahoo! probably would have come
out the same way.  Or some image on kernel.org.

Anyway, I didn't send any browser data, so the server had to assume
the dumbest f'ing browser on the planet, and I got back binary data.

-- 
Shawn.


* Re: Git-aware HTTP transport
  2008-08-28 17:26                     ` Shawn O. Pearce
@ 2008-08-28 17:44                       ` H. Peter Anvin
  2008-08-28 17:46                         ` Shawn O. Pearce
  2008-08-28 18:40                       ` Nicolas Pitre
  1 sibling, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2008-08-28 17:44 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, git

Shawn O. Pearce wrote:
> 
> Should I change the HTTP protocol then to use the same format,
> so they have a better chance at sharing code between them?
> 

I leave that up to you and Junio.  My feeling would be that it's not worth 
optimizing the HTTP protocol separately.

	-hpa


* Re: Git-aware HTTP transport
  2008-08-28 17:44                       ` H. Peter Anvin
@ 2008-08-28 17:46                         ` Shawn O. Pearce
  0 siblings, 0 replies; 57+ messages in thread
From: Shawn O. Pearce @ 2008-08-28 17:46 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Junio C Hamano, git

"H. Peter Anvin" <hpa@zytor.com> wrote:
> Shawn O. Pearce wrote:
>>
>> Should I change the HTTP protocol then to use the same format,
>> so they have a better chance at sharing code between them?
>>
>
> I leave that up to you and Junio.  My feel would that it's not worth  
> optimizing the HTTP protocol separately.

Yea, I'm leaning towards just keeping them the same.  I may be
able to reuse a lot of code in JGit that way.  In C Git it's going
to take some refactoring to disentangle the IO parts of fetch-pack
from the protocol, but I should be able to reuse a lot there too.

-- 
Shawn.


* Re: Git-aware HTTP transport
  2008-08-28 17:43                       ` Shawn O. Pearce
@ 2008-08-28 17:47                         ` H. Peter Anvin
  0 siblings, 0 replies; 57+ messages in thread
From: H. Peter Anvin @ 2008-08-28 17:47 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: david, Junio C Hamano, git

Shawn O. Pearce wrote:
>   GET /intl/en_ALL/images/logo.gif HTTP/1.0

> Anyway, I didn't send any browser data, so the server had to assume
> the dumbest f'ing browser on the planet, and I got back binary data.

Slight nitpick: you did (HTTP/1.0).  If you really want to show the 
bottom-of-the-barrel behaviour, drop that off, which gets you HTTP 0.x 
behaviour -- not exactly commonly encountered today :)  HTTP 0.x had no 
provisions for a request header, so you should not need to send a blank 
line after the GET.

	-hpa


* Re: Git-aware HTTP transport
  2008-08-28 17:37                     ` david
  2008-08-28 17:38                       ` Daniel Stenberg
  2008-08-28 17:43                       ` Shawn O. Pearce
@ 2008-08-28 18:04                       ` Mike Hommey
  2 siblings, 0 replies; 57+ messages in thread
From: Mike Hommey @ 2008-08-28 18:04 UTC (permalink / raw)
  To: david; +Cc: Shawn O. Pearce, Junio C Hamano, H. Peter Anvin, git

On Thu, Aug 28, 2008 at 10:37:04AM -0700, david@lang.hm wrote:
> On Thu, 28 Aug 2008, Shawn O. Pearce wrote:
>
>> david@lang.hm wrote:
>>> On Thu, 28 Aug 2008, Shawn O. Pearce wrote:
>>>>> "Shawn O. Pearce" <spearce@spearce.org> writes:
>>>>
>>>> Yes, I really did mean for this part of the protocol to be in binary.
>>>
>>> except that HTTP cannot transport binary data, if you feed it binary data
>>> it then encodes it into 7-bit safe forms for transport.
>>
>> So then how does it transport a GIF file to my browser?  uuencoded?
>
> something like that. it uses the mimetype mechanisms to identify the  
> various pieces and encodes each piece (if nothing else it needs to make  
> sure that the mimetype seperators don't appear in the data) uuencode is  
> one of the available mechanisms.
>
>> Last time I read the RFCs I was pretty certain HTTP is 8-bit clean
>> in both directions.
>
> I could be wrong, but I'm pretty sure I'm not. to test this yourself find 
> a webserver with an image file and retrieve it via telnet (telnet 
> hostname 80<enter>GET /path/to/file HTTP/1.0<enter><enter>) and what will 
> come back will be text.

No it won't. Try it *yourself*.

$ nc www.google.com 80 | sed '1,/^\r$/d' > /tmp/logo.gif
GET /intl/en_ALL/images/logo.gif HTTP/1.1
Host: www.google.com
Connection: close

$ file /tmp/logo.gif 
/tmp/logo.gif: GIF image data, version 89a, 276 x 110

Mike

PS: sed only removes the HTTP response headers.


* Re: Git-aware HTTP transport
  2008-08-28 17:43                   ` H. Peter Anvin
@ 2008-08-28 18:12                     ` david
  2008-08-28 18:14                       ` H. Peter Anvin
  0 siblings, 1 reply; 57+ messages in thread
From: david @ 2008-08-28 18:12 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Shawn O. Pearce, Junio C Hamano, git

On Thu, 28 Aug 2008, H. Peter Anvin wrote:

> david@lang.hm wrote:
>> except that HTTP cannot transport binary data, if you feed it binary data 
>> it then encodes it into 7-bit safe forms for transport.
>
> Total utter bunk.  You're thinking of email and news, which had to deal with 
> broken legacy code.
>
> HTTP has *always* been binary clean.  It does not encode anything into 7-bit 
> safe anything.  The only "encoding" that it ever does is HTTP/1.1's chunked 
> encoding, which is a way to deal with the fact that it might not always know 
> the total length of the data before it starts the transfer; it sends the data 
> in arbitrary-sized "chunks" prefixed by a byte count.  It does this to 
> support connection caching in HTTP/1.1; HTTP/1.0 would simply close the 
> connection to indicate end of (binary) data.

Ok, I was wrong, thanks to everyone for correcting me. I now know this.

David Lang


* Re: Git-aware HTTP transport
  2008-08-28 18:12                     ` david
@ 2008-08-28 18:14                       ` H. Peter Anvin
  2008-08-28 18:18                         ` david
  0 siblings, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2008-08-28 18:14 UTC (permalink / raw)
  To: david; +Cc: Shawn O. Pearce, Junio C Hamano, git

david@lang.hm wrote:
> 
> Ok, I was wrong, thanks to everyone for correcting me. I now know this.
> 

It's an easy mistake to make given HTTP's apparent connections with 
MIME.  However, all it uses from MIME is the type tags, it doesn't use 
MIME encapsulation format at all.

	-hpa


* Re: Git-aware HTTP transport
  2008-08-28 18:14                       ` H. Peter Anvin
@ 2008-08-28 18:18                         ` david
  0 siblings, 0 replies; 57+ messages in thread
From: david @ 2008-08-28 18:18 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Shawn O. Pearce, Junio C Hamano, git

On Thu, 28 Aug 2008, H. Peter Anvin wrote:

> david@lang.hm wrote:
>> 
>> Ok, I was wrong, thanks to everyone for correcting me. I now know this.
>> 
>
> It's an easy mistake to make given HTTP's apparent connections with MIME. 
> However, all it uses from MIME is the type tags, it doesn't use MIME 
> encapsulation format at all.

as I said earlier, it's never a waste of time to learn things, no matter 
who is right. ;-)

David Lang


* Re: Git-aware HTTP transport
  2008-08-28 17:26                     ` Shawn O. Pearce
  2008-08-28 17:44                       ` H. Peter Anvin
@ 2008-08-28 18:40                       ` Nicolas Pitre
  2008-08-28 18:47                         ` H. Peter Anvin
  1 sibling, 1 reply; 57+ messages in thread
From: Nicolas Pitre @ 2008-08-28 18:40 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: H. Peter Anvin, Junio C Hamano, git

On Thu, 28 Aug 2008, Shawn O. Pearce wrote:

> "H. Peter Anvin" <hpa@zytor.com> wrote:
> > Shawn O. Pearce wrote:
> >> "H. Peter Anvin" <hpa@zytor.com> wrote:
> >>>
> >>> I *think* the "native" git protocol uses binary here.  It makes sense 
> >>> to  be consistent, to allow them to share code?
> >>
> >> No, the native protocol is horribly verbose here:
> >>
> >> 	0032want ac3abe10ed54d512fbbaeb7cef19972eedd8e4a8
> >> 	...
> >>
> >> so its doing it in hex, and its using 10 bytes of "framing" for
> >> every SHA-1 it sends as each is sent in its own pkt-line with the
> >> have/want header.
> >
> > Hm.  It's probably not enough data to worry significantly about.
> 
> Should I change the HTTP protocol then to use the same format,
> so they have a better chance at sharing code between them?

Given that the ref exchange happens on multiple lines (one ref per line) 
in the native protocol, and that your proposal is using one line for 
multiple refs, I don't see this as a big factor wrt code reuse.  Since 
you'll have separate "output" code anyway, why not simply go with 
refs in straight binary for the HTTP protocol?  Even the debuggability of 
refs exchange in plain text is dubious especially with all refs on the 
same line (that'll be a pain to split refs out of a long stream of hex 
by hand).


Nicolas


* Re: Git-aware HTTP transport
  2008-08-28 18:40                       ` Nicolas Pitre
@ 2008-08-28 18:47                         ` H. Peter Anvin
  0 siblings, 0 replies; 57+ messages in thread
From: H. Peter Anvin @ 2008-08-28 18:47 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Shawn O. Pearce, Junio C Hamano, git

Nicolas Pitre wrote:
>> Should I change the HTTP protocol then to use the same format,
>> so they have a better chance at sharing code between them?
> 
> Given that the ref exchange happens on multiple lines (one ref per line) 
> in the native protocol, and that your proposal is using one line for 
> multiple refs, I don't see this as a big factor wrt code reuse.  Since 
> you'll have separate "output" code anyway, why not simply going with 
> refs in straight binary for the HTTP protocol?  Even the debugability of 
> refs exchange in plain text is dubious especially with all refs on the 
> same line (that'll be a pain to split refs out of a long stream of hex 
> by hand).

Well, I think the real question was to go to multiple lines, 
native-protocol style, or go to binary.

	-hpa


* Re: Git-aware HTTP transport
  2008-08-28 14:57               ` Shawn O. Pearce
  2008-08-28 17:26                 ` david
@ 2008-08-29  4:02                 ` Junio C Hamano
  2008-08-29  5:11                   ` H. Peter Anvin
  1 sibling, 1 reply; 57+ messages in thread
From: Junio C Hamano @ 2008-08-29  4:02 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: H. Peter Anvin, git

"Shawn O. Pearce" <spearce@spearce.org> writes:

> Junio C Hamano <gitster@pobox.com> wrote:
>> "Shawn O. Pearce" <spearce@spearce.org> writes:
>> 
>> > HTTP Redirects
>> > --------------
>> >
>> > If a POST request results in an HTTP 302 or 303 redirect response
>> > clients should retry the request by updating the URL and POSTing
>> > the same request to the new location.  Subsequent requests should
>> > still be sent to the original URL.
>> 
>> At the first reading I was confused because this seemed to contradict with
>> the server pinning that is done by the payload level redirect.
>
> This is meant to help load balancing initially target to a server.
> I think its also reasonable to honor a transport level redirect,
> much as we honor whatever route IP gives us (not that we have a
> lot of choice - or even want one at that level).

Yeah, I think I understood what you are trying to achieve here after reading
the document twice.

I was just pointing out that the language (or presentation order) was
confusing to me and I needed to read these two sections twice to see the
difference between the two redirects.

>> Is there a way to detect bad clients that does not obey this rule without
>> server side states?
>
> No.  Is that really a concern though?

I was more concerned about a bad/broken client not giving up forever, and
not giving the server enough cue to give up, saying "I've conversed with
this guy long enough but haven't reached the conclusion yet --- there is
something wrong".  Even without server side states, if we were to trust
clients, we can add "this is Nth round" to the protocol to help the server
detect "long enough" part, but that somehow does not feel right.


* Re: Git-aware HTTP transport
  2008-08-29  4:02                 ` Junio C Hamano
@ 2008-08-29  5:11                   ` H. Peter Anvin
  2008-08-29  6:50                     ` Junio C Hamano
  0 siblings, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2008-08-29  5:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, git

Junio C Hamano wrote:
> 
>>> Is there a way to detect bad clients that does not obey this rule without
>>> server side states?
>> No.  Is that really a concern though?
> 
> I was more concerned about a bad/broken client not giving up forever, and
> not giving the server enough cue to give up, saying "I've conversed with
> this guy long enough but haven't reached the conclusion yet --- there is
> something wrong".  Even without server side states, if we were to trust
> clients, we can add "this is Nth round" to the protocol to help the server
> detect "long enough" part, but that somehow does not feel right.
> 

We should be able to detect either inconsistency, or lack of forward 
progress, but as long as there is forward progress made there doesn't 
seem to be a strong need to terminate.

	-hpa


* Re: Git-aware HTTP transport
  2008-08-29  5:11                   ` H. Peter Anvin
@ 2008-08-29  6:50                     ` Junio C Hamano
  2008-08-29 17:39                       ` Shawn O. Pearce
  0 siblings, 1 reply; 57+ messages in thread
From: Junio C Hamano @ 2008-08-29  6:50 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Shawn O. Pearce, git

"H. Peter Anvin" <hpa@zytor.com> writes:

> We should be able to detect either inconsistency, or lack of forward
> progress, but as long as there is forward progress made there doesn't
> seem to be a strong need to terminate.

Yeah, what I wanted to say was that it would be tricky to detect (any/lack
of) forward progress without having server side state and without trusting
the client.


* Re: Git-aware HTTP transport
  2008-08-29  6:50                     ` Junio C Hamano
@ 2008-08-29 17:39                       ` Shawn O. Pearce
  2008-08-29 19:55                         ` Nicolas Pitre
  2008-09-01 16:05                         ` Tarmigan
  0 siblings, 2 replies; 57+ messages in thread
From: Shawn O. Pearce @ 2008-08-29 17:39 UTC (permalink / raw)
  To: Junio C Hamano, H. Peter Anvin; +Cc: git

Yet another draft follows.  I believe that I have covered all
comments with this draft.  But I welcome any additional ones,
as thus far it has been a very constructive process.

The updated protocol looks more like the current native protocol
does.  This should make it easier to reuse code between the two
protocol implementations.

--8<--
Smart HTTP transfer protocols
=============================

Git supports two HTTP based transfer protocols.  A "dumb" protocol
which requires only a standard HTTP server on the server end of the
connection, and a "smart" protocol which requires a Git aware CGI
(or server module).  This document describes the "smart" protocol.

As a design feature smart clients can automatically translate and
upgrade "dumb" protocol URLs.  This permits all users to have the
same published URL, with the peers automatically choosing to use
the most efficient transport available to them.

HTTP Transport
--------------

All requests are encoded as HTTP POST requests to the smart service
URL, "$url/backend.git-http/$service".

All responses are encoded as 200 OK responses, even if the server
side has "failed" the request.  Service specific success/failure
codes are embedded in the content.

Authentication
--------------

Standard HTTP authentication is used if authentication is required
to access a repository, and must be configured and enforced by the
HTTP server software itself.

Stateless
---------

The protocol, much like its underlying HTTP, is stateless, from the
perspective of the HTTP server side.  All state must be retained and
managed by the client.  This permits round-robin load-balancing on
the server side, among many other implementation details.

Content Type
------------

All requests/responses use "application/x-git" as the content type.
Action specific subtypes are specified by the parameter "service",
e.g. "application/x-git; service=upload-pack".

HTTP Redirects
--------------

If a POST request results in an HTTP 302 or 303 redirect response
clients should retry the request by updating the URL and POSTing
the same request to the new location.  Subsequent requests should
still be sent to the original URL.

This redirect behavior is unrelated to the in-payload redirect
that is described below in "Service show-ref".

Detecting Smart Servers
-----------------------

HTTP clients can detect a smart Git-aware server by sending
a request to service "show-ref".

A Git-aware server will respond with a valid response.  Clients
must check the following properties to prevent being fooled by
misconfigured servers:

  * HTTP status code is 200.
  * Content-Type is "application/x-git; service=show-ref"
  * The body can be parsed without errors.  The length of
    each pkt-line must be 4 valid hex digits.

A dumb server will respond with a non-200 HTTP status code.
A misconfigured server may respond with a normal 200 status
code, but an incorrect content type, or an invalid leading
4 byte sequence for a pkt-line (e.g. "<htm" or "<!DO" are
not valid lengths).
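
The three checks above could be sketched roughly as follows.  This is
only an illustration, not part of the draft: the function names
`body_parses` and `is_smart_response` are made up here, and the caller
is assumed to have already fetched the status code, Content-Type header
and body by whatever HTTP client it uses.

```python
HEX_DIGITS = b"0123456789abcdefABCDEF"


def body_parses(body: bytes) -> bool:
    """Return True if body is a clean sequence of pkt-lines."""
    pos = 0
    while pos < len(body):
        header = body[pos:pos + 4]
        # "<htm" or "<!DO" fail here: they are not 4 hex digits.
        if len(header) != 4 or any(c not in HEX_DIGITS for c in header):
            return False
        total = int(header, 16)
        if total == 0:  # flush pkt-line, no payload
            pos += 4
            continue
        if total < 4 or pos + total > len(body):
            return False
        pos += total
    return True


def is_smart_response(status: int, content_type: str, body: bytes) -> bool:
    """Apply the status, content type and body checks to a show-ref reply."""
    if status != 200:
        return False
    parts = [p.strip() for p in content_type.split(";")]
    if parts[0].lower() != "application/x-git":
        return False
    if "service=show-ref" not in parts[1:]:
        return False
    return len(body) >= 4 and body_parses(body)
```

A client would fall back to the "dumb" protocol whenever
`is_smart_response` returns False.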

pkt-line Format
---------------

Much of the payload is described around pkt-lines.

A pkt-line is a variable length binary string.  The first four bytes
of the line indicate the total length of the line, in hexadecimal.
The total length includes the 4 bytes used to denote the length.  A
line is usually terminated by an LF, which must be included in the
total length if present.

Binary data is permitted within a pkt-line so implementors should
ensure their pkt-line parsing/formatting routines are 8-bit clean.
The maximum length of a pkt-line's data is 65532 bytes (65536 - 4).

Examples (as C-style strings):

  pkt-line          actual value
  ---------------------------------
  "0006a\n"         "a\n"
  "0005a"           "a"
  "000bfoobar\n"    "foobar\n"
  "0004"            ""

A pkt-line with a length of 0 ("0000") is a special case and is
treated as a break or terminator in the payload.
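
A minimal sketch of the framing, for illustration only.  The names
`pkt_line`, `read_pkt_lines` and `FLUSH` are invented for this example,
and the upper bound used is simply the largest value four hex digits
can express.

```python
HEX_DIGITS = b"0123456789abcdefABCDEF"
FLUSH = b"0000"  # length 0: break/terminator, carries no data


def pkt_line(data: bytes) -> bytes:
    """Frame data as one pkt-line: 4 hex digits of total length, then data."""
    total = len(data) + 4  # the length counts its own 4 bytes
    if total > 0xFFFF:
        raise ValueError("too long to express in 4 hex digits")
    return b"%04x" % total + data


def read_pkt_lines(buf: bytes):
    """Yield each payload found in buf; a flush is yielded as None."""
    pos = 0
    while pos < len(buf):
        header = buf[pos:pos + 4]
        if len(header) != 4 or any(c not in HEX_DIGITS for c in header):
            raise ValueError("invalid pkt-line length: %r" % header)
        total = int(header, 16)
        if total == 0:
            yield None
            pos += 4
            continue
        if total < 4 or pos + total > len(buf):
            raise ValueError("pkt-line length out of range")
        # Payload is returned untouched, so binary data survives.
        yield buf[pos + 4:pos + total]
        pos += total
```

Both routines work on bytes rather than text, which keeps them 8-bit
clean as required above.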

Service show-ref
----------------

Obtains the available refs from the remote repository.

URL: $url/backend.git-http/show-ref
Content-Type: application/x-git; service=show-ref

The request is an empty body.

The response is a pkt-line with "refs", followed by zero
or more ref pkt-lines ("$id $name"), and a final pkt-line
with a length of 0:

	S: 0009refs
	S: 003295dcfa3633004da0049d3d0fa03f80589cbcaf31 HEAD
	S: 003e95dcfa3633004da0049d3d0fa03f80589cbcaf31 refs/heads/maint
	S: 003fd049f6c27a2244e12041955e262a404c7faba355 refs/heads/master
	S: 003b2cb58b79488a98d2721cea644875a8dd0026b115 refs/heads/pu
	S: 0000

The response may begin with an optional redirect to a new service
URL for the repository:

	S: 0028redirect http://s1.example.com/git/
	S: 0009refs
	S: 003295dcfa3633004da0049d3d0fa03f80589cbcaf31 HEAD
	S: 003fd049f6c27a2244e12041955e262a404c7faba355 refs/heads/master
	S: 0000

or be composed of only a redirect:

	S: 0028redirect http://s1.example.com/git/
	S: 0000

If a redirect is returned the client should update itself
to use the new URL as the location for future requests.
A server may use the redirect to request that the client
"pin" itself to a particular server for the remainder of
the current transaction.

The URL listed in any redirect should be the base URL
without any query args.  The client will automatically
append "/backend.git-http/$service" as it makes each
future request.

If no "refs" line was received in the response, but
a "redirect" was received, the client should retry
its request at the new location before giving up.
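
As an illustration of the response handling described above, a client
might parse a show-ref body along these lines.  `parse_show_ref` and
`service_url` are hypothetical helper names, error handling is left
out, and stripping a trailing "/" from the base URL before appending
the service path is an assumption of this sketch.

```python
def parse_show_ref(body: bytes):
    """Return (redirect_url_or_None, refs_dict_or_None) from a show-ref body."""
    redirect, refs, pos = None, None, 0
    while pos < len(body):
        total = int(body[pos:pos + 4], 16)
        if total == 0:  # final flush pkt-line
            break
        line = body[pos + 4:pos + total].rstrip(b"\n").decode()
        pos += total
        if line.startswith("redirect "):
            redirect = line[len("redirect "):]
        elif line == "refs":
            refs = {}  # a "refs" line was seen, even if no refs follow
        elif refs is not None:
            obj_id, name = line.split(" ", 1)
            refs[name] = obj_id
    return redirect, refs


def service_url(base: str, service: str) -> str:
    """Build the URL for a service from the repository base URL."""
    return base.rstrip("/") + "/backend.git-http/" + service
```

If `refs` comes back as None while `redirect` is set, the client
repeats the show-ref request at `service_url(redirect, "show-ref")`.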

Service upload-pack
-------------------

Prepares an estimated minimal pack to transfer new objects to the
client.

URL: $url/backend.git-http/upload-pack
Content-Type: application/x-git; service=upload-pack

The computation to select the minimal pack proceeds as follows
(c = client, s = server):

 init step:
 (c) Use show-ref to obtain the advertised refs.
 (c) Place any object seen in show-ref into set ADVERTISED.

 (c) Build a set, WANT, of the objects from ADVERTISED the client
     wants to fetch, based on what it saw from show-ref.
 (c) Build an empty set, COMMON, to hold the objects that are later
     determined to be on both ends.

 (c) Start a queue, C_PENDING, ordered by commit time (popping newest
     first).  Add all client refs.  When a commit is popped from the
     queue its parents should be automatically inserted back.  Commits
     should only enter the queue once.
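
The C_PENDING queue could be sketched with a heap, as below.  This is
an illustration only: `pending_walk` is an invented name, and the
`commits` mapping (id -> (commit time, list of parent ids)) stands in
for whatever object database the client really has.

```python
import heapq


def pending_walk(tips, commits):
    """Yield commit ids newest first, inserting parents as commits are popped."""
    seen = set()  # guarantees each commit enters the queue only once
    heap = []

    def push(cid):
        if cid not in seen:
            seen.add(cid)
            # heapq is a min-heap, so negate the time to pop newest first.
            heapq.heappush(heap, (-commits[cid][0], cid))

    for tip in tips:
        push(tip)
    while heap:
        _, cid = heapq.heappop(heap)
        yield cid
        for parent in commits[cid][1]:
            push(parent)
```

Because it is a generator, the client can pull one batch of HAVE
candidates per compute step and resume the walk on the next round.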

 one compute step:
 (c) Send an upload-pack request:

	C: 001bcapability include-tag
	C: 0019capability thin-pack
	....
	C: 0032want <WANT #1>...............................
	C: 0032want <WANT #2>...............................
	....
	C: 0034common <COMMON #1>.............................
	C: 0034common <COMMON #2>.............................
	....
	C: 0032have <HAVE #1>...............................
	C: 0032have <HAVE #2>...............................
	....
	C: 0000

     The stream is organized into "commands", with each command
     appearing by itself in a pkt-line.  Within a command line
     the text leading up to the first space is the command name,
     and the remainder of the line to the first LF is the value.
     Command lines are terminated with an LF as the last byte of
     the pkt-line value.

     Servers must ignore commands which they do not recognize.
     This permits newer clients to transmit additional data to
     an unknown server, in case the server is new enough to use
     the additional information.

     Commands must appear in the following order, if they appear
     at all in the request stream:

       * capability
       * want
       * common
       * have
       * give-up

     The stream is terminated by a pkt-line flush ("0000").

     The "capability" command requests a single protocol feature
     to be enabled by the server.  Typically these are used to
     describe aspects of the pack that will be returned.  See
     below for more details on the current capabilities.

     A single "want", "common", or "have" command has one hex
     formatted SHA-1 as its value.  Multiple SHA-1s can be sent
     by sending multiple commands.

     The HAVE list is created by popping the first 64 commits
     from C_PENDING.  Fewer can be supplied if C_PENDING empties.

     If the client has sent 256 HAVE commits and has not yet
     received one of those back from S_COMMON, or the client
     has emptied C_PENDING, it should include a "give-up"
     command to let the server know it won't proceed:

	C: 000cgive-up
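
For illustration, the request stream described above could be
assembled like this.  `build_upload_pack_request` is a hypothetical
helper; it only encodes the command order and framing from this draft
and does not decide what belongs in each set.

```python
def build_upload_pack_request(capabilities, want, common, have, give_up=False):
    """Serialize one upload-pack request body in the required command order."""

    def pkt(text):
        data = text.encode() + b"\n"  # command lines end with LF
        return b"%04x" % (len(data) + 4) + data

    out = [pkt("capability " + name) for name in capabilities]
    out += [pkt("want " + sha) for sha in want]
    out += [pkt("common " + sha) for sha in common]
    out += [pkt("have " + sha) for sha in have]
    if give_up:
        out.append(pkt("give-up"))
    out.append(b"0000")  # flush pkt-line terminates the stream
    return b"".join(out)
```

Each SHA-1 gets its own pkt-line, matching the "0032want ..." and
"0034common ..." lengths shown in the example request.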

  (s) Parse the upload-pack request:

      Verify all objects in WANT are reachable from refs.  As
      this may require walking backwards through history to
      the very beginning on invalid requests, the server may
      use a reasonable limit of commits (e.g. 1000) walked
      beyond any ref tip before giving up.

      If no WANT objects are received, send an error:

	S: 0019status error no want

      If any WANT object is not reachable, send an error:

	S: 001estatus error invalid want

     Create an empty list, S_COMMON.

     If 'common' was sent:

     Load all objects into S_COMMON.  If an object appears in
     'common' but the server does not have the object locally
     an error should be returned:

	S: 0020status error invalid common

     If 'have' was sent:

     Loop through the objects in the order supplied by the client.
     For each object, if the server has the object reachable from
     a ref, add it to S_COMMON.  If a commit is added to S_COMMON,
     do not add any ancestors, even if they also appear in HAVE.
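
One possible reading of the 'have' loop, as a sketch.  The names
`update_s_common`, `reachable` and `parents` are assumptions made for
this example; a real server would consult its object database instead
of in-memory sets.  The sketch only skips a HAVE entry when one of its
descendants is already in S_COMMON.

```python
def update_s_common(s_common, have, reachable, parents):
    """Append to s_common every HAVE the server has, skipping covered ancestors."""

    def ancestors(cid):
        seen, stack = set(), list(parents.get(cid, []))
        while stack:
            p = stack.pop()
            if p not in seen:
                seen.add(p)
                stack.extend(parents.get(p, []))
        return seen

    # Everything already implied by the current S_COMMON entries.
    covered = set()
    for cid in s_common:
        covered |= ancestors(cid) | {cid}

    for cid in have:  # keep the client's order
        if cid in reachable and cid not in covered:
            s_common.append(cid)
            covered |= ancestors(cid) | {cid}
    return s_common
```

Since the client sends HAVE newest first, an ancestor normally arrives
after its descendant and is dropped by the `covered` check.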

  (s) Send the upload-pack response:

     If the server has found a closed set of objects to pack or the
     request contains "give-up", it replies with the pack and the
     enabled capabilities.  The set of enabled capabilities is limited
     to the intersection of what the client requested and what the
     server supports.

	S: 0010status pack
	S: 001bcapability include-tag
	S: 0019capability thin-pack
	S: 000c.PACK...

     The returned stream is the side-band-64k protocol supported
     by the git-upload-pack service, and the pack is embedded into
     stream 1.  Progress messages from the server side may appear
     in stream 2.

     Here a "closed set of objects" is defined to have at least
     one path from every WANT to at least one COMMON object.
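
The "closed set" test can be illustrated with a plain graph walk.
This is a sketch under assumptions: `is_closed` is an invented name,
`parents` maps a commit id to its parent ids, and no walk limit is
applied.

```python
def is_closed(want, common, parents):
    """True if every WANT has at least one path to some COMMON object."""
    common = set(common)
    for start in want:
        seen, stack, found = set(), [start], False
        while stack:
            cid = stack.pop()
            if cid in common:
                found = True
                break
            if cid in seen:
                continue
            seen.add(cid)
            stack.extend(parents.get(cid, []))
        if not found:
            return False  # this WANT never meets the common set
    return True
```

When this returns False and the client has not sent "give-up", the
server would answer with "status continue" instead of a pack.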

     If the server needs more information, it replies with a
     status continue response:

	S: 0014status continue
	S: 0034common <S_COMMON #1>...........................
	S: 0034common <S_COMMON #2>...........................
	...
	S: 0000

     The stream formatting rules are the same as the request.

     The "common" command details the contents of S_COMMON,
     that is all objects from HAVE that the server also has.

  (c) Parse the upload-pack response:

      If the status pkt-line is "status pack":

      Process the pack stream and update the local refs.

      If the status pkt-line is "status continue":

      Reset COMMON to the items in S_COMMON.  The new S_COMMON
      should be a superset of the existing COMMON set.

      Remove all items in S_COMMON, and all of their ancestors,
      from C_PENDING.

      Do another compute step.

Capability include-tag
~~~~~~~~~~~~~~~~~~~~~~

When packing an object that an annotated tag points at, include
the tag object too.  Clients can request this if they want to
fetch tags, but don't know which tags they will need until after
they receive the branch data.  By enabling include-tag an entire
call to upload-pack can be avoided.

Capability thin-pack
~~~~~~~~~~~~~~~~~~~~

When packing a deltified object the base is not included if the
base is reachable from an object listed in the COMMON set by the
client.  This reduces the bandwidth required to transfer, but it
does slightly increase processing time for the client to save the
pack to disk.

Service receive-pack
--------------------

Uploads a pack and updates refs.

URL: $url/backend.git-http/receive-pack
Content-Type: application/x-git; service=receive-pack

The start of the stream is the commands to update the refs and
the remainder of the stream is the pack file itself.  See
git-receive-pack and its network protocol in pack-protocol.txt,
as this is essentially the same.

	C: 006795dcfa3633004da0049d3d0fa03f80589cbcaf31 d049f6c27a2244e12041955e262a404c7faba355 refs/heads/maint
	C: 0000
	C: PACK...

	S: 0005
	S: ...<output of receive-pack>...

The capabilities are handled exactly as in the fetch protocol,
however the server may reject a pack and its associated commands
if an invalid capability request is made by the client, or the
client has assumed a pack capability that the server does not
have support for.  In the latter case the server must still send
the capabilities key in the response so the client can correct
itself and try again.

-- 
Shawn.


* Re: Git-aware HTTP transport
  2008-08-29 17:39                       ` Shawn O. Pearce
@ 2008-08-29 19:55                         ` Nicolas Pitre
  2008-09-01 16:05                         ` Tarmigan
  1 sibling, 0 replies; 57+ messages in thread
From: Nicolas Pitre @ 2008-08-29 19:55 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, H. Peter Anvin, git

On Fri, 29 Aug 2008, Shawn O. Pearce wrote:

> Yet another draft follows.  I believe that I have covered all
> comments with this draft.  But I welcome any additional ones,
> as thus far it has been a very constructive process.
> 
> The updated protocol looks more like the current native protocol
> does.  This should make it easier to reuse code between the two
> protocol implementations.
[...]

> pkt-line Format
> ---------------
> 
> Much of the payload is described around pkt-lines.
> 
> A pkt-line is a variable length binary string.  The first four bytes
> of the line indicates the total length of the line, in hexadecimal.
> The total length includes the 4 bytes used to denote the length.  A
> line is usually terminated by an LF, which must be included in the
> total length if present.
> 
> Binary data is permitted within a pkt-line so implementors should
> ensure their pkt-line parsing/formatting routines are 8-bit clean.
> The maximum length of a pkt-line's data is 65532 bytes (65536 - 4).

Shouldn't that be 65531, since you cannot represent 65536 with 4 hex 
digits?
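
The arithmetic behind that question, spelled out as a small sketch
(the variable names are only for illustration):

```python
HEADER_LEN = 4                    # bytes used by the hex length itself
max_total = int("ffff", 16)       # largest value 4 hex digits can hold
max_payload = max_total - HEADER_LEN

print(max_total, max_payload)     # 65535 65531
```

A total of 65536 would need five hex digits ("10000"), so it cannot
appear in the length field.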

> 	C: 001bcapability include-tag
> 	C: 0019capability thin-pack
> 	....
[...]
>      The "capability" command requests a single protocol feature
>      to be enabled by the server.  Typically these are used to
>      describe aspects of the pack that will be returned.  See
>      below for more details on the current capabilities.

Why not have all capabilities listed at once on a single line instead?  
That's more or less what the current protocol does already.


Nicolas


* Re: Git-aware HTTP transport
  2008-08-29 17:39                       ` Shawn O. Pearce
  2008-08-29 19:55                         ` Nicolas Pitre
@ 2008-09-01 16:05                         ` Tarmigan
  2008-09-01 16:13                           ` Tarmigan
  2008-09-02  6:06                           ` Shawn O. Pearce
  1 sibling, 2 replies; 57+ messages in thread
From: Tarmigan @ 2008-09-01 16:05 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, H. Peter Anvin, git

On Fri, Aug 29, 2008 at 10:39 AM, Shawn O. Pearce <spearce@spearce.org> wrote:
> Yet another draft follows.  I believe that I have covered all
> comments with this draft.  But I welcome any additional ones,
> as thus far it has been a very constructive process.

Sorry I'm jumping into this a bit late, but something just occurred to me.

>
> The updated protocol looks more like the current native protocol
> does.  This should make it easier to reuse code between the two
> protocol implementations.
>
> --8<--
> Smart HTTP transfer protocols

[...]

> HTTP Redirects
> --------------
>
> If a POST request results in an HTTP 302 or 303 redirect response
> clients should retry the request by updating the URL and POSTing
> the same request to the new location.  Subsequent requests should
> still be sent to the original URL.
>
> This redirect behavior is unrelated to the in-payload redirect
> that is described below in "Service show-ref".
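
The redirect rule quoted above can be sketched as follows. The `post` argument is a hypothetical callable standing in for an HTTP client, and the `max_hops` limit is an assumption; the draft does not specify either.

```python
def post_with_redirect(post, url, body, max_hops=5):
    """POST body to url, re-POSTing the same body on a 302/303 response.

    `post` is a hypothetical callable (url, body) -> (status, headers, payload).
    The caller keeps using the original `url` for later requests.
    """
    target = url
    for _ in range(max_hops):
        status, headers, payload = post(target, body)
        if status in (302, 303) and "Location" in headers:
            target = headers["Location"]  # retry the same request at the new location
            continue
        return status, payload
    raise RuntimeError("too many redirects")
```

Because the function never hands the redirected location back to the caller, the next request naturally starts from the original URL again, as the draft requires.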

I just want to see if smart http could support a new feature (please yell
if git:// already supports this and I am not aware of it).   The idea
is from http://lkml.org/lkml/2008/8/21/347, the relevant portion
being:

Greg KH wrote:
>David Vrabel wrote:
>> Or you can pull the changes from the uwb branch of
>>
>> git://pear.davidvrabel.org.uk/git/uwb.git
>>
>> (Please don't clone the entire tree from here as I have very limited
>> bandwidth.)
>
> If this is an issue, I think you can use the --reference option to
> git-clone when creating the tree to reference an external tree (like
> Linus's).  That way you don't have the whole tree on your server for
> stuff like this.

I do not believe that the server (either git:// or http://) can
currently be setup with --reference to redirect to another server for
certain refs, but perhaps with smart http and the POST 302/303
redirect responses, this would now be possible as a way to reduce
bandwidth for people's home servers?  I have also seen similar
requests before ("don't pull the whole kernel from me, just add my
repo as a remote after you've cloned linus-2.6"), so for larger
projects, it might be a nice feature.  Would that be something
desirable to support?

Would the current proposal be able to support this kind of partial
redirect?  I don't quite see how it would, but it seems very close.
Perhaps if the show-ref redirect could appear partway through the
show-ref response and then the client could go off, fetch some
refs from that server and then return to the original server for the
remainder?  Or maybe in the upload-pack negotiations, there could be a
special redirect command as part of the "status continue" response
that told the client to run off and look for a specific sha at another
url?  Something like

status continue

 S: 0014status continue
       S: 0034common <S_COMMON #1>...........................
       S: 0034common <S_COMMON #2>...........................
       ...


Otherwise, it looks very cool, but I have a few more minor questions
to help my general understanding...

>     If the client has sent 256 HAVE commits and has not yet
>     received one of those back from S_COMMON, or the client
>     has emptied C_PENDING it should include a "give-up"
>     command to let the server know it won't proceed:
>
>        C: 000cgive-up

What does the server do after a 000cgive-up ?  Does the server send
back a complete pack (like a new clone) or if not, how does clone work
over smart http?  Does that mean that if I fall more than 256 commits
behind, I have to redownload the whole repo?  Or am I missing
something about the C_PENDING commits being sparse and doing some
kind of smart back-off (I'm not at all familiar with the existing
receive-pack/upload-pack)?

>  (s) Parse the upload-pack request:
>
>      Verify all objects in WANT are reachable from refs.  As
>      this may require walking backwards through history to
>      the very beginning on invalid requests the server may
>      use a reasonable limit of commits (e.g. 1000) walked
>      beyond any ref tip before giving up.
>
>      If no WANT objects are received, send an error:
>
>        S: 0019status error no want
>
>      If any WANT object is not reachable, send an error:
>
>        S: 001estatus error invalid want

So again, if the client falls more than 1000 commits behind (not hard
to do for example during the linux merge window), and then the client
WANTs HEAD^1001, what happens?  Does the client get nothing from the server,
or does the client essentially reclone, or am I missing something?

>  (s) Send the upload-pack response:
>
>     If the server has found a closed set of objects to pack or the
>     request contains "give-up", it replies with the pack and the
>     enabled capabilities.  The set of enabled capabilities is limited
>     to the intersection of what the client requested and what the
>     server supports.
>
>        S: 0010status pack
>        C: 001bcapability include-tag
>        C: 0019capability thin-pack
>        S: 000c.PACK...

Should these be all S: ... ?

Thanks,
Tarmigan


* Re: Git-aware HTTP transport
  2008-09-01 16:05                         ` Tarmigan
@ 2008-09-01 16:13                           ` Tarmigan
  2008-09-02  6:06                           ` Shawn O. Pearce
  1 sibling, 0 replies; 57+ messages in thread
From: Tarmigan @ 2008-09-01 16:13 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, H. Peter Anvin, git

(Oops, hit send too early by mistake, so some of my thoughts were incomplete)

On Mon, Sep 1, 2008 at 9:05 AM, Tarmigan <tarmigan+git@gmail.com> wrote:
> On Fri, Aug 29, 2008 at 10:39 AM, Shawn O. Pearce <spearce@spearce.org> wrote:
>> Yet another draft follows.  I believe that I have covered all
>> comments with this draft.  But I welcome any additional ones,
>> as thus far it has been a very constructive process.
>
> Sorry I'm jumping into this a bit late, but something just occurred to me.
>
>>
>> The updated protocol looks more like the current native protocol
>> does.  This should make it easier to reuse code between the two
>> protocol implementations.
>>
>> --8<--
>> Smart HTTP transfer protocols
>
> [...]
>
>> HTTP Redirects
>> --------------
>>
>> If a POST request results in an HTTP 302 or 303 redirect response
>> clients should retry the request by updating the URL and POSTing
>> the same request to the new location.  Subsequent requests should
>> still be sent to the original URL.
>>
>> This redirect behavior is unrelated to the in-payload redirect
>> that is described below in "Service show-ref".
>
> I just want to see if smart http could support a new feature (please yell
> if git:// already supports this and I am not aware of it).   The idea
> is from http://lkml.org/lkml/2008/8/21/347, the relevant portion
> being:
>
> Greg KH wrote:
>>David Vrabel wrote:
>>> Or you can pull the changes from the uwb branch of
>>>
>>> git://pear.davidvrabel.org.uk/git/uwb.git
>>>
>>> (Please don't clone the entire tree from here as I have very limited
>>> bandwidth.)
>>
>> If this is an issue, I think you can use the --reference option to
>> git-clone when creating the tree to reference an external tree (like
>> Linus's).  That way you don't have the whole tree on your server for
>> stuff like this.
>
> I do not believe that the server (either git:// or http://) can
> currently be setup with --reference to redirect to another server for
> certain refs, but perhaps with smart http and the POST 302/303
> redirect responses, this would now be possible as a way to reduce
> bandwidth for people's home servers?  I have also seen similar
> requests before ("don't pull the whole kernel from me, just add my
> repo as a remote after you've cloned linus-2.6"), so for larger
> projects, it might be a nice feature.  Would that be something
> desirable to support?
>
> Would the current proposal be able to support this kind of partial
> redirect?  I don't quite see how it would, but it seems very close.
> Perhaps if the show-ref redirect could appear partway through the
> show-ref response and then the client could go off, fetch some
> refs from that server and then return to the original server for the
> remainder?  Or maybe in the upload-pack negotiations, there could be a
> special redirect command as part of the "status continue" response
> that told the client to run off and look for a specific sha at another
> url?  Something like
>
> status continue
>
>  S: 0014status continue
>       S: 0034common <S_COMMON #1>...........................
>       S: 0034common <S_COMMON #2>...........................
>       ...

I meant to write:

         S: 0014status continue
         S: 0034common <S_COMMON #1>...........................
         S: 0034common <S_COMMON #2>...........................
         S: 00xxredirect <WILL_BE_COMMON> <REMOTE_URL>

and then the client could go try the remote url, fetch that SHA and
ancestors, and then resume the upload pack negotiations with
<WILL_BE_COMMON> among the <COMMON> commits.   Obviously it's still
somewhat of a half baked idea, and would probably need some kind of
fallback, but does that seem like a reasonable thing to do and a
reasonable way to do it?
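
A client-side parser for such a response might look roughly like this. The "redirect" command is only the proposal made in this message, not part of the draft, and the function names are made up for the sketch. It assumes each pkt-line length counts the 4 header bytes and a trailing LF.

```python
def parse_pkt_lines(data: str) -> list:
    """Split a string of pkt-lines into payloads with the trailing LF stripped."""
    out, pos = [], 0
    while pos < len(data):
        length = int(data[pos:pos + 4], 16)
        if length < 4:          # treat "0000" as a flush marker and skip it
            pos += 4
            continue
        out.append(data[pos + 4:pos + length].rstrip("\n"))
        pos += length
    return out


def parse_continue(data: str):
    """Return (common_ids, redirects) from a "status continue" response."""
    lines = parse_pkt_lines(data)
    if not lines or lines[0] != "status continue":
        raise ValueError("not a status continue response")
    common, redirects = [], []
    for line in lines[1:]:
        cmd, _, rest = line.partition(" ")
        if cmd == "common":
            common.append(rest)
        elif cmd == "redirect":  # hypothetical command from this proposal
            sha, _, url = rest.partition(" ")
            redirects.append((sha, url))
    return common, redirects
```

A client that does not want to follow redirects could simply ignore the second list, which is one possible form of the fallback mentioned above.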

>
> Otherwise, it looks very cool, but I have a few more minor questions
> to help my general understanding...
>
>>     If the client has sent 256 HAVE commits and has not yet
>>     received one of those back from S_COMMON, or the client
>>     has emptied C_PENDING it should include a "give-up"
>>     command to let the server know it won't proceed:
>>
>>        C: 000cgive-up
>
> What does the server do after a 000cgive-up ?  Does the server send
> back a complete pack (like a new clone) or if not, how does clone work
> over smart http?  Does that mean that if I fall more than 256 commits
> behind, I have to redownload the whole repo?  Or am I missing
> something about the C_PENDING commits being sparse and doing some
> kind of smart back-off (I'm not at all familiar with the existing
> receive-pack/upload-pack)?
>
>>  (s) Parse the upload-pack request:
>>
>>      Verify all objects in WANT are reachable from refs.  As
>>      this may require walking backwards through history to
>>      the very beginning on invalid requests the server may
>>      use a reasonable limit of commits (e.g. 1000) walked
>>      beyond any ref tip before giving up.
>>
>>      If no WANT objects are received, send an error:
>>
>>        S: 0019status error no want
>>
>>      If any WANT object is not reachable, send an error:
>>
>>        S: 001estatus error invalid want
>
> So again, if the client falls more than 1000 commits behind (not hard
> to do for example during the linux merge window), and then the client
> WANTs HEAD^1001, what happens?  Does the client get nothing from the server,
> or does the client essentially reclone, or am I missing something?
>
>>  (s) Send the upload-pack response:
>>
>>     If the server has found a closed set of objects to pack or the
>>     request contains "give-up", it replies with the pack and the
>>     enabled capabilities.  The set of enabled capabilities is limited
>>     to the intersection of what the client requested and what the
>>     server supports.
>>
>>        S: 0010status pack
>>        C: 001bcapability include-tag
>>        C: 0019capability thin-pack
>>        S: 000c.PACK...
>
> Should these be all S: ... ?

Thanks,
Tarmigan


* Re: Git-aware HTTP transport
  2008-09-01 16:05                         ` Tarmigan
  2008-09-01 16:13                           ` Tarmigan
@ 2008-09-02  6:06                           ` Shawn O. Pearce
  2008-09-02  6:09                             ` H. Peter Anvin
  2008-09-02 18:20                             ` Tarmigan
  1 sibling, 2 replies; 57+ messages in thread
From: Shawn O. Pearce @ 2008-09-02  6:06 UTC (permalink / raw)
  To: Tarmigan; +Cc: Junio C Hamano, H. Peter Anvin, git

Tarmigan <tarmigan+git@gmail.com> wrote:
> On Fri, Aug 29, 2008 at 10:39 AM, Shawn O. Pearce <spearce@spearce.org> wrote:
> 
> I just want to see if smart http could support a new feature (please yell
> if git:// already supports this and I am not aware of it).   The idea
> is from http://lkml.org/lkml/2008/8/21/347, the relevant portion
> being:
> 
> Greg KH wrote:
> >David Vrabel wrote:
> >> Or you can pull the changes from the uwb branch of
> >>
> >> git://pear.davidvrabel.org.uk/git/uwb.git
> >>
> >> (Please don't clone the entire tree from here as I have very limited
> >> bandwidth.)
> >
> > If this is an issue, I think you can use the --reference option to
> > git-clone when creating the tree to reference an external tree (like
> > Linus's).  That way you don't have the whole tree on your server for
> > stuff like this.
> 
> I do not believe that the server (either git:// or http://) can
> currently be setup with --reference to redirect to another server for
> certain refs,

Correct.  Today _none_ of the transport protocols allow the server
to force the client to use some sort of reference repository for an
initial clone.  There are likely two reasons for this:

 *) It's a lot simpler to program to just get everything from
    one location.

 *) If you really are forking an open source project then in
    some cases you may need to distribute the full source,
    not your delta.  You may just as well distribute the full
    source and call it a day.

The dumb http:// currently supports getting packs from a remote HTTP
server via its objects/info/http-alternates.  But the native and
rsync protocols don't support that.  The logic behind http-alternates
isn't to allow moving load onto a different server, but to make
the locally available alternate repository available through the
same web server.  The path on the UNIX filesystem that is used in
objects/info/alternates may not be the same path used in the web
server's namespace.

> but perhaps with smart http and the POST 302/303
> redirect responses, this would now be possible as a way to reduce
> bandwidth for people's home servers?  I have also seen similar
> requests before ("don't pull the whole kernel from me, just add my
> repo as a remote after you've cloned linus-2.6"), so for larger
> projects, it might be a nice feature.  Would that be something
> desirable to support?

I think this isn't a bad idea, but I'd rather have the server say
"In order to talk to me you need at least these objects in common
with me: ...".  If you don't have those then the user should go
find it on their own, rather than forcing them to a particular URL
and automatically following it.

I'm a little concerned about a US user putting a US mirror site
of kernel.org into the server and forcing a user in India to do a
full clone over the Atlantic links when they could have just used
a more local mirror for that initial "linus-2.6" clone.
 
> Otherwise, it looks very cool, but I have a few more minor questions
> to help my general understanding...
> 
> >     If the client has sent 256 HAVE commits and has not yet
> >     received one of those back from S_COMMON, or the client
> >     has emptied C_PENDING it should include a "give-up"
> >     command to let the server know it won't proceed:
> >
> >        C: 000cgive-up
> 
> What does the server do after a 000cgive-up ?  Does the server send
> back a complete pack (like a new clone) or if not, how does clone work
> over smart http?

When the server receives a "give-up" it needs to create a pack
based on "git rev-list --objects-boundary $WANT --not $COMMON".
If the set $COMMON is non-empty then it's a partial pack; if $COMMON
is empty then it's a full clone.  This is what the native protocol
does when the client gives up.
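
As a toy model of that selection, consider commits only (the real pack also carries trees and blobs). The history is a hypothetical dict mapping each commit to its parents, and the function names are made up for this sketch.

```python
def reachable(parents, tips):
    """All commits reachable from tips; parents maps commit -> list of parents."""
    seen, stack = set(), list(tips)
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, []))
    return seen


def commits_to_pack(parents, want, common):
    """Commits reachable from WANT but not from COMMON."""
    return reachable(parents, want) - reachable(parents, common)


# Linear toy history: A <- B <- C <- D
history = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}
partial = commits_to_pack(history, ["D"], ["B"])   # client already has B
full = commits_to_pack(history, ["D"], [])          # nothing in common
```

With a non-empty COMMON only the newer commits are selected; with an empty COMMON everything reachable from WANT is selected, which is the full-clone case.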

> Does that mean that if I fall more than 256 commits
> behind, I have to redownload the whole repo?

You are thinking the wrong way.  If you have more than 256 commits
that the other side doesn't have you may give up too early.
For that to be true you need to create 256 commits locally that
aren't on the remote peer and whose timestamps are all ahead of
the commits you last fetched from the remote peer.

Yes, it can happen.  But it's less likely than you think because
we're talking about you doing 256 commits worth of development and
not picking up any new commits from remote peers in the middle of
that time period.  Get just one and it resets the counter back to
0 and allows it to try another 256 commits before giving up.
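
The counter behaviour can be sketched like this. It is a simplification: the draft batches HAVE commits per request and the server also ends negotiation once it finds a closed set, neither of which is modelled here. The function name and the `server_has` set are assumptions for illustration.

```python
def send_haves(pending, server_has, limit=256):
    """Walk pending commits (newest first); return (common, gave_up_early).

    `server_has` stands in for the server's replies: a commit in this set
    would come back as a "common" line.
    """
    common, unacked = [], 0
    for commit in pending:
        if commit in server_has:
            common.append(commit)
            unacked = 0               # one common commit resets the counter
        else:
            unacked += 1
            if unacked >= limit:
                return common, True   # `limit` commits in a row unknown to the server
    return common, False
```

Only a run of 256 consecutive commits unknown to the server triggers the early give-up; a single shared commit in between starts the count over.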

I should amend this section to talk about what giving up here
really means.  If we have nothing sent in common yet or maybe
very little sent in common we may have existing remote refs tied
to this URL in .git/config that we can send, and we may have one or
more annotated tags that we know for a fact are in common as both
peers have the same tag name pointing to the same tag object.

A smart(er) client might try to toss some recently dated annotated
tags at the server before throwing a give-up if it would otherwise
throw a give-up.  It's likely to narrow the result set, and doesn't
hurt if it doesn't.

> >  (s) Parse the upload-pack request:
> >
> >      Verify all objects in WANT are reachable from refs.  As
> >      this may require walking backwards through history to
> >      the very beginning on invalid requests the server may
> >      use a reasonable limit of commits (e.g. 1000) walked
> >      beyond any ref tip before giving up.
> >
> >      If no WANT objects are received, send an error:
> >
> >        S: 0019status error no want
> >
> >      If any WANT object is not reachable, send an error:
> >
> >        S: 001estatus error invalid want
> 
> So again, if the client falls more than 1000 commits behind (not hard
> to do for example during the linux merge window), and then the client
> WANTs HEAD^1001, what happens?  Does the client get nothing from the server,
> or does the client essentially reclone, or am I missing something?

Oh, this is a live-lock condition.  If the client grabs the list of
refs from the server, then has to wait 100 ms to get back to the
server and start upload-pack (due to latency) and in that 100ms
window Linus shoves a 1001 commit merge into his tree then yes,
the server may abort and tell the client "error invalid want".

At which point the client may try to restart from the beginning,
or just plain give up and tell the end user to try again later.

This condition of 1000 is just some arbitrary limit to allow the
client to still continue with an in-progress download if right in
the middle of the client's RPCs the remote was modified by its owner.
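
A bounded check of this kind could look like the sketch below. It models history as a hypothetical dict of commit to parents and counts every commit visited beyond the ref tips; how the draft intends the limit to be counted exactly is not specified, so that part is an assumption.

```python
from collections import deque


def want_is_valid(parents, ref_tips, want, limit=1000):
    """True if want is a ref tip or is found within `limit` commits walked beyond the tips."""
    if want in ref_tips:
        return True
    seen = set(ref_tips)
    queue = deque(ref_tips)
    walked = 0
    while queue and walked < limit:
        for parent in parents.get(queue.popleft(), []):
            if parent in seen:
                continue
            seen.add(parent)
            walked += 1
            if parent == want:
                return True
            if walked >= limit:
                return False      # budget exhausted: report "invalid want"
            queue.append(parent)
    return False
```

In the race described above, the client's WANT is the old tip; if the ref has since moved more than `limit` commits ahead, this check fails and the server answers with the error.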
 
> >  (s) Send the upload-pack response:
> >
> >     If the server has found a closed set of objects to pack or the
> >     request contains "give-up", it replies with the pack and the
> >     enabled capabilities.  The set of enabled capabilities is limited
> >     to the intersection of what the client requested and what the
> >     server supports.
> >
> >        S: 0010status pack
> >        C: 001bcapability include-tag
> >        C: 0019capability thin-pack
> >        S: 000c.PACK...
> 
> Should these be all S: ... ?

Yes, thanks.  I will make the correction.  Damn copy and paste.

-- 
Shawn.


* Re: Git-aware HTTP transport
  2008-09-02  6:06                           ` Shawn O. Pearce
@ 2008-09-02  6:09                             ` H. Peter Anvin
  2008-09-02  6:13                               ` Shawn O. Pearce
  2008-09-02 18:20                             ` Tarmigan
  1 sibling, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2008-09-02  6:09 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Tarmigan, Junio C Hamano, git

Shawn O. Pearce wrote:
> 
> Correct.  Today _none_ of the transport protocols allow the server
> to force the client to use some sort of reference repository for an
> initial clone.  There are likely two reasons for this:
> 
>  *) It's a lot simpler to program to just get everything from
>     one location.
> 
>  *) If you really are forking an open source project then in
>     some cases you may need to distribute the full source,
>     not your delta.  You may just as well distribute the full
>     source and call it a day.
> 

3) it encourages single points of failure.

	-hpa


* Re: Git-aware HTTP transport
  2008-09-02  6:09                             ` H. Peter Anvin
@ 2008-09-02  6:13                               ` Shawn O. Pearce
  0 siblings, 0 replies; 57+ messages in thread
From: Shawn O. Pearce @ 2008-09-02  6:13 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Tarmigan, Junio C Hamano, git

"H. Peter Anvin" <hpa@zytor.com> wrote:
> Shawn O. Pearce wrote:
>>
>> Correct.  Today _none_ of the transport protocols allow the server
>> to force the client to use some sort of reference repository for an
>> initial clone.  There are likely two reasons for this:
>>
>>  *) It's a lot simpler to program to just get everything from
>>     one location.
>>
>>  *) If you really are forking an open source project then in
>>     some cases you may need to distribute the full source,
>>     not your delta.  You may just as well distribute the full
>>     source and call it a day.
>>
>
> 3) it encourages single points of failure.

Or bad network usage, as I pointed out later about an India user
unknowingly being forced into a US based mirror when another was
closer to them.

I didn't make it clear in my response but I'm really against our
protocol having this sort of explicit redirect.  I'd rather put a
requirement in that says "Unless you have X,Y,Z in common with me
(directly or indirectly) I'm just not going to give you a pack".
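
One way to read "directly or indirectly" is that each required commit must be an ancestor of, or equal to, something the client has in common with the server. The sketch below takes that reading as an assumption; the function names and the dict-based history are illustrative only.

```python
def ancestors_and_self(parents, tips):
    """All commits reachable from tips; parents maps commit -> list of parents."""
    seen, stack = set(), list(tips)
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, []))
    return seen


def may_serve_pack(parents, required, client_common):
    """True only if every required commit is covered by what the client has."""
    return set(required) <= ancestors_and_self(parents, client_common)
```

A client holding only a descendant of the required commits still passes, while a client with nothing in common is refused.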

FWIW that fixes an issue for me at day-job that people will be
cursing about later this year in public.  Not my fault.  We would
all rather just publish the entire repository.  Instead we have
to publish something that requires the user to clone it from
another source first, and use fetch or "clone --reference" to get
our updates.  *sigh*

-- 
Shawn.


* Re: Git-aware HTTP transport
  2008-09-02  6:06                           ` Shawn O. Pearce
  2008-09-02  6:09                             ` H. Peter Anvin
@ 2008-09-02 18:20                             ` Tarmigan
  1 sibling, 0 replies; 57+ messages in thread
From: Tarmigan @ 2008-09-02 18:20 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, H. Peter Anvin, git

On Mon, Sep 1, 2008 at 11:06 PM, Shawn O. Pearce <spearce@spearce.org> wrote:
>> What does the server do after a 000cgive-up ?  Does the server send
>> back a complete pack (like a new clone) or if not, how does clone work
>> over smart http?
>
> When the server receives a "give-up" it needs to create a pack
> based on "git rev-list --objects-boundary $WANT --not $COMMON".
> If the set $COMMON is non-empty then it's a partial pack; if $COMMON
> is empty then it's a full clone.  This is what the native protocol
> does when the client gives up.

OK, that makes sense now.

>> Does that mean that if I fall more than 256 commits
>> behind, I have to redownload the whole repo?
>
> You are thinking the wrong way.  If you have more than 256 commits
> that the other side doesn't have you may give up too early.
> For that to be true you need to create 256 commits locally that
> aren't on the remote peer and whose timestamps are all ahead of
> the commits you last fetched from the remote peer.
>
> Yes, it can happen.  But it's less likely than you think because
> we're talking about you doing 256 commits worth of development and
> not picking up any new commits from remote peers in the middle of
> that time period.  Get just one and it resets the counter back to
> 0 and allows it to try another 256 commits before giving up.
>
> I should amend this section to talk about what giving up here
> really means.  If we have nothing sent in common yet or maybe
> very little sent in common we may have existing remote refs tied
> to this URL in .git/config that we can send, and we may have one or
> more annotated tags that we know for a fact are in common as both
> peers have the same tag name pointing to the same tag object.
>
> A smart(er) client might try to toss some recently dated annotated
> tags at the server before throwing a give-up if it would otherwise
> throw a give-up.  It's likely to narrow the result set, and doesn't
> hurt if it doesn't.

Yes, throwing in tags and remotes as a last resort sounds like a good idea.

>> So again, if the client falls more than 1000 commits behind (not hard
>> to do for example during the linux merge window), and then the client
>> WANTs HEAD^1001, what happens?  Does the client get nothing from the server,
>> or does the client essentially reclone, or am I missing something?
>
> Oh, this is a live-lock condition.  If the client grabs the list of
> refs from the server, then has to wait 100 ms to get back to the
> server and start upload-pack (due to latency) and in that 100ms
> window Linus shoves a 1001 commit merge into his tree then yes,
> the server may abort and tell the client "error invalid want".

Ahh, now I get it.  Somehow I forgot that the WANTs were only boundary
commits and not a list of all the commits that the client wants.

On Mon, Sep 1, 2008 at 11:13 PM, Shawn O. Pearce <spearce@spearce.org> wrote:
> "H. Peter Anvin" <hpa@zytor.com> wrote:
>> Shawn O. Pearce wrote:
>>>
>>> Correct.  Today _none_ of the transport protocols allow the server
>>> to force the client to use some sort of reference repository for an
>>> initial clone.  There are likely two reasons for this:
>>>
>>>  *) It's a lot simpler to program to just get everything from
>>>     one location.
>>>
>>>  *) If you really are forking an open source project then in
>>>     some cases you may need to distribute the full source,
>>>      not your delta.  You may just as well distribute the full
>>>      source and call it a day.
>>>
>>
>> 3) it encourages single points of failure.
>
> Or bad network usage, as I pointed out later about an India user
> unknowingly being forced into a US based mirror when another was
> closer to them.
>
> I didn't make it clear in my response but I'm really against our
> protocol having this sort of explicit redirect.  I'd rather put a
> requirement in that says "Unless you have X,Y,Z in common with me
> (directly or indirectly) I'm just not going to give you a pack".

OK, this all makes sense. http:// and git:// are probably the wrong
protocols to reduce bandwidth for the server for new clones.  Long
term, maybe gittorrent will be the right solution...

Thanks,
Tarmigan


* Re: Git-aware HTTP transport docs
  2008-08-26  1:26 Git-aware HTTP transport Shawn O. Pearce
  2008-08-26  2:34 ` H. Peter Anvin
@ 2013-02-13  1:34 ` H. Peter Anvin
  2013-02-13  2:23   ` Scott Chacon
  1 sibling, 1 reply; 57+ messages in thread
From: H. Peter Anvin @ 2013-02-13  1:34 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git

Hi Shawn,

You wrote a really great protocol spec for the smart HTTP protocol back
in the day.  It would be really great if it could be checked into the
git repository (updated if need be).  Someone mentioned today trying to
reverse-engineer the protocol because of a lack of specs, and I was a
bit surprised, to say the least.

	-hpa


* Re: Git-aware HTTP transport docs
  2013-02-13  1:34 ` Git-aware HTTP transport docs H. Peter Anvin
@ 2013-02-13  2:23   ` Scott Chacon
  2013-02-13 15:29     ` Junio C Hamano
  0 siblings, 1 reply; 57+ messages in thread
From: Scott Chacon @ 2013-02-13  2:23 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Shawn O. Pearce, git list

I don't believe it was ever merged into the Git docs.  I have a copy of it here:

https://www.dropbox.com/s/pwawp8kmwgyc3w2/http-protocol.txt

Scott

On Tue, Feb 12, 2013 at 5:34 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> Hi Shawn,
>
> You wrote a really great protocol spec for the smart HTTP protocol back
> in the day.  It would be really great if it could be checked into the
> git repository (updated if need be).  Someone mentioned today trying to
> reverse-engineer the protocol because of a lack of specs, and I was a
> bit surprised, to say the least.
>
>         -hpa


* Re: Git-aware HTTP transport docs
  2013-02-13  2:23   ` Scott Chacon
@ 2013-02-13 15:29     ` Junio C Hamano
  0 siblings, 0 replies; 57+ messages in thread
From: Junio C Hamano @ 2013-02-13 15:29 UTC (permalink / raw)
  To: Scott Chacon; +Cc: H. Peter Anvin, Shawn O. Pearce, git list

Scott Chacon <schacon@gmail.com> writes:

> I don't believe it was ever merged into the Git docs.  I have a copy of it here:
>
> https://www.dropbox.com/s/pwawp8kmwgyc3w2/http-protocol.txt

Thanks for a pointer.  It seems that it wasn't in a shape ready to
be "merged" yet.

Does somebody want to pick it up and polish it further?


Thread overview: 57+ messages
2008-08-26  1:26 Git-aware HTTP transport Shawn O. Pearce
2008-08-26  2:34 ` H. Peter Anvin
2008-08-26  3:45   ` Shawn O. Pearce
2008-08-26  3:59     ` david
2008-08-26  4:15       ` H. Peter Anvin
2008-08-26  4:25         ` david
2008-08-26  4:42           ` H. Peter Anvin
2008-08-26  4:45           ` Imran M Yousuf
2008-08-26 17:01       ` Nicolas Pitre
2008-08-26 17:03         ` Shawn O. Pearce
2008-08-26  4:14     ` H. Peter Anvin
2008-08-26 14:58   ` Shawn O. Pearce
2008-08-26 16:14     ` Shawn O. Pearce
2008-08-26 16:33     ` H. Peter Anvin
2008-08-26 17:26       ` Shawn O. Pearce
2008-08-26 22:38         ` H. Peter Anvin
2008-08-27  2:51           ` Imran M Yousuf
2008-08-28  3:50           ` Shawn O. Pearce
2008-08-28  4:37             ` H. Peter Anvin
2008-08-28  4:42               ` Shawn O. Pearce
2008-08-28  4:58                 ` H. Peter Anvin
2008-08-28  6:40               ` Imran M Yousuf
2008-08-28  4:42             ` Junio C Hamano
2008-08-28 14:57               ` Shawn O. Pearce
2008-08-28 17:26                 ` david
2008-08-28 17:28                   ` Shawn O. Pearce
2008-08-28 17:37                     ` david
2008-08-28 17:38                       ` Daniel Stenberg
2008-08-28 17:43                       ` Shawn O. Pearce
2008-08-28 17:47                         ` H. Peter Anvin
2008-08-28 18:04                       ` Mike Hommey
2008-08-28 17:43                   ` H. Peter Anvin
2008-08-28 18:12                     ` david
2008-08-28 18:14                       ` H. Peter Anvin
2008-08-28 18:18                         ` david
2008-08-29  4:02                 ` Junio C Hamano
2008-08-29  5:11                   ` H. Peter Anvin
2008-08-29  6:50                     ` Junio C Hamano
2008-08-29 17:39                       ` Shawn O. Pearce
2008-08-29 19:55                         ` Nicolas Pitre
2008-09-01 16:05                         ` Tarmigan
2008-09-01 16:13                           ` Tarmigan
2008-09-02  6:06                           ` Shawn O. Pearce
2008-09-02  6:09                             ` H. Peter Anvin
2008-09-02  6:13                               ` Shawn O. Pearce
2008-09-02 18:20                             ` Tarmigan
2008-08-28 17:05               ` H. Peter Anvin
2008-08-28 17:10                 ` Shawn O. Pearce
2008-08-28 17:20                   ` H. Peter Anvin
2008-08-28 17:26                     ` Shawn O. Pearce
2008-08-28 17:44                       ` H. Peter Anvin
2008-08-28 17:46                         ` Shawn O. Pearce
2008-08-28 18:40                       ` Nicolas Pitre
2008-08-28 18:47                         ` H. Peter Anvin
2013-02-13  1:34 ` Git-aware HTTP transport docs H. Peter Anvin
2013-02-13  2:23   ` Scott Chacon
2013-02-13 15:29     ` Junio C Hamano
