All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Add git-bundle: move objects and references by archive.
@ 2007-02-18 22:47 Mark Levedahl
  2007-02-19  1:07 ` Johannes Schindelin
  2007-02-19  1:49 ` Junio C Hamano
  0 siblings, 2 replies; 32+ messages in thread
From: Mark Levedahl @ 2007-02-18 22:47 UTC (permalink / raw)
  To: junkio; +Cc: git, Mark Levedahl

Some workflows require use of repositories on machines that cannot be
connected, preventing use of git-fetch / git-push to transport objects
and references between the repositories.

git-bundle provides an alternate transport mechanism, effectively allowing
git-fetch and git-pull to operate using sneakernet transport. git-bundle
--create allows the user to create a bundle containing one or more branches
or tags, but with specified basis assumed to exist on the target repository.
At the receiving end, git-bundle acts like git-fetch-pack, allowing the
user to invoke git-fetch or git-pull using the bundle file as the URL.
git-fetch and git-ls-remote determine they have a bundle URL by checking
that the URL points to a file, but are otherwise unchanged in operation
with bundles.

Signed-off-by: Mark Levedahl <mdl123@verizon.net>
---
 Documentation/cmd-list.perl  |    1 +
 Documentation/git-bundle.txt |  146 +++++++++++++++++++++++++++++++++++
 Makefile                     |    3 +-
 git-bundle.sh                |  174 ++++++++++++++++++++++++++++++++++++++++++
 git-fetch.sh                 |   11 ++-
 git-ls-remote.sh             |    7 ++-
 6 files changed, 338 insertions(+), 4 deletions(-)
 mode change 100755 => 100644 Documentation/cmd-list.perl
 create mode 100644 Documentation/git-bundle.txt
 create mode 100755 git-bundle.sh

diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl
old mode 100755
new mode 100644
index a2d6268..f61c77a
--- a/Documentation/cmd-list.perl
+++ b/Documentation/cmd-list.perl
@@ -70,6 +70,7 @@ git-archive                             mainporcelain
 git-bisect                              mainporcelain
 git-blame                               ancillaryinterrogators
 git-branch                              mainporcelain
+git-bundle                              mainporcelain
 git-cat-file                            plumbinginterrogators
 git-checkout-index                      plumbingmanipulators
 git-checkout                            mainporcelain
diff --git a/Documentation/git-bundle.txt b/Documentation/git-bundle.txt
new file mode 100644
index 0000000..27db785
--- /dev/null
+++ b/Documentation/git-bundle.txt
@@ -0,0 +1,146 @@
+git-bundle(1)
+=============
+
+NAME
+----
+git-bundle - Move objects and refs by archive
+
+
+SYNOPSIS
+--------
+'git-bundle' --create file <git-rev-list args> <--tar tarspec>
+'git-bundle' --verify file  <--tar tarspec>
+'git-bundle' --list-heads file <reflist> <--tar tarspec>
+'git-bundle' --unbundle file <reflist> <--tar tarspec>
+
+DESCRIPTION
+-----------
+
+Some workflows require that one or more branches of development on one
+machine be replicated on another machine, but the two machines cannot
+be directly connected so the interactive git protocols (git, ssh,
+rsync, http) cannot be used.  This command provides suport for
+git-fetch and git-pull to operate by packaging objects and references
+in an archive at the originating machine, then importing those into
+another repository using gitlink:git-fetch[1] and gitlink:git-pull[1]
+after moving the archive by some means (i.e., by sneakernet).  As no
+direct connection between repositories exists, the user must specify a
+basis for the bundle that is held by the destination repository: the
+bundle assumes that all objects in the basis are already in the
+destination repository.
+
+OPTIONS
+-------
+
+--create file::
+       Used to create a bundle named 'file'.  This requires the
+       git-rev-list arguments to define the bundle contents.
+
+--verify file::
+       Used to check that a bundle file is valid and will apply
+       cleanly to the current repository.  This includes checks on the
+       bundle format itself as well as checking that the prerequisite
+       commits exist and are fully linked in the current repository.
+       git-bundle prints a list of missing commits, if any, and exits
+       with non-zero status.
+
+--list-heads file::
+       Lists the references defined in the bundle.  If followed by a
+       list of references, only references matching those given are
+       printed out.
+
+--unbundle file::
+       Passes the objects in the bundle to gitlink:git-index-pack[1]
+       for storage in the repository, then prints the names of all
+       defined references. If a reflist is given, only references
+       matching those in the given list are printed. This command is
+       really plumbing, intended to be called only by
+       gitlink:git-fetch[1].
+
+git-rev-list args::
+       A list of arguments, accepatble to git-rev-parse and
+       git-rev-list, that specify the specific objects and references
+       to transport.  For example, "master~10..master" causes the
+       current master reference to be packaged along with all objects
+       added since its 10th ancestor commit.  There is no explicit
+       limit to the number of references and objects that may be
+       packaged.
+
+
+reflist::
+       A list of references used to limit the references reported as
+       available. This is principally of use to git-fetch, which
+       expects to recieve only those references asked for and not
+       necessarily everything in the pack (in this case, git-bundle is
+       acting like gitlink:git-fetch-pack[1]).
+
+tar tarspec::
+
+       git-bundle uses tar, and requires a tar supporting c, f, and
+       -O. By default, git-bundle looks for gtar on the path, then for
+       tar if gtar is not found. This can be overridden by explicitly
+       defining tarspec, or by defining TAR in the environment.
+
+SPECIFYING REFERENCES
+---------------------
+
+git-bundle will only package references that are shown by
+git-show-ref: this includes heads, tags, and remote heads.  References
+such as master~1 cannot be packaged, but are perfectly suitable for
+defining the basis.  More than one reference may be packaged, and more
+than one basis can be specified.  The objects packaged are those not
+contained in the union of the given bases.  Each basis can be
+specified explicitly (e.g., ^master~10), or implicitly (e.g.,
+master~10..master, master --since=10.days.ago).
+
+It is very important that the basis used be held by the destination.
+It is ok to err on the side of conservatism, causing the bundle file
+to contain objects already in the destination as these are ignored
+when unpacking at the destination.
+
+EXAMPLE
+-------
+
+Assume two repositories exist as R1 on machine A, and R2 on machine B.
+For whatever reason, direct connection between A and B is not allowed,
+but we can move data from A to B via some mechanism (CD, email, etc).
+We want to update R2 with developments made on branch master in R1.
+We set a tag in R1 (lastR2bundle) after the previous such transport,
+and move it afterwards to help build the bundle.
+
+in R1 on A:
+$ git-bundle --create mybundle master ^lastR2bundle
+$ git tag -f lastR2bundle master
+
+(move mybundle from A to B by some mechanism)
+
+in R2 on B:
+$ git-bundle --verify mybundle
+$ git-fetch mybundle  refspec
+
+where refspec is refInBundle:localRef
+
+
+Also, with something like this in your config:
+
+[remote "bundle"]
+    url = /home/me/tmp/file.bdl
+    fetch = refs/heads/*:refs/remotes/origin/*
+
+You can first sneakernet the bundle file to ~/tmp/file.bdl and
+then these commands:
+
+$ git ls-remote bundle
+$ git fetch bundle
+$ git pull bundle
+
+would treat it as if it is talking with a remote side over the
+network.
+
+Author
+------
+Written by Mark Levedahl <mdl123@verizon.net>
+
+GIT
+---
+Part of the gitlink:git[7] suite
diff --git a/Makefile b/Makefile
index ebecbbd..c6d540e 100644
--- a/Makefile
+++ b/Makefile
@@ -177,7 +177,8 @@ SCRIPT_SH = \
 	git-applymbox.sh git-applypatch.sh git-am.sh \
 	git-merge.sh git-merge-stupid.sh git-merge-octopus.sh \
 	git-merge-resolve.sh git-merge-ours.sh \
-	git-lost-found.sh git-quiltimport.sh
+	git-lost-found.sh git-quiltimport.sh \
+	git-bundle.sh
 
 SCRIPT_PERL = \
 	git-add--interactive.perl \
diff --git a/git-bundle.sh b/git-bundle.sh
new file mode 100755
index 0000000..19ac137
--- /dev/null
+++ b/git-bundle.sh
@@ -0,0 +1,174 @@
+#!/bin/sh
+# Basic handler for bundle files to connect repositories via sneakernet.
+# Invocation must include action.
+# This function can create a bundle or provide information on an existing bundle
+# supporting git-fetch, git-pull, and git-ls-remote
+
+USAGE='[--create bundle <git-rev-list args>] |
+[--verify|--list-heads|--unbundle bundle] <--tar tarspec>
+     where bundle is the name of the bundle file.'
+
+verify() {
+    # Check bundle version
+    test -f "$bfile" || die "cannot find $bfile"
+    test "$($TAR -xf ""$bfile"" -O version)" = "v1 git-bundle" ||
+        die "$bfile doesn't look like a v1 bundle file."
+
+	# do fast check, then if any prereqs are missing then go line by line
+	# to be verbose about the errors
+    prereqs=$($TAR xf "$bfile" -O prerequisites)
+	test -z "$prereqs" && return 0
+	bad=$(echo "$prereqs" | cut -b-40 | git-rev-list --stdin --not --all 2>&1)
+    if test -n "$bad" ; then
+		test "$1" = "--silent" && return 1
+        echo "error: $bfile requires the following commits you lack:"
+		echo "$prereqs" |
+		while read sha1 comment ; do
+            missing=$(git-rev-list $sha1 --not --all 2>&1)
+            test -n "$missing" && echo "$sha1 $comment"
+		done
+        exit 1
+    fi
+    return 0
+}
+
+# list all or just a subset
+list_heads() {
+	if test -z "$*" ; then
+		$TAR -xf "$bfile" -O references 2>/dev/null || exit 1
+	else
+		($TAR -xf "$bfile" -O references 2>/dev/null || exit 1) |
+		while read sha1 ref ; do
+			for arg in $* ; do
+				if test "${ref%$arg}" != "$ref" ; then
+					echo "$sha1 $ref"
+					break
+				fi
+			done
+		done
+	fi
+}
+
+SUBDIRECTORY_OK=1
+. git-sh-setup
+
+args=
+action=
+while test -n "$1" ; do
+    case $1 in
+        --create|--list-heads|--unbundle|--verify)
+            action=${1#--}
+            shift
+            bfile=$1
+            test -z "$bfile" && die "$action requires filename";;
+        --tar=*)
+            TAR=${1##--tar=};;
+        --tar)
+			shift
+            TAR=$1;;
+        *)
+            args="$args $1";;
+    esac
+    shift
+done
+test -z "$action" && die "No action given, what should I do?"
+
+# what tar to use, prefer gtar, then tar.
+if test -z "$TAR" ; then
+    GTAR=$(which gtar 2>/dev/null)
+    TAR=${GTAR:-tar}
+fi
+
+case $action in
+create)
+    unknown=$(git-rev-parse --no-revs $args)
+    test -z "$unknown" || die "unknown option: $unknown"
+    gitrevargs=$(git-rev-parse --symbolic --revs-only $args) || exit 1
+
+    # find the refs to carry along and get sha1s for each.
+    refs=
+    fullrevargs=
+    for arg in $gitrevargs ; do
+        #ignore options and basis refs, get full ref name for things
+		# we will transport rejecting anything ambiguous (e.g., user
+		# gives master, have heads/master and remotes/origin/master, we
+		# keep the former).
+        case "$arg" in
+            -* | ^*) fullrevargs="$fullrevargs $arg";;
+            *)  ref=$(git-show-ref "$arg")
+                test "$(echo $ref | wc -w)" = "2" || die "Ambigous reference: $arg
+$ref"
+                fullrevargs="$fullrevargs ${ref#* }"
+                refs="$refs $ref";;
+        esac
+    done
+    test -z "$refs" && die "No references specified, I don't know what to bundle."
+
+    # git-rev-list cannot determine edge objects if a date restriction is
+    # given...  we do things a slow way if max-age or min-age are given
+    case "$fullrevargs" in
+        *--max-age* | *--min-age*)
+            # get a list of all commits that will be packed along with
+            # parents of each.  A fixed git-rev-list --boundary should
+            # replace all of this.
+            echo "Finding prerequisites and commits to bundle..."
+            commits=$(git-rev-list $fullrevargs)
+
+            # get immediate parents of each commit to include
+            parents=
+            for c in $commits ; do
+                parents="$parents $(git-rev-list --parents --max-count=1 $c | cut -b42-)"
+            done
+            parents=$(printf "%s\n" $parents | sort | uniq)
+
+            # factor out what will be in this bundle, the remainder are the
+            # bundle's prerequisites.  double up commits in this as we only
+            # want things that are only in parents to appear once
+            prereqs=$(printf "%s\n" $parents $commits $commits | \
+                sort | uniq -c | sed -ne 's/^ *1 //p');;
+        *)
+            prereqs=$(git-rev-list --objects-edge $fullrevargs | sed -ne 's/^-//p');;
+    esac
+
+	# replace prereqs with annotated version of same
+
+    # create refs and pack
+    tmp="$GIT_DIR/bundle_tmp$$"
+    prerequisites="$tmp/prerequisites"
+    references="$tmp/references"
+    version="$tmp/version"
+    pack="$tmp/pack"
+    trap 'rm -rf "$tmp"' 0 1 2 3 15
+
+    mkdir "$tmp" &&
+    echo "v1 git-bundle" > "$version" &&
+	touch "$prerequisites" &&
+	(for p in $prereqs ; do
+		git-rev-list --pretty=one --max-count=1 $p
+	done) > "$prerequisites" &&
+    git-show-ref $refs > "$references" &&
+    git-rev-list --objects $fullrevargs | cut -b-40 |
+        git pack-objects --all-progress --stdout > "$pack" &&
+
+    # create the tar file, clean up
+    cd "$tmp" &&
+    tar cf bundle prerequisites references version pack &&
+    cd - &&
+    mv "$tmp/bundle" "$bfile" &&
+    rm -rf "$tmp"
+
+    # done
+    echo "Created $bfile";;
+
+verify)
+    verify && echo "$bfile is ok";;
+
+list-heads)
+    list_heads $args;;
+
+unbundle)
+    verify --silent || exit 1
+    $TAR -xf "$bfile" -O pack | git-index-pack --stdin ||
+        die "error: $bfile has a corrupted pack file"
+    list_heads $args;;
+esac
diff --git a/git-fetch.sh b/git-fetch.sh
index ca984e7..42cc62f 100755
--- a/git-fetch.sh
+++ b/git-fetch.sh
@@ -377,8 +377,15 @@ fetch_main () {
     ( : subshell because we muck with IFS
       IFS=" 	$LF"
       (
-	  git-fetch-pack --thin $exec $keep $shallow_depth "$remote" $rref ||
-	  echo failed "$remote"
+	if test -f "$remote" ; then
+	    test -n "$shallow_depth" &&
+		die "shallow clone with bundle is not supported"
+	    git-bundle --unbundle "$remote" $rref ||
+	    echo failed "$remote"
+	else
+	    git-fetch-pack --thin $exec $keep $shallow_depth "$remote" $rref ||
+	    echo failed "$remote"
+	fi
       ) |
       (
 	trap '
diff --git a/git-ls-remote.sh b/git-ls-remote.sh
index 8ea5c5e..28bb9b8 100755
--- a/git-ls-remote.sh
+++ b/git-ls-remote.sh
@@ -89,8 +89,13 @@ rsync://* )
 	;;
 
 * )
-	git-peek-remote $exec "$peek_repo" ||
+	if test -f "$peek_repo" ; then
+		git bundle --list-heads "$peek_repo" ||
 		echo "failed	slurping"
+	else
+		git-peek-remote $exec "$peek_repo" ||
+		echo "failed	slurping"
+	fi
 	;;
 esac |
 sort -t '	' -k 2 |
-- 
1.5.0.rc4.375.gd0938-dirty

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive.
  2007-02-18 22:47 [PATCH] Add git-bundle: move objects and references by archive Mark Levedahl
@ 2007-02-19  1:07 ` Johannes Schindelin
  2007-02-19  1:50   ` Junio C Hamano
                     ` (2 more replies)
  2007-02-19  1:49 ` Junio C Hamano
  1 sibling, 3 replies; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-19  1:07 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git

Hi,

Sorry to be such a PITA, but I really, really think that it is wrong to 
make a tar dependency here. You said your cygwin has problems with binary 
files. Could you please try this:

	$ echo -ne '\x1a\x1b\x15\x10\0abc' | cat | wc

If it returns anything else than "<tab>0<tab>1<tab>8", then your setup 
works differently from mine. I fit returns what I expect it to, then we 
can use cat directly and do not have to move the tar bloat around (it is 
inherently block based, so it wastes a lot of space, and it also stores 
other things we'll never use, like permissions for all files).

On Sun, 18 Feb 2007, Mark Levedahl wrote:

> diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl
> old mode 100755
> new mode 100644

This is unintended, right?

> diff --git a/git-bundle.sh b/git-bundle.sh
> new file mode 100755
> index 0000000..19ac137
> --- /dev/null
> +++ b/git-bundle.sh
> @@ -0,0 +1,174 @@
> +#!/bin/sh
> +# Basic handler for bundle files to connect repositories via sneakernet.
> +# Invocation must include action.
> +# This function can create a bundle or provide information on an existing bundle
> +# supporting git-fetch, git-pull, and git-ls-remote
> +
> +USAGE='[--create bundle <git-rev-list args>] |
> +[--verify|--list-heads|--unbundle bundle] <--tar tarspec>
> +     where bundle is the name of the bundle file.'

We seem to prefer another (more consistent?) way to describe things: If 
the argument is not an option, we embrace it with "<>", for example:

	<bundle>

if it is an optional option, we put it in "[]", for example:

	[--tar <tarfile>]

and if one of several mutual exclusive options has to be passed, we put it 
in "()" delimited with "|". Also, we put no explanation in the output of 
"-h", since you should read the man page if you don't know what the 
options mean. So, it is

	git-bundle ( --create <bundle> [<rev-list-args>...] |
		 --verify | --list-heads | --unbundle <bundle> )
		[--tar <tarspec>]

> +verify() {

This function is nicely done, AFAICT!

> +list_heads() {
> +	if test -z "$*" ; then
> +		$TAR -xf "$bfile" -O references 2>/dev/null || exit 1
> +	else
> +		($TAR -xf "$bfile" -O references 2>/dev/null || exit 1) |

AFAICT this will not work as expected: "()" opens a subshell, and "exit 1" 
will only exit that subshell. Correct me if I'm wrong.

> +		while read sha1 ref ; do
> +			for arg in $* ; do
> +				if test "${ref%$arg}" != "$ref" ; then
> +					echo "$sha1 $ref"
> +					break
> +				fi
> +			done
> +		done
> +	fi
> +}

The complexity here is unnecessarily high (O(N*M)), but I guess we cannot 
do better in shell. Maybe constructing a regexp from the args, and piping 
the output from tar through a grep would help here. Dunno. Maybe it 
doesn't matter much in reality, anyway.

> +SUBDIRECTORY_OK=1
> +. git-sh-setup

We tend to write this part at the very top of a git script.

> +while test -n "$1" ; do
> +    case $1 in
> +        --create|--list-heads|--unbundle|--verify)
> +            action=${1#--}
> +            shift
> +            bfile=$1

This can contain spaces (you are working on Windows, right? Windows users 
_love_ spaces in their filenames).

> +            test -z "$bfile" && die "$action requires filename";;
> +        --tar=*)
> +            TAR=${1##--tar=};;
> +        --tar)
> +			shift
> +            TAR=$1;;
> +        *)
> +            args="$args $1";;

I know you want to mix the --tar option with the rev-list options (which I 
still find totally confusing, and unnecessary at best), but you _have_ to 
quote "$1" in the args=... part, then. Even refs can contain spaces these 
days.

> +# what tar to use, prefer gtar, then tar.
> +if test -z "$TAR" ; then
> +    GTAR=$(which gtar 2>/dev/null)

Somebody on this list said that "which" should not be relied upon IIRC. I 
forgot why, but I obey that hint.

> +    gitrevargs=$(git-rev-parse --symbolic --revs-only $args) || exit 1

Here, you rely again on the refs not containing spaces.

> +    # git-rev-list cannot determine edge objects if a date restriction is
> +    # given...  we do things a slow way if max-age or min-age are given

Might make sense to teach max-age about boundary commits instead...

> +            prereqs=$(git-rev-list --objects-edge $fullrevargs | sed -ne 's/^-//p');;

Since you are not interested in non-commits at all, why not use 
"--boundary" instead?

> +    # create refs and pack
> +    tmp="$GIT_DIR/bundle_tmp$$"

Should we really rely on $GIT_DIR being writable? For unbundling, yes, but 
for bundling? Given the great confusion with git-status trying to write 
into $GIT_DIR, and people _demanding_ that it does not do so, I recommend 
against that.

In the following part, refnames must not contain spaces again:

> +	(for p in $prereqs ; do
> +		git-rev-list --pretty=one --max-count=1 $p
> +	done) > "$prerequisites" &&
> +    git-show-ref $refs > "$references" &&
> +    git-rev-list --objects $fullrevargs | cut -b-40 |
> +        git pack-objects --all-progress --stdout > "$pack" &&

pack-objects can take rev-list arguments itself these days, using "--revs" 
and being piped the rev-list arguments.

> +verify)
> +    verify && echo "$bfile is ok";;
> +
> +list-heads)
> +    list_heads $args;;

I like this abstraction (defining a function, and calling that). Why don't 
you do it with all commands?

Rest looks fine.

Given the problems you had with "cat" on Cygwin, and the quoting stuff, I 
think it might make sense to make this a builtin right away, before the 
script stage (which is meant to speed things up, not slow them down).

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive.
  2007-02-18 22:47 [PATCH] Add git-bundle: move objects and references by archive Mark Levedahl
  2007-02-19  1:07 ` Johannes Schindelin
@ 2007-02-19  1:49 ` Junio C Hamano
  1 sibling, 0 replies; 32+ messages in thread
From: Junio C Hamano @ 2007-02-19  1:49 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: junkio, git

Mark Levedahl <mdl123@verizon.net> writes:

> Some workflows require use of repositories on machines that cannot be
> connected, preventing use of git-fetch / git-push to transport objects
> and references between the repositories.

Looks much nicer.  Thanks.

>  6 files changed, 338 insertions(+), 4 deletions(-)
>  mode change 100755 => 100644 Documentation/cmd-list.perl

Hmmmm?

> +verify() {
> +    # Check bundle version
> +    test -f "$bfile" || die "cannot find $bfile"
> +    test "$($TAR -xf ""$bfile"" -O version)" = "v1 git-bundle" ||
> +        die "$bfile doesn't look like a v1 bundle file."

It makes me feel uneasy to see these double quotes next to each
other.  Are you sure you got your quoting right (I haven't
checked)?

Just a minor style issue: traditional single command letter to
tar is easier to read if you do not write leading '-'.  IOW,
"tar xf" is preferred over "tar -xf".

> +	# do fast check, then if any prereqs are missing then go line by line
> +	# to be verbose about the errors
> +    prereqs=$($TAR xf "$bfile" -O prerequisites)
> +	test -z "$prereqs" && return 0
> +	bad=$(echo "$prereqs" | cut -b-40 | git-rev-list --stdin --not --all 2>&1)
> +    if test -n "$bad" ; then
> +		test "$1" = "--silent" && return 1
> +        echo "error: $bfile requires the following commits you lack:"
> +		echo "$prereqs" |
> +		while read sha1 comment ; do
> +            missing=$(git-rev-list $sha1 --not --all 2>&1)
> +            test -n "$missing" && echo "$sha1 $comment"
> +		done
> +        exit 1
> +    fi

This part is very hard to read.  Maybe your tabstop is not set
to 8?

Linus rightly said "or something like that"; I do not think the
empty output from "git-rev-list --stdin --not --all" is enough.

Here, you are making sure that each line of $prereqs are names
of objects available in your repository, and well connected to
your existing refs.  If one of the commits listed in $prereqs
does not exist in your repository, I think rev-list exits with
non-zero status, with a message to stderr, without spitting out
anything to its stdout, so you would need:

	if bad=$(echo ... | rev-list ...) && test -z "$bad"
        then
        	: happy
	else
        	echo 2>&1 "error: you lack these..."
	fi

Incidentally, I think you do not have to squelch error message
from rev-list here.

Other than that, I think Linus's check is correct and the above
fills his "something like that" [*1*].


> +list_heads() {
> +	if test -z "$*" ; then
> +		$TAR -xf "$bfile" -O references 2>/dev/null || exit 1
> +	else
> +		($TAR -xf "$bfile" -O references 2>/dev/null || exit 1) |
> +		while read sha1 ref ; do
> +			for arg in $* ; do
> +				if test "${ref%$arg}" != "$ref" ; then
> +					echo "$sha1 $ref"
> +					break
> +				fi
> +			done
> +		done
> +	fi
> +}

I suspect that "exit 1" in the upstream of pipe does not have
any effect on the outcome -- yup, I just tried...

	$ (tar xf no-such-file -O references || exit 1) |
	  while read foo; do echo $foo; done ; echo $?
        0
	$

> +test -z "$action" && die "No action given, what should I do?"
> +
> +# what tar to use, prefer gtar, then tar.
> +if test -z "$TAR" ; then
> +    GTAR=$(which gtar 2>/dev/null)
> +    TAR=${GTAR:-tar}
> +fi

"which" is Ok for interactive use, but never use it in scripts.
I've seen "which foo" that says "No foo found." to its standard
output without erroring out (I think it was older Solaris).

+unbundle)
+    verify --silent || exit 1
+    $TAR -xf "$bfile" -O pack | git-index-pack --stdin ||
+        die "error: $bfile has a corrupted pack file"
+    list_heads $args;;
+esac

Hmm.  Your verify () shell function silently fails if prereq is
not met, so this makes the user of "git --unbundle file.bdl"
would get silent failure and left wondering why the bundle does
not unpack, doesn't it?

> diff --git a/git-fetch.sh b/git-fetch.sh
> index ca984e7..42cc62f 100755
> --- a/git-fetch.sh
> +++ b/git-fetch.sh
> @@ -377,8 +377,15 @@ fetch_main () {

Nice.

> diff --git a/git-ls-remote.sh b/git-ls-remote.sh
> index 8ea5c5e..28bb9b8 100755
> --- a/git-ls-remote.sh
> +++ b/git-ls-remote.sh
> @@ -89,8 +89,13 @@ rsync://* )

Likewise.


[footnote]

*1* If somebody wants to know why it works...

Suppose the "global" history looked like this:

	---A---B---C

and your receiving repository earlier had its tip of the branch
at commit A.  Then the user used git-http-fetch to fetch from a
repository that had its tip at commit B.  http-fetch (another
"commit walker", local-fetch, works the same way) learns that
the tip commit is B, downloads commit B, reads what's in there
to learn the tree associated with commit B, fetches that tree,
reads what's in there to learn the subtrees and blobs in it, ...

The user can interrupt this process, at the point after commit B
and everything that is needed to complete it was downloaded, but
before it actually updated the ref to point at B.

Suppose then you created a bundle with B..C to update the tip to
C, requiring you to have B.  "echo B | rev-list --not --all"
would happily say "Oh, B is a commit that is not reachable from
any of your refs", but in fact you can safely update the ref
that currently point at A (because the user interrupted the
earlier fetch) with C after checking C is a fast forward of A.
This might make "outputs empty" look like a wrong check.

However, the user could have interrupted the earlier http-fetch,
at the point after commit B was downloaded but before finishing
to download the tree associated with it.  A commit walker does
not update the ref, so the commit object B will be in your
receiving repository but it is not complete (it lacks its tree
object).  Even in this case "echo B | rev-list --not --all"
would say "B is ahead of your repository".  That's why "outputs
empty" is a good check.

Before somebody wonders further.  An alternative good check is
to see if "echo B | rev-list --objects --not --all" does _not_
fail, and all of the objects listed in its output are available
in the receiving repository.  This will solve the apparent
inefficiency in the case of killing http-fetch after it
downloaded all the necessary objects before it updated the ref,
which I described earlier.  But I think that the window for that
to happen is very small and if you killed a commit walker it is
much more likely that you lack some objects, so doing this
alternative check is optimizing for a wrong case.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive.
  2007-02-19  1:07 ` Johannes Schindelin
@ 2007-02-19  1:50   ` Junio C Hamano
  2007-02-19  2:02     ` Johannes Schindelin
  2007-02-19  7:56   ` Shawn O. Pearce
  2007-02-19 13:29   ` Mark Levedahl
  2 siblings, 1 reply; 32+ messages in thread
From: Junio C Hamano @ 2007-02-19  1:50 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Mark Levedahl, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Sorry to be such a PITA, but I really, really think that it is wrong to 
> make a tar dependency here. You said your cygwin has problems with binary 
> files. Could you please try this:
>
> 	$ echo -ne '\x1a\x1b\x15\x10\0abc' | cat | wc

It might be just me, but "echo -e" makes me feel much more
uneasy than explicitly saying "Sorry, we require GNU tar in
this little corner of the system".

>> +            bfile=$1
>
> This can contain spaces (you are working on Windows, right? Windows users 
> _love_ spaces in their filenames).

Which is fine.  Try var1=$var2 with something with SP or TAB in var2
and later try running

	$ echo "<$var1>"

to see what happens.

>> +    gitrevargs=$(git-rev-parse --symbolic --revs-only $args) || exit 1
>
> Here, you rely again on the refs not containing spaces.

Which probably is fine, as refs cannot contain spaces ;-).

>> +    # git-rev-list cannot determine edge objects if a date restriction is
>> +    # given...  we do things a slow way if max-age or min-age are given
>
> Might make sense to teach max-age about boundary commits instead...

You are talking about a rather extensive change, but could be
done.  But that is a separate issue, I think.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive.
  2007-02-19  1:50   ` Junio C Hamano
@ 2007-02-19  2:02     ` Johannes Schindelin
  0 siblings, 0 replies; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-19  2:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Mark Levedahl, git

Hi,

On Sun, 18 Feb 2007, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > Sorry to be such a PITA, but I really, really think that it is wrong to 
> > make a tar dependency here. You said your cygwin has problems with binary 
> > files. Could you please try this:
> >
> > 	$ echo -ne '\x1a\x1b\x15\x10\0abc' | cat | wc
> 
> It might be just me, but "echo -e" makes me feel much more
> uneasy than explicitly saying "Sorry, we require GNU tar in
> this little corner of the system".

Hey, this was meant as a _test_ on why it fails for Mark, not as the end 
result!

> >> +            bfile=$1
> >
> > This can contain spaces (you are working on Windows, right? Windows users 
> > _love_ spaces in their filenames).
> 
> Which is fine.  Try var1=$var2 with something with SP or TAB in var2
> and later try running
> 
> 	$ echo "<$var1>"
> 
> to see what happens.

Okay, I did not know that.

> >> +    gitrevargs=$(git-rev-parse --symbolic --revs-only $args) || exit 1
> >
> > Here, you rely again on the refs not containing spaces.
> 
> Which probably is fine, as refs cannot contain spaces ;-).

Ah! As we allow so many things in refnames, I assumed that spaces are 
allowed also.

> >> +    # git-rev-list cannot determine edge objects if a date restriction is
> >> +    # given...  we do things a slow way if max-age or min-age are given
> >
> > Might make sense to teach max-age about boundary commits instead...
> 
> You are talking about a rather extensive change, but could be
> done.  But that is a separate issue, I think.

I don't think it's so hard.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive.
  2007-02-19  1:07 ` Johannes Schindelin
  2007-02-19  1:50   ` Junio C Hamano
@ 2007-02-19  7:56   ` Shawn O. Pearce
  2007-02-19 13:29   ` Mark Levedahl
  2 siblings, 0 replies; 32+ messages in thread
From: Shawn O. Pearce @ 2007-02-19  7:56 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Mark Levedahl, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> On Sun, 18 Feb 2007, Mark Levedahl wrote:
> > +    gitrevargs=$(git-rev-parse --symbolic --revs-only $args) || exit 1
> 
> Here, you rely again on the refs not containing spaces.

They cannot.  OK, they can if you avoid any of the Git code:

  git rev-parse HEAD >".git/refs/heads/ref with space"

but a lot of things downstream may get annoyed by doing this.
Spaces are not permitted in ref names.
 
> > +    # create refs and pack
> > +    tmp="$GIT_DIR/bundle_tmp$$"
> 
> Should we really rely on $GIT_DIR being writable? For unbundling, yes, but 
> for bundling? Given the great confusion with git-status trying to write 
> into $GIT_DIR, and people _demanding_ that it does not do so, I recommend 
> against that.

I agree.  For an application like bundle creation, $GIT_DIR should be
strictly read-only.  There is no reason for this application to need
to create temporary files, as the bundle can be streamed to stdout
(or optionally to a file supplied by the user on the command line).

> > +    git-rev-list --objects $fullrevargs | cut -b-40 |
> > +        git pack-objects --all-progress --stdout > "$pack" &&
> 
> pack-objects can take rev-list arguments itself these days, using "--revs" 
> and being piped the rev-list arguments.

Uh, actually it can't.  I thought it could too.  It doesn't.
It can only take a ref name or --not.  I'm actually not sure why
we limited it like that, but we did...  It may be a good idea to
change that restriction to not exist.
 
-- 
Shawn.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive.
  2007-02-19  1:07 ` Johannes Schindelin
  2007-02-19  1:50   ` Junio C Hamano
  2007-02-19  7:56   ` Shawn O. Pearce
@ 2007-02-19 13:29   ` Mark Levedahl
  2007-02-19 15:03     ` Johannes Schindelin
  2 siblings, 1 reply; 32+ messages in thread
From: Mark Levedahl @ 2007-02-19 13:29 UTC (permalink / raw)
  To: git

"Johannes Schindelin" <Johannes.Schindelin@gmx.de> wrote in message 
news:Pine.LNX.4.63.0702190126220.22628@wbgn013.biozentrum.uni-wuerzburg.de...
> Hi,
>
> Sorry to be such a PITA, but I really, really think that it is wrong 
> to
> make a tar dependency here. You said your cygwin has problems with 
> binary
> files. Could you please try this:
>
> $ echo -ne '\x1a\x1b\x15\x10\0abc' | cat | wc
>
Same result on Cygwin and FC6:
~>echo -ne '\x1a\x1b\x15\x10\0abc' | cat | wc
      0       1       8

> If it returns anything else than "<tab>0<tab>1<tab>8", then your setup
> works differently from mine. I fit returns what I expect it to, then 
> we
> can use cat directly and do not have to move the tar bloat around (it 
> is
> inherently block based, so it wastes a lot of space, and it also 
> stores
> other things we'll never use, like permissions for all files).


To repeat an earlier message, I tried the following:

cat bundle | (
      while read <header stuff> do
        <prcess header stuff>
      done
      git-pack-index --stdin
)

and it worked, MOST of the time, most being the important word here. My 
memory is already fuzzy about the failure rate, but it was something 
like 2 of 7 bundle files I passed through resulting in git-pack-index 
reporting a corrupted pack file: when I manually edited out the header 
stuff  and passed the remains of the file directly to git-pack-index 
things were fine. The same experiment worked perfectly on my FC6 system, 
so it was definitely some part of Cygwin, not the script. If git is 
worried about tickling "which" on ancient Solaris, I think it is a bit 
out of balance to simultaneously require a not yet written version of 
Cygwin (that will get fixed after someone has time to reduce this 
problem to a simple testcase and push back to the Cygwin maintainers).

A bundle file *is* an archive. We can use an existing archiver or write 
a new one. I don't really favor writing one but that is personal 
preference for not reinventing the wheel, nothing more. However, given 
Cgywin's problems an archiver written in shell isn't going to work for 
binary files. We could uuencode / uudecode the pack files and make 
bundles pure ascii text. Or, we could use tar. Of course, rewriting this 
as a builtin offers the ability to write a purpose built archiver as 
well, but I was trying to avoid that.

</unsolicited ignoreable gripe>Has anyone considered how much easier all 
of git would be if it were written in c + python, rather than c + every 
variant of shell + core utils + non-core utils known to mankind since 
the dawn of unix?</unsolicited ignorable gripe>

Mark

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive.
  2007-02-19 13:29   ` Mark Levedahl
@ 2007-02-19 15:03     ` Johannes Schindelin
  0 siblings, 0 replies; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-19 15:03 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git

Hi,

On Mon, 19 Feb 2007, Mark Levedahl wrote:

> "Johannes Schindelin" <Johannes.Schindelin@gmx.de> wrote in message
> news:Pine.LNX.4.63.0702190126220.22628@wbgn013.biozentrum.uni-wuerzburg.de...
> > Hi,
> > 
> > Sorry to be such a PITA, but I really, really think that it is wrong to
> > make a tar dependency here. You said your cygwin has problems with binary
> > files. Could you please try this:
> > 
> > $ echo -ne '\x1a\x1b\x15\x10\0abc' | cat | wc
> > 
> Same result on Cygwin and FC6:
> ~>echo -ne '\x1a\x1b\x15\x10\0abc' | cat | wc
>      0       1       8

Okay, so I really would like to have the bundle file from this experiment:

> To repeat an earlier message, I tried the following:
> 
> cat bundle | (
>      while read <header stuff> do
>        <prcess header stuff>
>      done
>      git-pack-index --stdin
> )

because I cannot see how the aforementioned "cat" can succeed, while 
pack-index --stdin cannot.

> A bundle file *is* an archive.

But the archive in the bundle really is the pack. You don't need any other 
archive there. You don't need to know the name of the prerequisites file, 
for example, if that information is in the header to begin with.

> </unsolicited ignoreable gripe>Has anyone considered how much easier all 
> of git would be if it were written in c + python, rather than c + every 
> variant of shell + core utils + non-core utils known to mankind since 
> the dawn of unix?</unsolicited ignorable gripe>

You realize that Python can be a PITA, a real PITA, when it comes to 
non-Linux systems?

Just look at the fun we had with subprocess. It is included in Python 2.4, 
_works_ with Python 2.3, but _not_ Python with 2.2. If you now say "well, 
just upgrade!" then I SHOUT at you. There is _no_ excuse forcing users to 
upgrade when the failure is on _your_ part.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-23  2:32     ` Mark Levedahl
@ 2007-02-23  4:39       ` Junio C Hamano
  0 siblings, 0 replies; 32+ messages in thread
From: Junio C Hamano @ 2007-02-23  4:39 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: Johannes Schindelin, git

Mark Levedahl <mdl123@verizon.net> writes:

> I found the actual commit summary message (i.e., git-rev-list
> --pretty=one --max-count=1 sha1) the most useful of the various
> summaries available.

I would agree that would be the case, and Johannes's follow-up
seems to do that.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22  6:56   ` Junio C Hamano
  2007-02-22  7:08     ` Junio C Hamano
  2007-02-22 16:17     ` Johannes Schindelin
@ 2007-02-23  2:32     ` Mark Levedahl
  2007-02-23  4:39       ` Junio C Hamano
  2 siblings, 1 reply; 32+ messages in thread
From: Mark Levedahl @ 2007-02-23  2:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, git

Junio C Hamano wrote:
>> diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl
>> index a2d6268..f61c77a 100755
>> --- a/Documentation/cmd-list.perl
>> +++ b/Documentation/cmd-list.perl
>> @@ -70,6 +70,7 @@ git-archive                             mainporcelain
>>  git-bisect                              mainporcelain
>>  git-blame                               ancillaryinterrogators
>>  git-branch                              mainporcelain
>> +git-bundle                              mainporcelain
>>  git-cat-file                            plumbinginterrogators
>>  git-checkout-index                      plumbingmanipulators
>>  git-checkout                            mainporcelain
>>     
>
> Is this really a mainporcelain?
> I would say ancillarymanipulators (or perhaps synchelpers).
>
>   
git bundle has four commands: create, verify, list-heads, and unbundle. 
The last two are pure helper functions, basically plumbing. Verify is 
questionable as to where it lies. But, create is the only way to create 
a bundle, is logically equivalent to git push as a user command to move 
data,  so I called it mainporcelain because that is how git push is 
classified.


>> +	/* write prerequisites */
>> +	memcpy(argv_boundary + 2, argv + 1, argc * sizeof(const char *));
>> +	argv_boundary[0] = "rev-list";
>> +	argv_boundary[1] = "--boundary";
>> +	argv_boundary[argc + 1] = NULL;
>> +	out = -1;
>> +	pid = fork_with_pipe(argv_boundary, NULL, &out);
>> +	if (pid < 0)
>> +		return -1;
>> +	while ((i = read_string(out, buffer, sizeof(buffer))) > 0)
>> +		if (buffer[0] == '-')
>> +			write(bundle_fd, buffer, i);
>>     
>
> It would be helpful for the recipient if you can append output
> from git-describe (or name-rev) when the buffer lacks "name".
>   
I found the actual commit summary message (i.e., git-rev-list 
--pretty=one --max-count=1 sha1) the most useful of the various 
summaries available.



Mark

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22 20:05             ` Junio C Hamano
@ 2007-02-22 20:25               ` Johannes Schindelin
  0 siblings, 0 replies; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-22 20:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Mark Levedahl

Hi,

On Thu, 22 Feb 2007, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> >> The only worry I would have is if that exact number is too large
> >> that you cannot pass them sensibly in argv[].
> >
> > I thought there is only a limitation in bash?
> 
> I am sure kernel folks on the list would correct me, but my
> understanding is that is execve(2) limitation and you would get
> E2BIG.

Okay. So how about this (on top of my previous fixup):

-- snipsnap --
[PATCH] git-bundle: avoid fork() in verify_bundle()

We can use the revision walker easily for checking if the
prerequisites are met, instead of fork()ing off a rev-list,
which would list only the first unmet prerequisite.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin-bundle.c |   75 ++++++++++++++++++++++++++++++++++-------------------
 1 files changed, 48 insertions(+), 27 deletions(-)

diff --git a/builtin-bundle.c b/builtin-bundle.c
index 521bbda..4ac06f9 100644
--- a/builtin-bundle.c
+++ b/builtin-bundle.c
@@ -19,7 +19,7 @@ static const char bundle_signature[] = "# v2 git bundle\n";
 
 struct ref_list {
 	unsigned int nr, alloc;
-	struct {
+	struct ref_list_entry {
 		unsigned char sha1[20];
 		char *name;
 	} *list;
@@ -167,33 +167,54 @@ static int verify_bundle(struct bundle_header *header)
 	 * to be verbose about the errors
 	 */
 	struct ref_list *p = &header->prerequisites;
-	char **argv;
-	int pid, out, i, ret = 0;
-	char buffer[1024];
+	struct rev_info revs;
+	const char *argv[] = {NULL, "--all"};
+	struct object_array refs;
+	struct commit *commit;
+	int i, ret = 0, req_nr;
+	const char *message = "The bundle requires these lacking revs:";
 
-	argv = xmalloc((p->nr + 4) * sizeof(const char *));
-	argv[0] = "rev-list";
-	argv[1] = "--not";
-	argv[2] = "--all";
-	for (i = 0; i < p->nr; i++)
-		argv[i + 3] = xstrdup(sha1_to_hex(p->list[i].sha1));
-	argv[p->nr + 3] = NULL;
-	out = -1;
-	pid = fork_with_pipe((const char **)argv, NULL, &out);
-	if (pid < 0)
-		return error("Could not fork rev-list");
-	while (read_string(out, buffer, sizeof(buffer)) > 0)
-		; /* do nothing */
-	close(out);
-	for (i = 0; i < p->nr; i++)
-		free(argv[i + 3]);
-	free(argv);
-
-	while (waitpid(pid, &i, 0) < 0)
-		if (errno != EINTR)
-			return -1;
-	if (!ret && (!WIFEXITED(i) || WEXITSTATUS(i)))
-		return error("At least one prerequisite is lacking.");
+	init_revisions(&revs, NULL);
+	for (i = 0; i < p->nr; i++) {
+		struct ref_list_entry *e = p->list + i;
+		struct object *o = parse_object(e->sha1);
+		if (o) {
+			o->flags |= BOUNDARY_SHOW;
+			add_pending_object(&revs, o, e->name);
+			continue;
+		}
+		if (++ret == 1)
+			error(message);
+		error("%s %s", sha1_to_hex(e->sha1), e->name);
+	}
+	if (revs.pending.nr == 0)
+		return ret;
+	req_nr = revs.pending.nr;
+	setup_revisions(2, argv, &revs, NULL);
+
+	memset(&refs, 0, sizeof(struct object_array));
+	for (i = 0; i < revs.pending.nr; i++) {
+		struct object_array_entry *e = revs.pending.objects + i;
+		add_object_array(e->item, e->name, &refs);
+	}
+
+	prepare_revision_walk(&revs);
+
+	i = req_nr;
+	while (i && (commit = get_revision(&revs)))
+		if (commit->object.flags & BOUNDARY_SHOW)
+			i--;
+
+	for (i = 0; i < req_nr; i++)
+		if (!(refs.objects[i].item->flags & SHOWN)) {
+			if (++ret == 1)
+				error(message);
+			error("%s %s", sha1_to_hex(refs.objects[i].item->sha1),
+				refs.objects[i].name);
+		}
+
+	for (i = 0; i < refs.nr; i++)
+		clear_commit_marks((struct commit *)refs.objects[i].item, -1);
 
 	return ret;
 }
-- 
1.5.0.1.2214.gf22e-dirty

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22 19:16           ` Johannes Schindelin
@ 2007-02-22 20:05             ` Junio C Hamano
  2007-02-22 20:25               ` Johannes Schindelin
  0 siblings, 1 reply; 32+ messages in thread
From: Junio C Hamano @ 2007-02-22 20:05 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Mark Levedahl

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> The only worry I would have is if that exact number is too large
>> that you cannot pass them sensibly in argv[].
>
> I thought there is only a limitation in bash?

I am sure kernel folks on the list would correct me, but my
understanding is that is execve(2) limitation and you would get
E2BIG.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22 19:10         ` Junio C Hamano
@ 2007-02-22 19:16           ` Johannes Schindelin
  2007-02-22 20:05             ` Junio C Hamano
  0 siblings, 1 reply; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-22 19:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Mark Levedahl

Hi,

On Thu, 22 Feb 2007, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > Thinking about this deeper, I have to say I find my decision to use 
> > "--stdin" rather silly, given that I know the exact number of revisions, 
> > and their SHA1s.
> 
> The only worry I would have is if that exact number is too large
> that you cannot pass them sensibly in argv[].

I thought there is only a limitation in bash?

Well, anyway, I think that it makes sense writing just a little loop to 
find if we have the revs or not, to be able to warn about them properly.

I'll do that.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22 16:20       ` Johannes Schindelin
@ 2007-02-22 19:10         ` Junio C Hamano
  2007-02-22 19:16           ` Johannes Schindelin
  0 siblings, 1 reply; 32+ messages in thread
From: Junio C Hamano @ 2007-02-22 19:10 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Mark Levedahl

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Thinking about this deeper, I have to say I find my decision to use 
> "--stdin" rather silly, given that I know the exact number of revisions, 
> and their SHA1s.

The only worry I would have is if that exact number is too large
that you cannot pass them sensibly in argv[].

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22 17:12         ` Johannes Schindelin
@ 2007-02-22 17:21           ` Nicolas Pitre
  0 siblings, 0 replies; 32+ messages in thread
From: Nicolas Pitre @ 2007-02-22 17:21 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Mark Levedahl, Junio C Hamano

On Thu, 22 Feb 2007, Johannes Schindelin wrote:

> On Thu, 22 Feb 2007, Nicolas Pitre wrote:
> 
> > Could you please make the test, including the call to fstat(), 
> > conditional on !from_stdin instead?
> 
> How about using Simon's idea instead (subtracting input_len)?

Yes it would work.


Nicolas

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22 16:24       ` Nicolas Pitre
@ 2007-02-22 17:12         ` Johannes Schindelin
  2007-02-22 17:21           ` Nicolas Pitre
  0 siblings, 1 reply; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-22 17:12 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git, Mark Levedahl, Junio C Hamano

Hi,

On Thu, 22 Feb 2007, Nicolas Pitre wrote:

> On Thu, 22 Feb 2007, Johannes Schindelin wrote:
> 
> > Hi,
> > 
> > On Wed, 21 Feb 2007, Nicolas Pitre wrote:
> > 
> > > On Thu, 22 Feb 2007, Johannes Schindelin wrote:
> > > 
> > > > diff --git a/index-pack.c b/index-pack.c
> > > > index fa9a0e7..5ccf4c4 100644
> > > > --- a/index-pack.c
> > > > +++ b/index-pack.c
> > > > @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1)
> > > >  	/* If input_fd is a file, we should have reached its end now. */
> > > >  	if (fstat(input_fd, &st))
> > > >  		die("cannot fstat packfile: %s", strerror(errno));
> > > > -	if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
> > > > -		die("pack has junk at the end");
> > > > +	if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
> > > > +		die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, (int)st.st_size, (int)consumed_bytes, input_fd);
> > > >  
> > > >  	if (!nr_deltas)
> > > >  		return;
> > > 
> > > What is this supposed to mean?
> > 
> > The funny thing is, if you stream part of the bundle file to index-pack, 
> > S_ISREG(st.st_mode) is true, even if input_fd == 0.
> 
> Hmmmm. indeed..
> 
> Could you please make the test, including the call to fstat(), 
> conditional on !from_stdin instead?

How about using Simon's idea instead (subtracting input_len)?

> Also I don't see the point of displaying the mode since we know that 
> S_ISREG(st.st_mode) is true, and input_fd is of little interest as well.

As you probably guessed, these are debug remnants. Will fix.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22 16:46           ` Simon 'corecode' Schubert
@ 2007-02-22 17:09             ` Johannes Schindelin
  0 siblings, 0 replies; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-22 17:09 UTC (permalink / raw)
  To: Simon 'corecode' Schubert
  Cc: Nicolas Pitre, git, Mark Levedahl, Junio C Hamano

Hi,

On Thu, 22 Feb 2007, Simon 'corecode' Schubert wrote:

> Johannes Schindelin wrote:
> > 
> > On Thu, 22 Feb 2007, Simon 'corecode' Schubert wrote:
> > 
> > > Maybe something like this would be cleaner:
> > > 
> > > if (IS_REF(st.st_mode) && lseek(input_fd, 0, SEEK_CUR) != st.st_size)
> > > 	die("...");
> > 
> > The lseek would fail whenever the input is _not_ a file, dying. Since
> > index-pack is called from fetch-pack, with a socket instead of a file, it
> > would fail for the most common user.
> 
> that's why i put IS_REF (should read IS_REG) there.

Ah, yes, I understand.

> but as nicolas pointed out, this won't work for read-ahead.

Well, you could do this:

if (S_ISREG(st.st_mode) &&
		lseek(input_fd, 0, SEEK_CUR) - input_len != st.st_size)
	die("...");

Of course, this suggests that a file containing a pack must not have 
_anything_ after the pack. But AFAIR the reading part chokes on "garbage 
after pack" anyway.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22 16:37         ` Johannes Schindelin
@ 2007-02-22 16:46           ` Simon 'corecode' Schubert
  2007-02-22 17:09             ` Johannes Schindelin
  0 siblings, 1 reply; 32+ messages in thread
From: Simon 'corecode' Schubert @ 2007-02-22 16:46 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Nicolas Pitre, git, Mark Levedahl, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 2048 bytes --]

Johannes Schindelin wrote:
> Hi,
> 
> On Thu, 22 Feb 2007, Simon 'corecode' Schubert wrote:
> 
>> Johannes Schindelin wrote:
>>> Hi,
>>>
>>> On Wed, 21 Feb 2007, Nicolas Pitre wrote:
>>>
>>>> On Thu, 22 Feb 2007, Johannes Schindelin wrote:
>>>>
>>>>> diff --git a/index-pack.c b/index-pack.c
>>>>> index fa9a0e7..5ccf4c4 100644
>>>>> --- a/index-pack.c
>>>>> +++ b/index-pack.c
>>>>> @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1)
>>>>>  	/* If input_fd is a file, we should have reached its end now. */
>>>>>  	if (fstat(input_fd, &st))
>>>>>  		die("cannot fstat packfile: %s", strerror(errno));
>>>>> -	if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
>>>>> -		die("pack has junk at the end");
>>>>> +	if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
>>>>> +		die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode,
>>>>> (int)st.st_size, (int)consumed_bytes, input_fd);
>>>>>   	if (!nr_deltas)
>>>>>  		return;
>>>> What is this supposed to mean?
>>> The funny thing is, if you stream part of the bundle file to index-pack,
>>> S_ISREG(st.st_mode) is true, even if input_fd == 0.
>> Well, of course:  you opened a regular file and pass this as stdin to
>> index-pack.
>>
>> Maybe something like this would be cleaner:
>>
>> if (IS_REF(st.st_mode) && lseek(input_fd, 0, SEEK_CUR) != st.st_size)
>> 	die("...");
> 
> The lseek would fail whenever the input is _not_ a file, dying. Since 
> index-pack is called from fetch-pack, with a socket instead of a file, it 
> would fail for the most common user.

that's why i put IS_REF (should read IS_REG) there.

but as nicolas pointed out, this won't work for read-ahead.

cheers
  simon

-- 
Serve - BSD     +++  RENT this banner advert  +++    ASCII Ribbon   /"\
Work - Mac      +++  space for low €€€ NOW!1  +++      Campaign     \ /
Party Enjoy Relax   |   http://dragonflybsd.org      Against  HTML   \
Dude 2c 2 the max   !   http://golden-apple.biz       Mail + News   / \


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 252 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22 16:14       ` Simon 'corecode' Schubert
  2007-02-22 16:28         ` Nicolas Pitre
@ 2007-02-22 16:37         ` Johannes Schindelin
  2007-02-22 16:46           ` Simon 'corecode' Schubert
  1 sibling, 1 reply; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-22 16:37 UTC (permalink / raw)
  To: Simon 'corecode' Schubert
  Cc: Nicolas Pitre, git, Mark Levedahl, Junio C Hamano

Hi,

On Thu, 22 Feb 2007, Simon 'corecode' Schubert wrote:

> Johannes Schindelin wrote:
> > Hi,
> > 
> > On Wed, 21 Feb 2007, Nicolas Pitre wrote:
> > 
> > > On Thu, 22 Feb 2007, Johannes Schindelin wrote:
> > > 
> > > > diff --git a/index-pack.c b/index-pack.c
> > > > index fa9a0e7..5ccf4c4 100644
> > > > --- a/index-pack.c
> > > > +++ b/index-pack.c
> > > > @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1)
> > > >  	/* If input_fd is a file, we should have reached its end now. */
> > > >  	if (fstat(input_fd, &st))
> > > >  		die("cannot fstat packfile: %s", strerror(errno));
> > > > -	if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
> > > > -		die("pack has junk at the end");
> > > > +	if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
> > > > +		die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode,
> > > > (int)st.st_size, (int)consumed_bytes, input_fd);
> > > >   	if (!nr_deltas)
> > > >  		return;
> > > What is this supposed to mean?
> > 
> > The funny thing is, if you stream part of the bundle file to index-pack,
> > S_ISREG(st.st_mode) is true, even if input_fd == 0.
> 
> Well, of course:  you opened a regular file and pass this as stdin to
> index-pack.
> 
> Maybe something like this would be cleaner:
> 
> if (IS_REF(st.st_mode) && lseek(input_fd, 0, SEEK_CUR) != st.st_size)
> 	die("...");

The lseek would fail whenever the input is _not_ a file, dying. Since 
index-pack is called from fetch-pack, with a socket instead of a file, it 
would fail for the most common user.

My patch was only a minimal fixup to allow "--stdin" even starting in the 
middle of a file, and it does not break fetching.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22 16:14       ` Simon 'corecode' Schubert
@ 2007-02-22 16:28         ` Nicolas Pitre
  2007-02-22 16:37         ` Johannes Schindelin
  1 sibling, 0 replies; 32+ messages in thread
From: Nicolas Pitre @ 2007-02-22 16:28 UTC (permalink / raw)
  To: Simon 'corecode' Schubert
  Cc: Johannes Schindelin, git, Mark Levedahl, Junio C Hamano

On Thu, 22 Feb 2007, Simon 'corecode' Schubert wrote:

> Johannes Schindelin wrote:
> > Hi,
> > 
> > On Wed, 21 Feb 2007, Nicolas Pitre wrote:
> > 
> > > On Thu, 22 Feb 2007, Johannes Schindelin wrote:
> > > 
> > > > diff --git a/index-pack.c b/index-pack.c
> > > > index fa9a0e7..5ccf4c4 100644
> > > > --- a/index-pack.c
> > > > +++ b/index-pack.c
> > > > @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1)
> > > >  	/* If input_fd is a file, we should have reached its end now. */
> > > >  	if (fstat(input_fd, &st))
> > > >  		die("cannot fstat packfile: %s", strerror(errno));
> > > > -	if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
> > > > -		die("pack has junk at the end");
> > > > +	if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
> > > > +		die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode,
> > > > (int)st.st_size, (int)consumed_bytes, input_fd);
> > > >   	if (!nr_deltas)
> > > >  		return;
> > > What is this supposed to mean?
> > 
> > The funny thing is, if you stream part of the bundle file to index-pack,
> > S_ISREG(st.st_mode) is true, even if input_fd == 0.
> 
> Well, of course:  you opened a regular file and pass this as stdin to
> index-pack.
> 
> Maybe something like this would be cleaner:
> 
> if (IS_REF(st.st_mode) && lseek(input_fd, 0, SEEK_CUR) != st.st_size)

No that won't work because the input is buffered in the fill() function.


Nicolas

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22 15:55     ` Johannes Schindelin
  2007-02-22 16:14       ` Simon 'corecode' Schubert
@ 2007-02-22 16:24       ` Nicolas Pitre
  2007-02-22 17:12         ` Johannes Schindelin
  1 sibling, 1 reply; 32+ messages in thread
From: Nicolas Pitre @ 2007-02-22 16:24 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Mark Levedahl, Junio C Hamano

On Thu, 22 Feb 2007, Johannes Schindelin wrote:

> Hi,
> 
> On Wed, 21 Feb 2007, Nicolas Pitre wrote:
> 
> > On Thu, 22 Feb 2007, Johannes Schindelin wrote:
> > 
> > > diff --git a/index-pack.c b/index-pack.c
> > > index fa9a0e7..5ccf4c4 100644
> > > --- a/index-pack.c
> > > +++ b/index-pack.c
> > > @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1)
> > >  	/* If input_fd is a file, we should have reached its end now. */
> > >  	if (fstat(input_fd, &st))
> > >  		die("cannot fstat packfile: %s", strerror(errno));
> > > -	if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
> > > -		die("pack has junk at the end");
> > > +	if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
> > > +		die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, (int)st.st_size, (int)consumed_bytes, input_fd);
> > >  
> > >  	if (!nr_deltas)
> > >  		return;
> > 
> > What is this supposed to mean?
> 
> The funny thing is, if you stream part of the bundle file to index-pack, 
> S_ISREG(st.st_mode) is true, even if input_fd == 0.

Hmmmm. indeed..

Could you please make the test, including the call to fstat(), 
conditional on !from_stdin instead?

Also I don't see the point of displaying the mode since we know that 
S_ISREG(st.st_mode) is true, and input_fd is of little interest as well.


Nicolas

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22  7:08     ` Junio C Hamano
@ 2007-02-22 16:20       ` Johannes Schindelin
  2007-02-22 19:10         ` Junio C Hamano
  0 siblings, 1 reply; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-22 16:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Mark Levedahl

Hi,

On Wed, 21 Feb 2007, Junio C Hamano wrote:

> Junio C Hamano <junkio@cox.net> writes:
> 
> >> +static int verify_bundle(struct bundle_header *header)
> >> +{
> >> +	/*
> >> +	 * Do fast check, then if any prereqs are missing then go line by line
> >> +	 * to be verbose about the errors
> >> +	 */
> >> +	struct ref_list *p = &header->prerequisites;
> >> +	const char *argv[5] = {"rev-list", "--stdin", "--not", "--all", NULL};
> >> +	int pid, in, out, i, ret = 0;
> >> +	char buffer[1024];
> >> +
> >> +	in = out = -1;
> >> +	pid = fork_with_pipe(argv, &in, &out);
> >> +	if (pid < 0)
> >> +		return error("Could not fork rev-list");
> >> +	if (!fork()) {
> >> +		for (i = 0; i < p->nr; i++) {
> >> +			write(in, sha1_to_hex(p->list[i].sha1), 40);
> >> +			write(in, "\n", 1);
> >> +		}
> >> +		close(in);
> >> +		exit(0);
> >> +	}
> >> +	close(in);
> >
> > What if write() fails?  That can happen when one of the objects
> > you feed here, or its parent objects, is missing from your
> > repository -- receiving rev-list would die() and the writing
> > child would sigpipe.
> >
> > I also wonder who waits for this child...
> 
> In general, fork() to avoid bidirectional pipe deadlock is a
> good discipline, but in this particular case I think it would be
> easier to handle errors if you don't (and it would save another
> process).  The other side "rev-list --stdin --not --all" is
> running a limited traversal, and would not emit anything until
> you stop feeding it from --stdin, or until it dies because you
> fed it a commit that does not exist.  So as long as you check
> the error condition from write() for EPIPE to notice the other
> end died, I think you are Ok.

Thinking about this deeper, I have to say I find my decision to use 
"--stdin" rather silly, given that I know the exact number of revisions, 
and their SHA1s.

But it might make more sense to rewrite the checking part, instead of 
fork()ing it.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22  6:56   ` Junio C Hamano
  2007-02-22  7:08     ` Junio C Hamano
@ 2007-02-22 16:17     ` Johannes Schindelin
  2007-02-23  2:32     ` Mark Levedahl
  2 siblings, 0 replies; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-22 16:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Mark Levedahl

Hi,

On Wed, 21 Feb 2007, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > It was updated to make git-bundle a builtin, and get rid of the tar 
> > format: now, the first line is supposed to say "# v2 git bundle", the next 
> > lines either contain a prerequisite ("-" followed by the hash of the 
> > needed commit), or a ref (the hash of a commit, followed by the name of 
> > the ref), and finally the pack. As a result, the bundle argument can be 
> > "-" now.
> 
> That in BNF would be?
> 
> 	bundle = signature prereq* ref* pack

more like

	bundle = header pack
	header = signature prereq* ref* nl

> I suspect that we might want to possibly:
> 
>  - have checksum protection over the part before pack data?

Possibly. But would the sha1's not be sufficient? I.e. if the header is 
corrupted, chances are that the commits are no longer verified correctly.

>  - give descriptive name to prereq, so that an error message can
>    say "you need v1.4.4" instead of "you need e267c2f6"?

My idea was to allow --since=<date>. You don't have a name there.

>  - make the fields extensible without updating "v2"?

You mean, like, warn about unknown header lines? Yes, that's really easy.

> > diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl
> > index a2d6268..f61c77a 100755
> > --- a/Documentation/cmd-list.perl
> > +++ b/Documentation/cmd-list.perl
> > @@ -70,6 +70,7 @@ git-archive                             mainporcelain
> >  git-bisect                              mainporcelain
> >  git-blame                               ancillaryinterrogators
> >  git-branch                              mainporcelain
> > +git-bundle                              mainporcelain
> >  git-cat-file                            plumbinginterrogators
> >  git-checkout-index                      plumbingmanipulators
> >  git-checkout                            mainporcelain
> 
> Is this really a mainporcelain?

git bundle create <file>?

> > diff --git a/Documentation/git-bundle.txt b/Documentation/git-bundle.txt
> > new file mode 100644
> > index 0000000..4ea9e85
> > ...
> > +Author
> > +------
> > +Written by Mark Levedahl <mdl123@verizon.net>
> 
> ... and Dscho.

... not the Documentation (only patched a little to reflect the "--" 
change).

> > diff --git a/builtin-bundle.c b/builtin-bundle.c
> > new file mode 100644
> > index 0000000..4bd596a
> > --- /dev/null
> > +++ b/builtin-bundle.c
> > ...
> > +static const char *bundle_usage="git-bundle (--create <bundle> <git-rev-list args> | --verify <bundle> | --list-heads <bundle> [refname]... | --unbundle <bundle> [refname]... )";
> 
> I thought you removed "--" prefixes from subcommands.

Yes, and I forgot it there. Will fix.

> > +struct bundle_header {
> > +	struct ref_list prerequisites, references;
> > +};
> 
> (minor style) I find these two members each on its own line
> easier to read, as in:
> 
>         struct bundle_header {
>                 struct ref_list prerequisites;
>                 struct ref_list references;
>         };

Will fix.

> > +/* this function returns the length of the string */
> > +static int read_string(int fd, char *buffer, int size)
> > +{
> > +	int i;
> > +	for (i = 0; i < size - 1; i++) {
> > +		int count = read(fd, buffer + i, 1);
> > +		if (count < 0)
> > +			return error("Read error: %s", strerror(errno));
> > +		if (count == 0) {
> > +			i--;
> > +			break;
> > +		}
> > +		if (buffer[i] == '\n')
> > +			break;
> > +	}
> > +	buffer[i + 1] = '\0';
> > +	return i + 1;
> > +}
> 
> At least xread() please.  I know the reason you read one byte at
> a time is because you cannot over-read into pack area, but
> somehow this makes me go "hmmmmmm".  It's not performance
> critical so let's leave it that way.  Is bundle supposed to be
> streamable?  Otherwise we could probably seek back.

I wanted unbundle to be able to read from stdin (I did not yet allow "git 
fetch - master:master"). But I somehow forgot about xread(). D'oh. Will 
fix.

> > +/* returns an fd */
> > +static int read_header(const char *path, struct bundle_header *header) {
> > +	char buffer[1024];
> > +	int fd = open(path, O_RDONLY);
> > +
> > +	if (fd < 0)
> > +		return error("could not open '%s'", path);
> > +	if (read_string(fd, buffer, sizeof(buffer)) < 0 ||
> > +			strcmp(buffer, bundle_signature)) {
> > +		close(fd);
> > +		return error("'%s' does not look like a v2 bundle file", path);
> > +	}
> > +	while (read_string(fd, buffer, sizeof(buffer)) > 0
> > +			&& buffer[0] != '\n') {
> > +		int offset = buffer[0] == '-' ? 1 : 0;
> > +		int len = strlen(buffer);
> > +		unsigned char sha1[20];
> > +		struct ref_list *list = offset ? &header->prerequisites
> > +			: &header->references;
> > +		if (get_sha1_hex(buffer + offset, sha1)) {
> > +			close(fd);
> > +			return error("invalid SHA1: %s", buffer);
> > +		}
> > +		if (buffer[len - 1] == '\n')
> > +			buffer[len - 1] = '\0';
> > +		add_to_ref_list(sha1, buffer + 41 + offset, list);
> > +	}
> > +	return fd;
> > +}
> 
> Don't you want to look at and validate buffer[40+offset] in any
> way, other than what get_sha1_hex() does (which is issspace())?

Right. Will fix. (Although I will only write two "s" instead of three...)

> > +/* if in && *in >= 0, take that as input file descriptor instead */
> > +static int fork_with_pipe(const char **argv, int *in, int *out)
> > +{
> > +	int needs_in, needs_out;
> > +	int fdin[2], fdout[2], pid;
> > +
> > +	needs_in = in && *in < 0;
> > +	if (needs_in) {
> > +		if (pipe(fdin) < 0)
> > +			return error("could not setup pipe");
> > +		*in = fdin[1];
> > +	}
> > +
> > +	needs_out = out && *out < 0;
> > +	if (needs_out) {
> > +		if (pipe(fdout) < 0)
> > +			return error("could not setup pipe");
> > +		*out = fdout[0];
> > +	}
> > +
> > +	if ((pid = fork()) < 0) {
> > +		if (needs_in) {
> > +			close(fdin[0]);
> > +			close(fdin[1]);
> > +		}
> > +		if (needs_out) {
> > +			close(fdout[0]);
> > +			close(fdout[1]);
> > +		}
> > +		return error("could not fork");
> > +	}
> > +	if (!pid) {
> > +		if (needs_in) {
> > +			dup2(fdin[0], 0);
> > +			close(fdin[0]);
> > +			close(fdin[1]);
> > +		} else if (in)
> > +			dup2(*in, 0);
> > +		if (needs_out) {
> > +			dup2(fdout[1], 1);
> > +			close(fdout[0]);
> > +			close(fdout[1]);
> > +		} else if (out)
> > +			dup2(*out, 1);
> > +		exit(execv_git_cmd(argv));
> > +	}
> 
> Do you need to close *in and *out that are given by the caller
> after dup2() in the child?

I probably should. Will fix.

> > +static int verify_bundle(struct bundle_header *header)
> > +{
> > +	/*
> > +	 * Do fast check, then if any prereqs are missing then go line by line
> > +	 * to be verbose about the errors
> > +	 */
> > +	struct ref_list *p = &header->prerequisites;
> > +	const char *argv[5] = {"rev-list", "--stdin", "--not", "--all", NULL};
> > +	int pid, in, out, i, ret = 0;
> > +	char buffer[1024];
> > +
> > +	in = out = -1;
> > +	pid = fork_with_pipe(argv, &in, &out);
> > +	if (pid < 0)
> > +		return error("Could not fork rev-list");
> > +	if (!fork()) {
> > +		for (i = 0; i < p->nr; i++) {
> > +			write(in, sha1_to_hex(p->list[i].sha1), 40);
> > +			write(in, "\n", 1);
> > +		}
> > +		close(in);
> > +		exit(0);
> > +	}
> > +	close(in);
> 
> What if write() fails?  That can happen when one of the objects
> you feed here, or its parent objects, is missing from your
> repository -- receiving rev-list would die() and the writing
> child would sigpipe.

The error is caught afterwards, with WEXITSTATUS(), no? It might be 
cleaner to check, but is it really necessary?

> I also wonder who waits for this child...

rev-list. It only exits when that child closes the input.

> > +	while (read_string(out, buffer, sizeof(buffer)) > 0) {
> > +		if (ret++ == 0)
> > +			error ("The bundle requires the following commits you lack:");
> > +		fprintf(stderr, "%s", buffer);
> > +	}
> > +	close(out);
> 
> I do not think this error message would buy you anything.  If
> you say:
> 
> 	echo commit | rev-list --stdin --not --all
> 
> and you got commit back, that means you _do_ have that commit,
> and that commit reaches some ref you have (objects associated to
> that commit may be still missing).  If commit is truly missing,
> then you do not get commit output from rev-list as it cannot
> even determine if it is "--not --all" or not -- it would error
> out, which is the more important check to see if prereqs are
> lacking.

Yes, I realized that rev-list die()s, and does not output the missing 
revs -- as git-bundle.sh expected it to. When I found out about that, I 
failed to remove that bogus error. Will fix.

> > +	while (waitpid(pid, &i, 0) < 0)
> > +		if (errno != EINTR)
> > +			return -1;
> > +	if (!ret && (!WIFEXITED(i) || WEXITSTATUS(i)))
> > +		return error("At least one prerequisite is lacking.");
> 
> So you would want to keep this error, perhaps even make it more
> helpful by suggesting which ones are missing.

rev-list die()s with the first missing. It does not continue, printing 
other missing revs.

> > +static int list_heads(struct bundle_header *header, int argc, const char **argv)
> > +{
> > +	int i;
> > +	struct ref_list *r = &header->references;
> > +
> > +	for (i = 0; i < r->nr; i++) {
> > +		if (argc > 1) {
> > +			int j;
> > +			for (j = 1; j < argc; j++)
> > +				if (!strcmp(r->list[i].name, argv[j]))
> > +					break;
> > +			if (j == argc)
> > +				continue;
> > +		}
> > +		printf("%s %s\n", sha1_to_hex(r->list[i].sha1),
> > +				r->list[i].name);
> > +	}
> > +	return 0;
> > +}
> 
> I wonder if we want to have a way to list prereqs in similar way.

That is what verify should be for. Maybe add "-v" to the "verify" 
subcommand, make it properly builtin, and output non-missing prerequisites 
only with "-v"?

> > +static void show_commit(struct commit *commit)
> > +{
> > +	write(1, sha1_to_hex(commit->object.sha1), 40);
> > +	write(1, "\n", 1);
> > +	if (commit->parents) {
> > +		free_commit_list(commit->parents);
> > +		commit->parents = NULL;
> > +	}
> > +}
> 
> (everywhere) We would want write_in_full() with error handling
> that dies even on EPIPE.

Or write_or_die()? I mean, if write() fails, we cannot sanely continue 
anyway, right?

> > +static void show_object(struct object_array_entry *p)
> > +{
> > +	/* An object with name "foo\n0000000..." can be used to
> > +	 *          * confuse downstream git-pack-objects very badly.
> > +	 *                   */
> 
> An interesting way to wrap comments.

Bad, bad vi. Will fix.

> > +	/* write prerequisites */
> > +	memcpy(argv_boundary + 2, argv + 1, argc * sizeof(const char *));
> > +	argv_boundary[0] = "rev-list";
> > +	argv_boundary[1] = "--boundary";
> > +	argv_boundary[argc + 1] = NULL;
> > +	out = -1;
> > +	pid = fork_with_pipe(argv_boundary, NULL, &out);
> > +	if (pid < 0)
> > +		return -1;
> > +	while ((i = read_string(out, buffer, sizeof(buffer))) > 0)
> > +		if (buffer[0] == '-')
> > +			write(bundle_fd, buffer, i);
> 
> It would be helpful for the recipient if you can append output
> >from git-describe (or name-rev) when the buffer lacks "name".

Hmm. This could take a long time, as -describe is not really fast. ATM 
name-rev is not fast either, but I have plans to make it so. So obiously, 
I would prefer name-rev output. Comments?

> > +static int unbundle(struct bundle_header *header, int bundle_fd,
> > +		int argc, const char **argv)
> > +{
> > +	const char *argv_index_pack[] = {"index-pack", "--stdin", NULL};
> > +	int pid, status, dev_null;
> > +
> > +	if (verify_bundle(header))
> > +		return -1;
> > +	dev_null = open("/dev/null", O_WRONLY);
> > +	pid = fork_with_pipe(argv_index_pack, &bundle_fd, &dev_null);
> > +	if (pid < 0)
> > +		return error("Could not spawn index-pack");
> > +	close(bundle_fd);
> > +	while (waitpid(pid, &status, 0) < 0)
> > +		if (errno != EINTR)
> > +			return error("index-pack died");
> > +	if (!WIFEXITED(status) || WEXITSTATUS(status))
> > +		return error("index-pack exited with status %d",
> > +				WEXITSTATUS(status));
> > +	return list_heads(header, argc, argv);
> > +}
> 
> We might want to later use --keep for this as well...

Yes, later. It is such a low-hanging fruit that I'd prefer somebody else 
to get involved with the code.

> > diff --git a/index-pack.c b/index-pack.c
> > index fa9a0e7..5ccf4c4 100644
> > --- a/index-pack.c
> > +++ b/index-pack.c
> > @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1)
> >  	/* If input_fd is a file, we should have reached its end now. */
> >  	if (fstat(input_fd, &st))
> >  		die("cannot fstat packfile: %s", strerror(errno));
> > -	if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
> > -		die("pack has junk at the end");
> > +	if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
> > +		die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, (int)st.st_size, (int)consumed_bytes, input_fd);
> >  
> >  	if (!nr_deltas)
> >  		return;
> 
> ??

As I said in my reply to Nico, if input_fd is 0, but really comes from a 
file, this check will fail (the bundle is _strictly_ larger than the 
pack).

Would you like the fixes on top of the big commit, or a replacement patch?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22 15:55     ` Johannes Schindelin
@ 2007-02-22 16:14       ` Simon 'corecode' Schubert
  2007-02-22 16:28         ` Nicolas Pitre
  2007-02-22 16:37         ` Johannes Schindelin
  2007-02-22 16:24       ` Nicolas Pitre
  1 sibling, 2 replies; 32+ messages in thread
From: Simon 'corecode' Schubert @ 2007-02-22 16:14 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Nicolas Pitre, git, Mark Levedahl, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 1804 bytes --]

Johannes Schindelin wrote:
> Hi,
> 
> On Wed, 21 Feb 2007, Nicolas Pitre wrote:
> 
>> On Thu, 22 Feb 2007, Johannes Schindelin wrote:
>>
>>> diff --git a/index-pack.c b/index-pack.c
>>> index fa9a0e7..5ccf4c4 100644
>>> --- a/index-pack.c
>>> +++ b/index-pack.c
>>> @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1)
>>>  	/* If input_fd is a file, we should have reached its end now. */
>>>  	if (fstat(input_fd, &st))
>>>  		die("cannot fstat packfile: %s", strerror(errno));
>>> -	if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
>>> -		die("pack has junk at the end");
>>> +	if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
>>> +		die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, (int)st.st_size, (int)consumed_bytes, input_fd);
>>>  
>>>  	if (!nr_deltas)
>>>  		return;
>> What is this supposed to mean?
> 
> The funny thing is, if you stream part of the bundle file to index-pack, 
> S_ISREG(st.st_mode) is true, even if input_fd == 0.

Well, of course:  you opened a regular file and pass this as stdin to index-pack.

Maybe something like this would be cleaner:

if (IS_REF(st.st_mode) && lseek(input_fd, 0, SEEK_CUR) != st.st_size)
	die("...");

Because the point isn't actually that we consumed less bytes than the file's size, but that there is still data available after we are done with handling the pack.

Anyways, is this a reason to die(), or rather just a remark?

cheers
  simon

-- 
Serve - BSD     +++  RENT this banner advert  +++    ASCII Ribbon   /"\
Work - Mac      +++  space for low €€€ NOW!1  +++      Campaign     \ /
Party Enjoy Relax   |   http://dragonflybsd.org      Against  HTML   \
Dude 2c 2 the max   !   http://golden-apple.biz       Mail + News   / \


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 252 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22  3:28   ` Nicolas Pitre
@ 2007-02-22 15:55     ` Johannes Schindelin
  2007-02-22 16:14       ` Simon 'corecode' Schubert
  2007-02-22 16:24       ` Nicolas Pitre
  0 siblings, 2 replies; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-22 15:55 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git, Mark Levedahl, Junio C Hamano

Hi,

On Wed, 21 Feb 2007, Nicolas Pitre wrote:

> On Thu, 22 Feb 2007, Johannes Schindelin wrote:
> 
> > diff --git a/index-pack.c b/index-pack.c
> > index fa9a0e7..5ccf4c4 100644
> > --- a/index-pack.c
> > +++ b/index-pack.c
> > @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1)
> >  	/* If input_fd is a file, we should have reached its end now. */
> >  	if (fstat(input_fd, &st))
> >  		die("cannot fstat packfile: %s", strerror(errno));
> > -	if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
> > -		die("pack has junk at the end");
> > +	if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
> > +		die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, (int)st.st_size, (int)consumed_bytes, input_fd);
> >  
> >  	if (!nr_deltas)
> >  		return;
> 
> What is this supposed to mean?

The funny thing is, if you stream part of the bundle file to index-pack, 
S_ISREG(st.st_mode) is true, even if input_fd == 0.

Then, because it is only part of the bundle file, the size check fails.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22  0:59 ` Johannes Schindelin
  2007-02-22  3:28   ` Nicolas Pitre
  2007-02-22  6:56   ` Junio C Hamano
@ 2007-02-22  9:31   ` Johannes Sixt
  2 siblings, 0 replies; 32+ messages in thread
From: Johannes Sixt @ 2007-02-22  9:31 UTC (permalink / raw)
  To: git

Johannes Schindelin wrote:
> +/* if in && *in >= 0, take that as input file descriptor instead */
> +static int fork_with_pipe(const char **argv, int *in, int *out)
> +{
> +       int needs_in, needs_out;
> +       int fdin[2], fdout[2], pid;
> +
> +       needs_in = in && *in < 0;
> +       if (needs_in) {
> +               if (pipe(fdin) < 0)
> +                       return error("could not setup pipe");
> +               *in = fdin[1];
> +       }
> +
> +       needs_out = out && *out < 0;
> +       if (needs_out) {
> +               if (pipe(fdout) < 0)
> +                       return error("could not setup pipe");
> +               *out = fdout[0];
> +       }
> +
> +       if ((pid = fork()) < 0) {
> +               if (needs_in) {
> +                       close(fdin[0]);
> +                       close(fdin[1]);
> +               }
> +               if (needs_out) {
> +                       close(fdout[0]);
> +                       close(fdout[1]);
> +               }
> +               return error("could not fork");
> +       }
> +       if (!pid) {
> +               if (needs_in) {
> +                       dup2(fdin[0], 0);
> +                       close(fdin[0]);
> +                       close(fdin[1]);
> +               } else if (in)
> +                       dup2(*in, 0);
> +               if (needs_out) {
> +                       dup2(fdout[1], 1);
> +                       close(fdout[0]);
> +                       close(fdout[1]);
> +               } else if (out)
> +                       dup2(*out, 1);
> +               exit(execv_git_cmd(argv));
> +       }
> +       if (needs_in)
> +               close(fdin[0]);
> +       if (needs_out)
> +               close(fdout[1]);
> +       return pid;
> +}

This function looks very similar to spawnvppe, which I use in the MinGW
port for a number of fork/exec pairs. Do you see a chance to make this
into a helper that can be used in more places (so that it reduces the
differences to the MinGW code)?

> +       in = out = -1;
> +       pid = fork_with_pipe(argv, &in, &out);
> +       if (pid < 0)
> +               return error("Could not fork rev-list");
> +       if (!fork()) {
> +               for (i = 0; i < p->nr; i++) {
> +                       write(in, sha1_to_hex(p->list[i].sha1), 40);
> +                       write(in, "\n", 1);
> +               }
> +               close(in);
> +               exit(0);
> +       }
> +       close(in);

Oops, fork error check missing?

-- Hannes

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22  6:56   ` Junio C Hamano
@ 2007-02-22  7:08     ` Junio C Hamano
  2007-02-22 16:20       ` Johannes Schindelin
  2007-02-22 16:17     ` Johannes Schindelin
  2007-02-23  2:32     ` Mark Levedahl
  2 siblings, 1 reply; 32+ messages in thread
From: Junio C Hamano @ 2007-02-22  7:08 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Mark Levedahl, junkio

Junio C Hamano <junkio@cox.net> writes:

>> +static int verify_bundle(struct bundle_header *header)
>> +{
>> +	/*
>> +	 * Do fast check, then if any prereqs are missing then go line by line
>> +	 * to be verbose about the errors
>> +	 */
>> +	struct ref_list *p = &header->prerequisites;
>> +	const char *argv[5] = {"rev-list", "--stdin", "--not", "--all", NULL};
>> +	int pid, in, out, i, ret = 0;
>> +	char buffer[1024];
>> +
>> +	in = out = -1;
>> +	pid = fork_with_pipe(argv, &in, &out);
>> +	if (pid < 0)
>> +		return error("Could not fork rev-list");
>> +	if (!fork()) {
>> +		for (i = 0; i < p->nr; i++) {
>> +			write(in, sha1_to_hex(p->list[i].sha1), 40);
>> +			write(in, "\n", 1);
>> +		}
>> +		close(in);
>> +		exit(0);
>> +	}
>> +	close(in);
>
> What if write() fails?  That can happen when one of the objects
> you feed here, or its parent objects, is missing from your
> repository -- receiving rev-list would die() and the writing
> child would sigpipe.
>
> I also wonder who waits for this child...

In general, fork() to avoid bidirectional pipe deadlock is a
good discipline, but in this particular case I think it would be
easier to handle errors if you don't (and it would save another
process).  The other side "rev-list --stdin --not --all" is
running a limited traversal, and would not emit anything until
you stop feeding it from --stdin, or until it dies because you
fed it a commit that does not exist.  So as long as you check
the error condition from write() for EPIPE to notice the other
end died, I think you are Ok.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22  0:59 ` Johannes Schindelin
  2007-02-22  3:28   ` Nicolas Pitre
@ 2007-02-22  6:56   ` Junio C Hamano
  2007-02-22  7:08     ` Junio C Hamano
                       ` (2 more replies)
  2007-02-22  9:31   ` Johannes Sixt
  2 siblings, 3 replies; 32+ messages in thread
From: Junio C Hamano @ 2007-02-22  6:56 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Mark Levedahl, junkio

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> It was updated to make git-bundle a builtin, and get rid of the tar 
> format: now, the first line is supposed to say "# v2 git bundle", the next 
> lines either contain a prerequisite ("-" followed by the hash of the 
> needed commit), or a ref (the hash of a commit, followed by the name of 
> the ref), and finally the pack. As a result, the bundle argument can be 
> "-" now.

That in BNF would be?

	bundle = signature prereq* ref* pack
	signature = "# v2 git bundle\n"
        prereq = "-" sha1 " " "\n"
        ref = sha1 " " name "\n"
	sha1 = [0-9a-f]{40}
	name = <name of ref>
	pack = <output stream from "pack-objects --stdout">

I suspect that we might want to possibly:

 - have checksum protection over the part before pack data?

 - give descriptive name to prereq, so that an error message can
   say "you need v1.4.4" instead of "you need e267c2f6"?

 - make the fields extensible without updating "v2"?

> diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl
> index a2d6268..f61c77a 100755
> --- a/Documentation/cmd-list.perl
> +++ b/Documentation/cmd-list.perl
> @@ -70,6 +70,7 @@ git-archive                             mainporcelain
>  git-bisect                              mainporcelain
>  git-blame                               ancillaryinterrogators
>  git-branch                              mainporcelain
> +git-bundle                              mainporcelain
>  git-cat-file                            plumbinginterrogators
>  git-checkout-index                      plumbingmanipulators
>  git-checkout                            mainporcelain

Is this really a mainporcelain?
I would say ancillarymanipulators (or perhaps synchelpers).

> diff --git a/Documentation/git-bundle.txt b/Documentation/git-bundle.txt
> new file mode 100644
> index 0000000..4ea9e85
> ...
> +Author
> +------
> +Written by Mark Levedahl <mdl123@verizon.net>

... and Dscho.

> diff --git a/builtin-bundle.c b/builtin-bundle.c
> new file mode 100644
> index 0000000..4bd596a
> --- /dev/null
> +++ b/builtin-bundle.c
> ...
> +static const char *bundle_usage="git-bundle (--create <bundle> <git-rev-list args> | --verify <bundle> | --list-heads <bundle> [refname]... | --unbundle <bundle> [refname]... )";

I thought you removed "--" prefixes from subcommands.

> +struct bundle_header {
> +	struct ref_list prerequisites, references;
> +};

(minor style) I find these two members each on its own line
easier to read, as in:

        struct bundle_header {
                struct ref_list prerequisites;
                struct ref_list references;
        };

> +/* this function returns the length of the string */
> +static int read_string(int fd, char *buffer, int size)
> +{
> +	int i;
> +	for (i = 0; i < size - 1; i++) {
> +		int count = read(fd, buffer + i, 1);
> +		if (count < 0)
> +			return error("Read error: %s", strerror(errno));
> +		if (count == 0) {
> +			i--;
> +			break;
> +		}
> +		if (buffer[i] == '\n')
> +			break;
> +	}
> +	buffer[i + 1] = '\0';
> +	return i + 1;
> +}

At least xread() please.  I know the reason you read one byte at
a time is because you cannot over-read into pack area, but
somehow this makes me go "hmmmmmm".  It's not performance
critical so let's leave it that way.  Is bundle supposed to be
streamable?  Otherwise we could probably seek back.

> +/* returns an fd */
> +static int read_header(const char *path, struct bundle_header *header) {
> +	char buffer[1024];
> +	int fd = open(path, O_RDONLY);
> +
> +	if (fd < 0)
> +		return error("could not open '%s'", path);
> +	if (read_string(fd, buffer, sizeof(buffer)) < 0 ||
> +			strcmp(buffer, bundle_signature)) {
> +		close(fd);
> +		return error("'%s' does not look like a v2 bundle file", path);
> +	}
> +	while (read_string(fd, buffer, sizeof(buffer)) > 0
> +			&& buffer[0] != '\n') {
> +		int offset = buffer[0] == '-' ? 1 : 0;
> +		int len = strlen(buffer);
> +		unsigned char sha1[20];
> +		struct ref_list *list = offset ? &header->prerequisites
> +			: &header->references;
> +		if (get_sha1_hex(buffer + offset, sha1)) {
> +			close(fd);
> +			return error("invalid SHA1: %s", buffer);
> +		}
> +		if (buffer[len - 1] == '\n')
> +			buffer[len - 1] = '\0';
> +		add_to_ref_list(sha1, buffer + 41 + offset, list);
> +	}
> +	return fd;
> +}

Don't you want to look at and validate buffer[40+offset] in any
way, other than what get_sha1_hex() does (which is issspace())?

> +/* if in && *in >= 0, take that as input file descriptor instead */
> +static int fork_with_pipe(const char **argv, int *in, int *out)
> +{
> +	int needs_in, needs_out;
> +	int fdin[2], fdout[2], pid;
> +
> +	needs_in = in && *in < 0;
> +	if (needs_in) {
> +		if (pipe(fdin) < 0)
> +			return error("could not setup pipe");
> +		*in = fdin[1];
> +	}
> +
> +	needs_out = out && *out < 0;
> +	if (needs_out) {
> +		if (pipe(fdout) < 0)
> +			return error("could not setup pipe");
> +		*out = fdout[0];
> +	}
> +
> +	if ((pid = fork()) < 0) {
> +		if (needs_in) {
> +			close(fdin[0]);
> +			close(fdin[1]);
> +		}
> +		if (needs_out) {
> +			close(fdout[0]);
> +			close(fdout[1]);
> +		}
> +		return error("could not fork");
> +	}
> +	if (!pid) {
> +		if (needs_in) {
> +			dup2(fdin[0], 0);
> +			close(fdin[0]);
> +			close(fdin[1]);
> +		} else if (in)
> +			dup2(*in, 0);
> +		if (needs_out) {
> +			dup2(fdout[1], 1);
> +			close(fdout[0]);
> +			close(fdout[1]);
> +		} else if (out)
> +			dup2(*out, 1);
> +		exit(execv_git_cmd(argv));
> +	}

Do you need to close *in and *out that are given by the caller
after dup2() in the child?

> +static int verify_bundle(struct bundle_header *header)
> +{
> +	/*
> +	 * Do fast check, then if any prereqs are missing then go line by line
> +	 * to be verbose about the errors
> +	 */
> +	struct ref_list *p = &header->prerequisites;
> +	const char *argv[5] = {"rev-list", "--stdin", "--not", "--all", NULL};
> +	int pid, in, out, i, ret = 0;
> +	char buffer[1024];
> +
> +	in = out = -1;
> +	pid = fork_with_pipe(argv, &in, &out);
> +	if (pid < 0)
> +		return error("Could not fork rev-list");
> +	if (!fork()) {
> +		for (i = 0; i < p->nr; i++) {
> +			write(in, sha1_to_hex(p->list[i].sha1), 40);
> +			write(in, "\n", 1);
> +		}
> +		close(in);
> +		exit(0);
> +	}
> +	close(in);

What if write() fails?  That can happen when one of the objects
you feed here, or its parent objects, is missing from your
repository -- receiving rev-list would die() and the writing
child would sigpipe.

I also wonder who waits for this child...

> +	while (read_string(out, buffer, sizeof(buffer)) > 0) {
> +		if (ret++ == 0)
> +			error ("The bundle requires the following commits you lack:");
> +		fprintf(stderr, "%s", buffer);
> +	}
> +	close(out);

I do not think this error message would buy you anything.  If
you say:

	echo commit | rev-list --stdin --not --all

and you got commit back, that means you _do_ have that commit,
and that commit reaches some ref you have (objects associated to
that commit may be still missing).  If commit is truly missing,
then you do not get commit output from rev-list as it cannot
even determine if it is "--not --all" or not -- it would error
out, which is the more important check to see if prereqs are
lacking.

> +	while (waitpid(pid, &i, 0) < 0)
> +		if (errno != EINTR)
> +			return -1;
> +	if (!ret && (!WIFEXITED(i) || WEXITSTATUS(i)))
> +		return error("At least one prerequisite is lacking.");

So you would want to keep this error, perhaps even make it more
helpful by suggesting which ones are missing.

> +static int list_heads(struct bundle_header *header, int argc, const char **argv)
> +{
> +	int i;
> +	struct ref_list *r = &header->references;
> +
> +	for (i = 0; i < r->nr; i++) {
> +		if (argc > 1) {
> +			int j;
> +			for (j = 1; j < argc; j++)
> +				if (!strcmp(r->list[i].name, argv[j]))
> +					break;
> +			if (j == argc)
> +				continue;
> +		}
> +		printf("%s %s\n", sha1_to_hex(r->list[i].sha1),
> +				r->list[i].name);
> +	}
> +	return 0;
> +}

I wonder if we want to have a way to list prereqs in similar way.

> +static void show_commit(struct commit *commit)
> +{
> +	write(1, sha1_to_hex(commit->object.sha1), 40);
> +	write(1, "\n", 1);
> +	if (commit->parents) {
> +		free_commit_list(commit->parents);
> +		commit->parents = NULL;
> +	}
> +}

(everywhere) We would want write_in_full() with error handling
that dies even on EPIPE.

> +
> +static void show_object(struct object_array_entry *p)
> +{
> +	/* An object with name "foo\n0000000..." can be used to
> +	 *          * confuse downstream git-pack-objects very badly.
> +	 *                   */

An interesting way to wrap comments.

> +	/* write prerequisites */
> +	memcpy(argv_boundary + 2, argv + 1, argc * sizeof(const char *));
> +	argv_boundary[0] = "rev-list";
> +	argv_boundary[1] = "--boundary";
> +	argv_boundary[argc + 1] = NULL;
> +	out = -1;
> +	pid = fork_with_pipe(argv_boundary, NULL, &out);
> +	if (pid < 0)
> +		return -1;
> +	while ((i = read_string(out, buffer, sizeof(buffer))) > 0)
> +		if (buffer[0] == '-')
> +			write(bundle_fd, buffer, i);

It would be helpful for the recipient if you can append output
from git-describe (or name-rev) when the buffer lacks "name".

> +static int unbundle(struct bundle_header *header, int bundle_fd,
> +		int argc, const char **argv)
> +{
> +	const char *argv_index_pack[] = {"index-pack", "--stdin", NULL};
> +	int pid, status, dev_null;
> +
> +	if (verify_bundle(header))
> +		return -1;
> +	dev_null = open("/dev/null", O_WRONLY);
> +	pid = fork_with_pipe(argv_index_pack, &bundle_fd, &dev_null);
> +	if (pid < 0)
> +		return error("Could not spawn index-pack");
> +	close(bundle_fd);
> +	while (waitpid(pid, &status, 0) < 0)
> +		if (errno != EINTR)
> +			return error("index-pack died");
> +	if (!WIFEXITED(status) || WEXITSTATUS(status))
> +		return error("index-pack exited with status %d",
> +				WEXITSTATUS(status));
> +	return list_heads(header, argc, argv);
> +}

We might want to later use --keep for this as well...

> diff --git a/index-pack.c b/index-pack.c
> index fa9a0e7..5ccf4c4 100644
> --- a/index-pack.c
> +++ b/index-pack.c
> @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1)
>  	/* If input_fd is a file, we should have reached its end now. */
>  	if (fstat(input_fd, &st))
>  		die("cannot fstat packfile: %s", strerror(errno));
> -	if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
> -		die("pack has junk at the end");
> +	if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
> +		die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, (int)st.st_size, (int)consumed_bytes, input_fd);
>  
>  	if (!nr_deltas)
>  		return;

??

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive
  2007-02-22  0:59 ` Johannes Schindelin
@ 2007-02-22  3:28   ` Nicolas Pitre
  2007-02-22 15:55     ` Johannes Schindelin
  2007-02-22  6:56   ` Junio C Hamano
  2007-02-22  9:31   ` Johannes Sixt
  2 siblings, 1 reply; 32+ messages in thread
From: Nicolas Pitre @ 2007-02-22  3:28 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Mark Levedahl, Junio C Hamano

On Thu, 22 Feb 2007, Johannes Schindelin wrote:

> diff --git a/index-pack.c b/index-pack.c
> index fa9a0e7..5ccf4c4 100644
> --- a/index-pack.c
> +++ b/index-pack.c
> @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1)
>  	/* If input_fd is a file, we should have reached its end now. */
>  	if (fstat(input_fd, &st))
>  		die("cannot fstat packfile: %s", strerror(errno));
> -	if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
> -		die("pack has junk at the end");
> +	if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
> +		die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, (int)st.st_size, (int)consumed_bytes, input_fd);
>  
>  	if (!nr_deltas)
>  		return;

What is this supposed to mean?


Nicolas

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] Add git-bundle: move objects and references by archive
@ 2007-02-22  0:59 ` Johannes Schindelin
  2007-02-22  3:28   ` Nicolas Pitre
                     ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-22  0:59 UTC (permalink / raw)
  To: git; +Cc: Mark Levedahl, junkio


Some workflows require use of repositories on machines that cannot be 
connected, preventing use of git-fetch / git-push to transport objects and 
references between the repositories.

git-bundle provides an alternate transport mechanism, effectively allowing 
git-fetch and git-pull to operate using sneakernet transport. `git-bundle 
create` allows the user to create a bundle containing one or more branches 
or tags, but with specified basis assumed to exist on the target 
repository. At the receiving end, git-bundle acts like git-fetch-pack, 
allowing the user to invoke git-fetch or git-pull using the bundle file as 
the URL. git-fetch and git-ls-remote determine they have a bundle URL by 
checking that the URL points to a file, but are otherwise unchanged in 
operation with bundles.

The original patch was done by Mark Levedahl <mdl123@verizon.net>.

It was updated to make git-bundle a builtin, and get rid of the tar 
format: now, the first line is supposed to say "# v2 git bundle", the next 
lines either contain a prerequisite ("-" followed by the hash of the 
needed commit), or a ref (the hash of a commit, followed by the name of 
the ref), and finally the pack. As a result, the bundle argument can be 
"-" now.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---

	This patch should apply cleanly to "next".

	Mark, could you test on cygwin?

 .gitignore                   |    1 +
 Documentation/cmd-list.perl  |    1 +
 Documentation/git-bundle.txt |  139 +++++++++++++++
 Makefile                     |    1 +
 builtin-bundle.c             |  389 ++++++++++++++++++++++++++++++++++++++++++
 builtin.h                    |    1 +
 git-fetch.sh                 |    7 +
 git-ls-remote.sh             |    7 +-
 git.c                        |    1 +
 index-pack.c                 |    4 +-
 t/t5510-fetch.sh             |   28 +++-
 11 files changed, 575 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/git-bundle.txt
 create mode 100644 builtin-bundle.c

diff --git a/.gitignore b/.gitignore
index f15155d..4d3c66d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -13,6 +13,7 @@ git-archive
 git-bisect
 git-blame
 git-branch
+git-bundle
 git-cat-file
 git-check-ref-format
 git-checkout
diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl
index a2d6268..f61c77a 100755
--- a/Documentation/cmd-list.perl
+++ b/Documentation/cmd-list.perl
@@ -70,6 +70,7 @@ git-archive                             mainporcelain
 git-bisect                              mainporcelain
 git-blame                               ancillaryinterrogators
 git-branch                              mainporcelain
+git-bundle                              mainporcelain
 git-cat-file                            plumbinginterrogators
 git-checkout-index                      plumbingmanipulators
 git-checkout                            mainporcelain
diff --git a/Documentation/git-bundle.txt b/Documentation/git-bundle.txt
new file mode 100644
index 0000000..4ea9e85
--- /dev/null
+++ b/Documentation/git-bundle.txt
@@ -0,0 +1,139 @@
+git-bundle(1)
+=============
+
+NAME
+----
+git-bundle - Move objects and refs by archive
+
+
+SYNOPSIS
+--------
+'git-bundle' create <file> [git-rev-list args]
+'git-bundle' verify <file>
+'git-bundle' list-heads <file> [refname...]
+'git-bundle' unbundle <file> [refname...]
+
+DESCRIPTION
+-----------
+
+Some workflows require that one or more branches of development on one
+machine be replicated on another machine, but the two machines cannot
+be directly connected so the interactive git protocols (git, ssh,
+rsync, http) cannot be used.  This command provides suport for
+git-fetch and git-pull to operate by packaging objects and references
+in an archive at the originating machine, then importing those into
+another repository using gitlink:git-fetch[1] and gitlink:git-pull[1]
+after moving the archive by some means (i.e., by sneakernet).  As no
+direct connection between repositories exists, the user must specify a
+basis for the bundle that is held by the destination repository: the
+bundle assumes that all objects in the basis are already in the
+destination repository.
+
+OPTIONS
+-------
+
+create <file>::
+       Used to create a bundle named 'file'.  This requires the
+       git-rev-list arguments to define the bundle contents.
+
+verify <file>::
+       Used to check that a bundle file is valid and will apply
+       cleanly to the current repository.  This includes checks on the
+       bundle format itself as well as checking that the prerequisite
+       commits exist and are fully linked in the current repository.
+       git-bundle prints a list of missing commits, if any, and exits
+       with non-zero status.
+
+list-heads <file>::
+       Lists the references defined in the bundle.  If followed by a
+       list of references, only references matching those given are
+       printed out.
+
+unbundle <file>::
+       Passes the objects in the bundle to gitlink:git-index-pack[1]
+       for storage in the repository, then prints the names of all
+       defined references. If a reflist is given, only references
+       matching those in the given list are printed. This command is
+       really plumbing, intended to be called only by
+       gitlink:git-fetch[1].
+
+[git-rev-list-args...]::
+       A list of arguments, accepatble to git-rev-parse and
+       git-rev-list, that specify the specific objects and references
+       to transport.  For example, "master~10..master" causes the
+       current master reference to be packaged along with all objects
+       added since its 10th ancestor commit.  There is no explicit
+       limit to the number of references and objects that may be
+       packaged.
+
+
+[refname...]::
+       A list of references used to limit the references reported as
+       available. This is principally of use to git-fetch, which
+       expects to recieve only those references asked for and not
+       necessarily everything in the pack (in this case, git-bundle is
+       acting like gitlink:git-fetch-pack[1]).
+
+SPECIFYING REFERENCES
+---------------------
+
+git-bundle will only package references that are shown by
+git-show-ref: this includes heads, tags, and remote heads.  References
+such as master~1 cannot be packaged, but are perfectly suitable for
+defining the basis.  More than one reference may be packaged, and more
+than one basis can be specified.  The objects packaged are those not
+contained in the union of the given bases.  Each basis can be
+specified explicitly (e.g., ^master~10), or implicitly (e.g.,
+master~10..master, master --since=10.days.ago).
+
+It is very important that the basis used be held by the destination.
+It is ok to err on the side of conservatism, causing the bundle file
+to contain objects already in the destination as these are ignored
+when unpacking at the destination.
+
+EXAMPLE
+-------
+
+Assume two repositories exist as R1 on machine A, and R2 on machine B.
+For whatever reason, direct connection between A and B is not allowed,
+but we can move data from A to B via some mechanism (CD, email, etc).
+We want to update R2 with developments made on branch master in R1.
+We set a tag in R1 (lastR2bundle) after the previous such transport,
+and move it afterwards to help build the bundle.
+
+in R1 on A:
+$ git-bundle create mybundle master ^lastR2bundle
+$ git tag -f lastR2bundle master
+
+(move mybundle from A to B by some mechanism)
+
+in R2 on B:
+$ git-bundle verify mybundle
+$ git-fetch mybundle  refspec
+
+where refspec is refInBundle:localRef
+
+
+Also, with something like this in your config:
+
+[remote "bundle"]
+    url = /home/me/tmp/file.bdl
+    fetch = refs/heads/*:refs/remotes/origin/*
+
+You can first sneakernet the bundle file to ~/tmp/file.bdl and
+then these commands:
+
+$ git ls-remote bundle
+$ git fetch bundle
+$ git pull bundle
+
+would treat it as if it is talking with a remote side over the
+network.
+
+Author
+------
+Written by Mark Levedahl <mdl123@verizon.net>
+
+GIT
+---
+Part of the gitlink:git[7] suite
diff --git a/Makefile b/Makefile
index 56bb0b8..03a7f4a 100644
--- a/Makefile
+++ b/Makefile
@@ -276,6 +276,7 @@ BUILTIN_OBJS = \
 	builtin-archive.o \
 	builtin-blame.o \
 	builtin-branch.o \
+	builtin-bundle.o \
 	builtin-cat-file.o \
 	builtin-checkout-index.o \
 	builtin-check-ref-format.o \
diff --git a/builtin-bundle.c b/builtin-bundle.c
new file mode 100644
index 0000000..4bd596a
--- /dev/null
+++ b/builtin-bundle.c
@@ -0,0 +1,389 @@
+#include "cache.h"
+#include "object.h"
+#include "commit.h"
+#include "diff.h"
+#include "revision.h"
+#include "list-objects.h"
+#include "exec_cmd.h"
+
+/*
+ * Basic handler for bundle files to connect repositories via sneakernet.
+ * Invocation must include action.
+ * This function can create a bundle or provide information on an existing
+ * bundle supporting git-fetch, git-pull, and git-ls-remote
+ */
+
+static const char *bundle_usage="git-bundle (--create <bundle> <git-rev-list args> | --verify <bundle> | --list-heads <bundle> [refname]... | --unbundle <bundle> [refname]... )";
+
+static const char bundle_signature[] = "# v2 git bundle\n";
+
+struct ref_list {
+	unsigned int nr, alloc;
+	struct {
+		unsigned char sha1[20];
+		char *name;
+	} *list;
+};
+
+static void add_to_ref_list(const unsigned char *sha1, const char *name,
+		struct ref_list *list)
+{
+	if (list->nr + 1 >= list->alloc) {
+		list->alloc = alloc_nr(list->nr + 1);
+		list->list = xrealloc(list->list,
+				list->alloc * sizeof(list->list[0]));
+	}
+	memcpy(list->list[list->nr].sha1, sha1, 20);
+	list->list[list->nr].name = xstrdup(name);
+	list->nr++;
+}
+
+struct bundle_header {
+	struct ref_list prerequisites, references;
+};
+
+/* this function returns the length of the string */
+static int read_string(int fd, char *buffer, int size)
+{
+	int i;
+	for (i = 0; i < size - 1; i++) {
+		int count = read(fd, buffer + i, 1);
+		if (count < 0)
+			return error("Read error: %s", strerror(errno));
+		if (count == 0) {
+			i--;
+			break;
+		}
+		if (buffer[i] == '\n')
+			break;
+	}
+	buffer[i + 1] = '\0';
+	return i + 1;
+}
+
+/* returns an fd */
+static int read_header(const char *path, struct bundle_header *header) {
+	char buffer[1024];
+	int fd = open(path, O_RDONLY);
+
+	if (fd < 0)
+		return error("could not open '%s'", path);
+	if (read_string(fd, buffer, sizeof(buffer)) < 0 ||
+			strcmp(buffer, bundle_signature)) {
+		close(fd);
+		return error("'%s' does not look like a v2 bundle file", path);
+	}
+	while (read_string(fd, buffer, sizeof(buffer)) > 0
+			&& buffer[0] != '\n') {
+		int offset = buffer[0] == '-' ? 1 : 0;
+		int len = strlen(buffer);
+		unsigned char sha1[20];
+		struct ref_list *list = offset ? &header->prerequisites
+			: &header->references;
+		if (get_sha1_hex(buffer + offset, sha1)) {
+			close(fd);
+			return error("invalid SHA1: %s", buffer);
+		}
+		if (buffer[len - 1] == '\n')
+			buffer[len - 1] = '\0';
+		add_to_ref_list(sha1, buffer + 41 + offset, list);
+	}
+	return fd;
+}
+
+/* if in && *in >= 0, take that as input file descriptor instead */
+static int fork_with_pipe(const char **argv, int *in, int *out)
+{
+	int needs_in, needs_out;
+	int fdin[2], fdout[2], pid;
+
+	needs_in = in && *in < 0;
+	if (needs_in) {
+		if (pipe(fdin) < 0)
+			return error("could not setup pipe");
+		*in = fdin[1];
+	}
+
+	needs_out = out && *out < 0;
+	if (needs_out) {
+		if (pipe(fdout) < 0)
+			return error("could not setup pipe");
+		*out = fdout[0];
+	}
+
+	if ((pid = fork()) < 0) {
+		if (needs_in) {
+			close(fdin[0]);
+			close(fdin[1]);
+		}
+		if (needs_out) {
+			close(fdout[0]);
+			close(fdout[1]);
+		}
+		return error("could not fork");
+	}
+	if (!pid) {
+		if (needs_in) {
+			dup2(fdin[0], 0);
+			close(fdin[0]);
+			close(fdin[1]);
+		} else if (in)
+			dup2(*in, 0);
+		if (needs_out) {
+			dup2(fdout[1], 1);
+			close(fdout[0]);
+			close(fdout[1]);
+		} else if (out)
+			dup2(*out, 1);
+		exit(execv_git_cmd(argv));
+	}
+	if (needs_in)
+		close(fdin[0]);
+	if (needs_out)
+		close(fdout[1]);
+	return pid;
+}
+
+static int verify_bundle(struct bundle_header *header)
+{
+	/*
+	 * Do fast check, then if any prereqs are missing then go line by line
+	 * to be verbose about the errors
+	 */
+	struct ref_list *p = &header->prerequisites;
+	const char *argv[5] = {"rev-list", "--stdin", "--not", "--all", NULL};
+	int pid, in, out, i, ret = 0;
+	char buffer[1024];
+
+	in = out = -1;
+	pid = fork_with_pipe(argv, &in, &out);
+	if (pid < 0)
+		return error("Could not fork rev-list");
+	if (!fork()) {
+		for (i = 0; i < p->nr; i++) {
+			write(in, sha1_to_hex(p->list[i].sha1), 40);
+			write(in, "\n", 1);
+		}
+		close(in);
+		exit(0);
+	}
+	close(in);
+	while (read_string(out, buffer, sizeof(buffer)) > 0) {
+		if (ret++ == 0)
+			error ("The bundle requires the following commits you lack:");
+		fprintf(stderr, "%s", buffer);
+	}
+	close(out);
+	while (waitpid(pid, &i, 0) < 0)
+		if (errno != EINTR)
+			return -1;
+	if (!ret && (!WIFEXITED(i) || WEXITSTATUS(i)))
+		return error("At least one prerequisite is lacking.");
+
+	return ret;
+}
+
+static int list_heads(struct bundle_header *header, int argc, const char **argv)
+{
+	int i;
+	struct ref_list *r = &header->references;
+
+	for (i = 0; i < r->nr; i++) {
+		if (argc > 1) {
+			int j;
+			for (j = 1; j < argc; j++)
+				if (!strcmp(r->list[i].name, argv[j]))
+					break;
+			if (j == argc)
+				continue;
+		}
+		printf("%s %s\n", sha1_to_hex(r->list[i].sha1),
+				r->list[i].name);
+	}
+	return 0;
+}
+
+static void show_commit(struct commit *commit)
+{
+	write(1, sha1_to_hex(commit->object.sha1), 40);
+	write(1, "\n", 1);
+	if (commit->parents) {
+		free_commit_list(commit->parents);
+		commit->parents = NULL;
+	}
+}
+
+static void show_object(struct object_array_entry *p)
+{
+	/* An object with name "foo\n0000000..." can be used to
+	 *          * confuse downstream git-pack-objects very badly.
+	 *                   */
+	const char *ep = strchr(p->name, '\n');
+	int len = ep ? ep - p->name : strlen(p->name);
+	write(1, sha1_to_hex(p->item->sha1), 40);
+	write(1, " ", 1);
+	if (len)
+		write(1, p->name, len);
+	write(1, "\n", 1);
+}
+
+static int create_bundle(struct bundle_header *header, const char *path,
+		int argc, const char **argv)
+{
+	int bundle_fd = -1;
+	const char **argv_boundary = xmalloc((argc + 3) * sizeof(const char *));
+	const char **argv_pack = xmalloc(4 * sizeof(const char *));
+	int pid, in, out, i, status;
+	char buffer[1024];
+	struct rev_info revs;
+
+	bundle_fd = (!strcmp(path, "-") ? 1 :
+			open(path, O_CREAT | O_WRONLY, 0666));
+	if (bundle_fd < 0)
+		return error("Could not write to '%s'", path);
+
+	/* write signature */
+	write(bundle_fd, bundle_signature, strlen(bundle_signature));
+
+	/* write prerequisites */
+	memcpy(argv_boundary + 2, argv + 1, argc * sizeof(const char *));
+	argv_boundary[0] = "rev-list";
+	argv_boundary[1] = "--boundary";
+	argv_boundary[argc + 1] = NULL;
+	out = -1;
+	pid = fork_with_pipe(argv_boundary, NULL, &out);
+	if (pid < 0)
+		return -1;
+	while ((i = read_string(out, buffer, sizeof(buffer))) > 0)
+		if (buffer[0] == '-')
+			write(bundle_fd, buffer, i);
+	while ((i = waitpid(pid, &status, 0)) < 0)
+		if (errno != EINTR)
+			return error("rev-list died");
+	if (!WIFEXITED(status) || WEXITSTATUS(status))
+		return error("rev-list died %d", WEXITSTATUS(status));
+
+	/* write references */
+	save_commit_buffer = 0;
+	init_revisions(&revs, NULL);
+	revs.tag_objects = 1;
+	revs.tree_objects = 1;
+	revs.blob_objects = 1;
+	argc = setup_revisions(argc, argv, &revs, NULL);
+	if (argc > 1)
+		return error("unrecognized argument: %s'", argv[1]);
+	for (i = 0; i < revs.pending.nr; i++) {
+		struct object_array_entry *e = revs.pending.objects + i;
+		if (!(e->item->flags & UNINTERESTING)) {
+			unsigned char sha1[20];
+			char *ref;
+			if (dwim_ref(e->name, strlen(e->name), sha1, &ref) != 1)
+				continue;
+			write(bundle_fd, sha1_to_hex(e->item->sha1), 40);
+			write(bundle_fd, " ", 1);
+			write(bundle_fd, ref, strlen(ref));
+			write(bundle_fd, "\n", 1);
+			free(ref);
+		}
+	}
+
+	/* end header */
+	write(bundle_fd, "\n", 1);
+
+	/* write pack */
+	argv_pack[0] = "pack-objects";
+	argv_pack[1] = "--all-progress";
+	argv_pack[2] = "--stdout";
+	argv_pack[3] = NULL;
+	in = -1;
+	out = bundle_fd;
+	pid = fork_with_pipe(argv_pack, &in, &out);
+	if (pid < 0)
+		return error("Could not spawn pack-objects");
+	close(1);
+	close(bundle_fd);
+	dup2(in, 1);
+	close(in);
+	prepare_revision_walk(&revs);
+	traverse_commit_list(&revs, show_commit, show_object);
+	close(1);
+	while (waitpid(pid, &status, 0) < 0)
+		if (errno != EINTR)
+			return -1;
+	if (!WIFEXITED(status) || WEXITSTATUS(status))
+		return error ("pack-objects died");
+	return 0;
+}
+
+static int unbundle(struct bundle_header *header, int bundle_fd,
+		int argc, const char **argv)
+{
+	const char *argv_index_pack[] = {"index-pack", "--stdin", NULL};
+	int pid, status, dev_null;
+
+	if (verify_bundle(header))
+		return -1;
+	dev_null = open("/dev/null", O_WRONLY);
+	pid = fork_with_pipe(argv_index_pack, &bundle_fd, &dev_null);
+	if (pid < 0)
+		return error("Could not spawn index-pack");
+	close(bundle_fd);
+	while (waitpid(pid, &status, 0) < 0)
+		if (errno != EINTR)
+			return error("index-pack died");
+	if (!WIFEXITED(status) || WEXITSTATUS(status))
+		return error("index-pack exited with status %d",
+				WEXITSTATUS(status));
+	return list_heads(header, argc, argv);
+}
+
+int cmd_bundle(int argc, const char **argv, const char *prefix)
+{
+	struct bundle_header header;
+	int nongit = 0;
+	const char *cmd, *bundle_file;
+	int bundle_fd = -1;
+	char buffer[PATH_MAX];
+
+	if (argc < 3)
+		usage(bundle_usage);
+
+	cmd = argv[1];
+	bundle_file = argv[2];
+	argc -= 2;
+	argv += 2;
+
+	prefix = setup_git_directory_gently(&nongit);
+	if (prefix && bundle_file[0] != '/') {
+		snprintf(buffer, sizeof(buffer), "%s/%s", prefix, bundle_file);
+		bundle_file = buffer;
+	}
+
+	memset(&header, 0, sizeof(header));
+	if (strcmp(cmd, "create") &&
+			!(bundle_fd = read_header(bundle_file, &header)))
+		return 1;
+
+	if (!strcmp(cmd, "verify")) {
+		close(bundle_fd);
+		if (verify_bundle(&header))
+			return 1;
+		fprintf(stderr, "%s is okay\n", bundle_file);
+		return 0;
+	}
+	if (!strcmp(cmd, "list-heads")) {
+		close(bundle_fd);
+		return !!list_heads(&header, argc, argv);
+	}
+	if (!strcmp(cmd, "create")) {
+		if (nongit)
+			die("Need a repository to create a bundle.");
+		return !!create_bundle(&header, bundle_file, argc, argv);
+	} else if (!strcmp(cmd, "unbundle")) {
+		if (nongit)
+			die("Need a repository to unbundle.");
+		return !!unbundle(&header, bundle_fd, argc, argv);
+	} else
+		usage(bundle_usage);
+}
+
diff --git a/builtin.h b/builtin.h
index 57e8741..528074b 100644
--- a/builtin.h
+++ b/builtin.h
@@ -19,6 +19,7 @@ extern int cmd_apply(int argc, const char **argv, const char *prefix);
 extern int cmd_archive(int argc, const char **argv, const char *prefix);
 extern int cmd_blame(int argc, const char **argv, const char *prefix);
 extern int cmd_branch(int argc, const char **argv, const char *prefix);
+extern int cmd_bundle(int argc, const char **argv, const char *prefix);
 extern int cmd_cat_file(int argc, const char **argv, const char *prefix);
 extern int cmd_checkout_index(int argc, const char **argv, const char *prefix);
 extern int cmd_check_ref_format(int argc, const char **argv, const char *prefix);
diff --git a/git-fetch.sh b/git-fetch.sh
index 5d3fec0..5ae0d28 100755
--- a/git-fetch.sh
+++ b/git-fetch.sh
@@ -388,9 +388,16 @@ fetch_main () {
     ( : subshell because we muck with IFS
       IFS=" 	$LF"
       (
+	if test -f "$remote" ; then
+	    test -n "$shallow_depth" &&
+		die "shallow clone with bundle is not supported"
+	    git-bundle unbundle "$remote" $rref ||
+	    echo failed "$remote"
+	else
 	  git-fetch-pack --thin $exec $keep $shallow_depth $no_progress \
 		"$remote" $rref ||
 	  echo failed "$remote"
+	fi
       ) |
       (
 	trap '
diff --git a/git-ls-remote.sh b/git-ls-remote.sh
index 8ea5c5e..a6ed99a 100755
--- a/git-ls-remote.sh
+++ b/git-ls-remote.sh
@@ -89,8 +89,13 @@ rsync://* )
 	;;
 
 * )
-	git-peek-remote $exec "$peek_repo" ||
+	if test -f "$peek_repo" ; then
+		git bundle list-heads "$peek_repo" ||
 		echo "failed	slurping"
+	else
+		git-peek-remote $exec "$peek_repo" ||
+		echo "failed	slurping"
+	fi
 	;;
 esac |
 sort -t '	' -k 2 |
diff --git a/git.c b/git.c
index 83f3d90..a184d47 100644
--- a/git.c
+++ b/git.c
@@ -229,6 +229,7 @@ static void handle_internal_command(int argc, const char **argv, char **envp)
 		{ "archive", cmd_archive },
 		{ "blame", cmd_blame, RUN_SETUP },
 		{ "branch", cmd_branch, RUN_SETUP },
+		{ "bundle", cmd_bundle },
 		{ "cat-file", cmd_cat_file, RUN_SETUP },
 		{ "checkout-index", cmd_checkout_index, RUN_SETUP },
 		{ "check-ref-format", cmd_check_ref_format },
diff --git a/index-pack.c b/index-pack.c
index fa9a0e7..5ccf4c4 100644
--- a/index-pack.c
+++ b/index-pack.c
@@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1)
 	/* If input_fd is a file, we should have reached its end now. */
 	if (fstat(input_fd, &st))
 		die("cannot fstat packfile: %s", strerror(errno));
-	if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
-		die("pack has junk at the end");
+	if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
+		die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, (int)st.st_size, (int)consumed_bytes, input_fd);
 
 	if (!nr_deltas)
 		return;
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 50c6485..fa76662 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -35,7 +35,9 @@ test_expect_success "clone and setup child repos" '
 		echo "URL: ../two/.git/"
 		echo "Pull: refs/heads/master:refs/heads/two"
 		echo "Pull: refs/heads/one:refs/heads/one"
-	} >.git/remotes/two
+	} >.git/remotes/two &&
+	cd .. &&
+	git clone . bundle
 '
 
 test_expect_success "fetch test" '
@@ -81,4 +83,28 @@ test_expect_success 'fetch following tags' '
 
 '
 
+test_expect_success 'create bundle 1' '
+	cd "$D" &&
+	echo >file updated again by origin &&
+	git commit -a -m "tip" &&
+	git bundle create bundle1 master^..master
+'
+
+test_expect_success 'create bundle 2' '
+	cd "$D" &&
+	git bundle create bundle2 master~2..master
+'
+
+test_expect_failure 'unbundle 1' '
+	cd "$D/bundle" &&
+	git checkout -b some-branch &&
+	git fetch "$D/bundle1" master:master
+'
+
+test_expect_success 'unbundle 2' '
+	cd "$D/bundle" &&
+	git fetch ../bundle2 master:master &&
+	test "tip" = "$(git log -1 --pretty=oneline master | cut -b42-)"
+'
+
 test_done
-- 
1.5.0.1.616.gc6c4-dirty

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH] Add git-bundle: move objects and references by archive.
  2007-02-18 17:27 ` [PATCH] Add git-bundle: move objects and references by archive Mark Levedahl
@ 2007-02-18 22:47   ` Mark Levedahl
  0 siblings, 0 replies; 32+ messages in thread
From: Mark Levedahl @ 2007-02-18 22:47 UTC (permalink / raw)
  To: junkio; +Cc: git

Mark Levedahl wrote:
> -	git-peek-remote $exec "$peek_repo" ||
> +	if test -r "$peek_repo" ; then
>   
oops:                   test    -f
actually works. Corrected patch follows.

Mark

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] Add git-bundle: move objects and references by archive.
  2007-02-17 18:41 [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer Junio C Hamano
@ 2007-02-18 17:27 ` Mark Levedahl
  2007-02-18 22:47   ` Mark Levedahl
  0 siblings, 1 reply; 32+ messages in thread
From: Mark Levedahl @ 2007-02-18 17:27 UTC (permalink / raw)
  To: junkio, mdl123; +Cc: git, Mark Levedahl

Some workflows require use of repositories on machines that cannot be
connected, preventing use of git-fetch / git-push to transport objects
and references between the repositories.

git-bundle provides an alternate transport mechanism, effectively allowing
git-fetch and git-pull to operate using sneakernet transport. git-bundle
--create allows the user to create a bundle containing one or more branches
or tags, but with specified basis assumed to exist on the target repository.
At the receiving end, git-bundle acts like git-fetch-pack, allowing the
user to invoke git-fetch or git-pull using the bundle file as the URL.
git-fetch and git-ls-remote determine they have a bundle URL by checking
that the URL points to a file, but are otherwise unchanged in operation
with bundles.

Signed-off-by: Mark Levedahl <mdl123@verizon.net>
---
 Documentation/cmd-list.perl  |    1 +
 Documentation/git-bundle.txt |  146 +++++++++++++++++++++++++++++++++++
 Makefile                     |    3 +-
 git-bundle.sh                |  174 ++++++++++++++++++++++++++++++++++++++++++
 git-fetch.sh                 |   11 ++-
 git-ls-remote.sh             |    7 ++-
 6 files changed, 338 insertions(+), 4 deletions(-)
 mode change 100755 => 100644 Documentation/cmd-list.perl
 create mode 100644 Documentation/git-bundle.txt
 create mode 100755 git-bundle.sh

diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl
old mode 100755
new mode 100644
index a2d6268..f61c77a
--- a/Documentation/cmd-list.perl
+++ b/Documentation/cmd-list.perl
@@ -70,6 +70,7 @@ git-archive                             mainporcelain
 git-bisect                              mainporcelain
 git-blame                               ancillaryinterrogators
 git-branch                              mainporcelain
+git-bundle                              mainporcelain
 git-cat-file                            plumbinginterrogators
 git-checkout-index                      plumbingmanipulators
 git-checkout                            mainporcelain
diff --git a/Documentation/git-bundle.txt b/Documentation/git-bundle.txt
new file mode 100644
index 0000000..27db785
--- /dev/null
+++ b/Documentation/git-bundle.txt
@@ -0,0 +1,146 @@
+git-bundle(1)
+=============
+
+NAME
+----
+git-bundle - Move objects and refs by archive
+
+
+SYNOPSIS
+--------
+'git-bundle' --create file <git-rev-list args> <--tar tarspec>
+'git-bundle' --verify file  <--tar tarspec>
+'git-bundle' --list-heads file <reflist> <--tar tarspec>
+'git-bundle' --unbundle file <reflist> <--tar tarspec>
+
+DESCRIPTION
+-----------
+
+Some workflows require that one or more branches of development on one
+machine be replicated on another machine, but the two machines cannot
+be directly connected so the interactive git protocols (git, ssh,
+rsync, http) cannot be used.  This command provides suport for
+git-fetch and git-pull to operate by packaging objects and references
+in an archive at the originating machine, then importing those into
+another repository using gitlink:git-fetch[1] and gitlink:git-pull[1]
+after moving the archive by some means (i.e., by sneakernet).  As no
+direct connection between repositories exists, the user must specify a
+basis for the bundle that is held by the destination repository: the
+bundle assumes that all objects in the basis are already in the
+destination repository.
+
+OPTIONS
+-------
+
+--create file::
+       Used to create a bundle named 'file'.  This requires the
+       git-rev-list arguments to define the bundle contents.
+
+--verify file::
+       Used to check that a bundle file is valid and will apply
+       cleanly to the current repository.  This includes checks on the
+       bundle format itself as well as checking that the prerequisite
+       commits exist and are fully linked in the current repository.
+       git-bundle prints a list of missing commits, if any, and exits
+       with non-zero status.
+
+--list-heads file::
+       Lists the references defined in the bundle.  If followed by a
+       list of references, only references matching those given are
+       printed out.
+
+--unbundle file::
+       Passes the objects in the bundle to gitlink:git-index-pack[1]
+       for storage in the repository, then prints the names of all
+       defined references. If a reflist is given, only references
+       matching those in the given list are printed. This command is
+       really plumbing, intended to be called only by
+       gitlink:git-fetch[1].
+
+git-rev-list args::
+       A list of arguments, accepatble to git-rev-parse and
+       git-rev-list, that specify the specific objects and references
+       to transport.  For example, "master~10..master" causes the
+       current master reference to be packaged along with all objects
+       added since its 10th ancestor commit.  There is no explicit
+       limit to the number of references and objects that may be
+       packaged.
+
+
+reflist::
+       A list of references used to limit the references reported as
+       available. This is principally of use to git-fetch, which
+       expects to recieve only those references asked for and not
+       necessarily everything in the pack (in this case, git-bundle is
+       acting like gitlink:git-fetch-pack[1]).
+
+tar tarspec::
+
+       git-bundle uses tar, and requires a tar supporting c, f, and
+       -O. By default, git-bundle looks for gtar on the path, then for
+       tar if gtar is not found. This can be overridden by explicitly
+       defining tarspec, or by defining TAR in the environment.
+
+SPECIFYING REFERENCES
+---------------------
+
+git-bundle will only package references that are shown by
+git-show-ref: this includes heads, tags, and remote heads.  References
+such as master~1 cannot be packaged, but are perfectly suitable for
+defining the basis.  More than one reference may be packaged, and more
+than one basis can be specified.  The objects packaged are those not
+contained in the union of the given bases.  Each basis can be
+specified explicitly (e.g., ^master~10), or implicitly (e.g.,
+master~10..master, master --since=10.days.ago).
+
+It is very important that the basis used be held by the destination.
+It is ok to err on the side of conservatism, causing the bundle file
+to contain objects already in the destination as these are ignored
+when unpacking at the destination.
+
+EXAMPLE
+-------
+
+Assume two repositories exist as R1 on machine A, and R2 on machine B.
+For whatever reason, direct connection between A and B is not allowed,
+but we can move data from A to B via some mechanism (CD, email, etc).
+We want to update R2 with developments made on branch master in R1.
+We set a tag in R1 (lastR2bundle) after the previous such transport,
+and move it afterwards to help build the bundle.
+
+in R1 on A:
+$ git-bundle --create mybundle master ^lastR2bundle
+$ git tag -f lastR2bundle master
+
+(move mybundle from A to B by some mechanism)
+
+in R2 on B:
+$ git-bundle --verify mybundle
+$ git-fetch mybundle  refspec
+
+where refspec is refInBundle:localRef
+
+
+Also, with something like this in your config:
+
+[remote "bundle"]
+    url = /home/me/tmp/file.bdl
+    fetch = refs/heads/*:refs/remotes/origin/*
+
+You can first sneakernet the bundle file to ~/tmp/file.bdl and
+then these commands:
+
+$ git ls-remote bundle
+$ git fetch bundle
+$ git pull bundle
+
+would treat it as if it is talking with a remote side over the
+network.
+
+Author
+------
+Written by Mark Levedahl <mdl123@verizon.net>
+
+GIT
+---
+Part of the gitlink:git[7] suite
diff --git a/Makefile b/Makefile
index ebecbbd..c6d540e 100644
--- a/Makefile
+++ b/Makefile
@@ -177,7 +177,8 @@ SCRIPT_SH = \
 	git-applymbox.sh git-applypatch.sh git-am.sh \
 	git-merge.sh git-merge-stupid.sh git-merge-octopus.sh \
 	git-merge-resolve.sh git-merge-ours.sh \
-	git-lost-found.sh git-quiltimport.sh
+	git-lost-found.sh git-quiltimport.sh \
+	git-bundle.sh
 
 SCRIPT_PERL = \
 	git-add--interactive.perl \
diff --git a/git-bundle.sh b/git-bundle.sh
new file mode 100755
index 0000000..2bb9232
--- /dev/null
+++ b/git-bundle.sh
@@ -0,0 +1,174 @@
+#!/bin/sh
+# Basic handler for bundle files to connect repositories via sneakernet.
+# Invocation must include action.
+# This function can create a bundle or provide information on an existing bundle
+# supporting git-fetch, git-pull, and git-ls-remote
+
+USAGE='[--create bundle <git-rev-list args>] |
+[--verify|--list-heads|--unbundle bundle] <--tar tarspec>
+     where bundle is the name of the bundle file.'
+
+verify() {
+    # Check bundle version
+    test -r "$bfile" || die "cannot find $bfile"
+    test "$($TAR -xf ""$bfile"" -O version)" = "v1 git-bundle" ||
+        die "$bfile doesn't look like a v1 bundle file."
+
+	# do fast check, then if any prereqs are missing then go line by line
+	# to be verbose about the errors
+    prereqs=$($TAR xf "$bfile" -O prerequisites)
+	test -z "$prereqs" && return 0
+	bad=$(echo "$prereqs" | cut -b-40 | git-rev-list --stdin --not --all 2>&1)
+    if test -n "$bad" ; then
+		test "$1" = "--silent" && return 1
+        echo "error: $bfile requires the following commits you lack:"
+		echo "$prereqs" |
+		while read sha1 comment ; do
+            missing=$(git-rev-list $sha1 --not --all 2>&1)
+            test -n "$missing" && echo "$sha1 $comment"
+		done
+        exit 1
+    fi
+    return 0
+}
+
+# list all or just a subset
+list_heads() {
+	if test -z "$*" ; then
+		$TAR -xf "$bfile" -O references 2>/dev/null || exit 1
+	else
+		($TAR -xf "$bfile" -O references 2>/dev/null || exit 1) |
+		while read sha1 ref ; do
+			for arg in $* ; do
+				if test "${ref%$arg}" != "$ref" ; then
+					echo "$sha1 $ref"
+					break
+				fi
+			done
+		done
+	fi
+}
+
+SUBDIRECTORY_OK=1
+. git-sh-setup
+
+args=
+action=
+while test -n "$1" ; do
+    case $1 in
+        --create|--list-heads|--unbundle|--verify)
+            action=${1#--}
+            shift
+            bfile=$1
+            test -z "$bfile" && die "$action requires filename";;
+        --tar=*)
+            TAR=${1##--tar=};;
+        --tar)
+			shift
+            TAR=$1;;
+        *)
+            args="$args $1";;
+    esac
+    shift
+done
+test -z "$action" && die "No action given, what should I do?"
+
+# what tar to use, prefer gtar, then tar.
+if test -z "$TAR" ; then
+    GTAR=$(which gtar 2>/dev/null)
+    TAR=${GTAR:-tar}
+fi
+
+case $action in
+create)
+    unknown=$(git-rev-parse --no-revs $args)
+    test -z "$unknown" || die "unknown option: $unknown"
+    gitrevargs=$(git-rev-parse --symbolic --revs-only $args) || exit 1
+
+    # find the refs to carry along and get sha1s for each.
+    refs=
+    fullrevargs=
+    for arg in $gitrevargs ; do
+        #ignore options and basis refs, get full ref name for things
+		# we will transport rejecting anything ambiguous (e.g., user
+		# gives master, have heads/master and remotes/origin/master, we
+		# keep the former).
+        case "$arg" in
+            -* | ^*) fullrevargs="$fullrevargs $arg";;
+            *)  ref=$(git-show-ref "$arg")
+                test "$(echo $ref | wc -w)" = "2" || die "Ambigous reference: $arg
+$ref"
+                fullrevargs="$fullrevargs ${ref#* }"
+                refs="$refs $ref";;
+        esac
+    done
+    test -z "$refs" && die "No references specified, I don't know what to bundle."
+
+    # git-rev-list cannot determine edge objects if a date restriction is
+    # given...  we do things a slow way if max-age or min-age are given
+    case "$fullrevargs" in
+        *--max-age* | *--min-age*)
+            # get a list of all commits that will be packed along with
+            # parents of each.  A fixed git-rev-list --boundary should
+            # replace all of this.
+            echo "Finding prerequisites and commits to bundle..."
+            commits=$(git-rev-list $fullrevargs)
+
+            # get immediate parents of each commit to include
+            parents=
+            for c in $commits ; do
+                parents="$parents $(git-rev-list --parents --max-count=1 $c | cut -b42-)"
+            done
+            parents=$(printf "%s\n" $parents | sort | uniq)
+
+            # factor out what will be in this bundle, the remainder are the
+            # bundle's prerequisites.  double up commits in this as we only
+            # want things that are only in parents to appear once
+            prereqs=$(printf "%s\n" $parents $commits $commits | \
+                sort | uniq -c | sed -ne 's/^ *1 //p');;
+        *)
+            prereqs=$(git-rev-list --objects-edge $fullrevargs | sed -ne 's/^-//p');;
+    esac
+
+	# replace prereqs with annotated version of same
+
+    # create refs and pack
+    tmp="$GIT_DIR/bundle_tmp$$"
+    prerequisites="$tmp/prerequisites"
+    references="$tmp/references"
+    version="$tmp/version"
+    pack="$tmp/pack"
+    trap 'rm -rf "$tmp"' 0 1 2 3 15
+
+    mkdir "$tmp" &&
+    echo "v1 git-bundle" > "$version" &&
+	touch "$prerequisites" &&
+	(for p in $prereqs ; do
+		git-rev-list --pretty=one --max-count=1 $p
+	done) > "$prerequisites" &&
+    git-show-ref $refs > "$references" &&
+    git-rev-list --objects $fullrevargs | cut -b-40 |
+        git pack-objects --all-progress --stdout > "$pack" &&
+
+    # create the tar file, clean up
+    cd "$tmp" &&
+    tar cf bundle prerequisites references version pack &&
+    cd - &&
+    mv "$tmp/bundle" "$bfile" &&
+    rm -rf "$tmp"
+
+    # done
+    echo "Created $bfile";;
+
+verify)
+    verify && echo "$bfile is ok";;
+
+list-heads)
+    list_heads $args;;
+
+unbundle)
+    verify --silent || exit 1
+    $TAR -xf "$bfile" -O pack | git-index-pack --stdin ||
+        die "error: $bfile has a corrupted pack file"
+    list_heads $args;;
+esac
diff --git a/git-fetch.sh b/git-fetch.sh
index ca984e7..4cb4009 100755
--- a/git-fetch.sh
+++ b/git-fetch.sh
@@ -377,8 +377,15 @@ fetch_main () {
     ( : subshell because we muck with IFS
       IFS=" 	$LF"
       (
-	  git-fetch-pack --thin $exec $keep $shallow_depth "$remote" $rref ||
-	  echo failed "$remote"
+	if test -r "$remote" ; then
+	    test -n "$shallow_depth" &&
+		die "shallow clone with bundle is not supported"
+	    git-bundle --unbundle "$remote" $rref ||
+	    echo failed "$remote"
+	else
+	    git-fetch-pack --thin $exec $keep $shallow_depth "$remote" $rref ||
+	    echo failed "$remote"
+	fi
       ) |
       (
 	trap '
diff --git a/git-ls-remote.sh b/git-ls-remote.sh
index 8ea5c5e..73b745d 100755
--- a/git-ls-remote.sh
+++ b/git-ls-remote.sh
@@ -89,8 +89,13 @@ rsync://* )
 	;;
 
 * )
-	git-peek-remote $exec "$peek_repo" ||
+	if test -r "$peek_repo" ; then
+		git bundle --list-heads "$peek_repo" ||
 		echo "failed	slurping"
+	else
+		git-peek-remote $exec "$peek_repo" ||
+		echo "failed	slurping"
+	fi
 	;;
 esac |
 sort -t '	' -k 2 |
-- 
1.5.0.rc4.375.gd0938-dirty

^ permalink raw reply related	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2007-02-23  4:39 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-02-18 22:47 [PATCH] Add git-bundle: move objects and references by archive Mark Levedahl
2007-02-19  1:07 ` Johannes Schindelin
2007-02-19  1:50   ` Junio C Hamano
2007-02-19  2:02     ` Johannes Schindelin
2007-02-19  7:56   ` Shawn O. Pearce
2007-02-19 13:29   ` Mark Levedahl
2007-02-19 15:03     ` Johannes Schindelin
2007-02-19  1:49 ` Junio C Hamano
     [not found] <Pine.LNX.4.63.0702220157130.22628@wbgn013.biozentrum.uni-wuerz burg.de>
2007-02-22  0:59 ` Johannes Schindelin
2007-02-22  3:28   ` Nicolas Pitre
2007-02-22 15:55     ` Johannes Schindelin
2007-02-22 16:14       ` Simon 'corecode' Schubert
2007-02-22 16:28         ` Nicolas Pitre
2007-02-22 16:37         ` Johannes Schindelin
2007-02-22 16:46           ` Simon 'corecode' Schubert
2007-02-22 17:09             ` Johannes Schindelin
2007-02-22 16:24       ` Nicolas Pitre
2007-02-22 17:12         ` Johannes Schindelin
2007-02-22 17:21           ` Nicolas Pitre
2007-02-22  6:56   ` Junio C Hamano
2007-02-22  7:08     ` Junio C Hamano
2007-02-22 16:20       ` Johannes Schindelin
2007-02-22 19:10         ` Junio C Hamano
2007-02-22 19:16           ` Johannes Schindelin
2007-02-22 20:05             ` Junio C Hamano
2007-02-22 20:25               ` Johannes Schindelin
2007-02-22 16:17     ` Johannes Schindelin
2007-02-23  2:32     ` Mark Levedahl
2007-02-23  4:39       ` Junio C Hamano
2007-02-22  9:31   ` Johannes Sixt
  -- strict thread matches above, loose matches on Subject: below --
2007-02-17 18:41 [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer Junio C Hamano
2007-02-18 17:27 ` [PATCH] Add git-bundle: move objects and references by archive Mark Levedahl
2007-02-18 22:47   ` Mark Levedahl

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.