* Scripts to use "bundles" for moving data between repositories
@ 2007-02-14 14:10 Mark Levedahl
  2007-02-14 14:10 ` [PATCH] git-bundle - bundle objects and references for disconnected transfer Mark Levedahl
                   ` (3 more replies)
  0 siblings, 4 replies; 32+ messages in thread
From: Mark Levedahl @ 2007-02-14 14:10 UTC (permalink / raw)
  To: git

I am working on a project using git where we have many repositories on machines
that can never be directly connected, but which need to have the same objects
and development history.  Existing git protocols offer limited support: we can
either a) publish and apply patch files branch by branch, or b) copy an entire
repository from one machine to another and then do local push or fetch.  While
both are workable, neither is a completely satisfactory solution, so I wrote the
attached scripts that support a "bundle" transfer mechanism.  A bundle is a zip
archive having two files: a list of references as given by git-show-ref and a
pack file of objects from git-pack-objects.  git-bundle creates the bundle,
git-unbundle unpacks and applies at the receiving end.  The means of
transporting the bundle file between the machines is arbitrary (sneaker net,
email, etc all can work).

This transfer protocol leaves it to the user to ensure that the objects in the 
bundle are sufficient to update the target machine.  This is a direct 
consequence of the prohibition on direct communication between the machines.  
The approach supported here is to use normal git-rev-list format to specify what 
to include, e.g.  master~10..master, or ^master pu next, etc.  Having too many 
objects in the pack file is fine: git-unpack-objects at the receiving end 
happily ignores things not needed. git-unbundle normally checks that the updated 
references are fast-forward (--force to override), and that all required objects 
exist (--shallow to override). This latter option supports a disconnected 
shallow clone.
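
The fast-forward test mentioned above can be sketched as follows. This is
only an illustration, assuming a current git where commands are spelled
"git <cmd>"; the helper name is_fast_forward is hypothetical and is not part
of the posted scripts. An update is a fast-forward when the merge base of the
local commit and the bundled commit is the local commit itself.

```shell
#!/bin/sh
# is_fast_forward <local> <new>  (hypothetical helper)
# Succeeds when <new> descends from <local>, using the same merge-base
# comparison that git-unbundle performs before updating a ref.
is_fast_forward() {
	local_sha=$(git rev-parse --verify "$1^0") || return 1
	mb=$(git merge-base "$local_sha" "$2") || return 1
	[ "$mb" = "$local_sha" ]
}

# Possible use inside a repository:
#   is_fast_forward refs/heads/master "$sha1" || echo "not a fast-forward"
```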

I offer this for inclusion in the main distribution; comments and suggestions
for improvement are welcome regardless. The scripts are working for me today
and I find them very useful.

Mark Levedahl

* [PATCH] git-bundle - bundle objects and references for disconnected transfer.
  2007-02-14 14:10 Scripts to use "bundles" for moving data between repositories Mark Levedahl
@ 2007-02-14 14:10 ` Mark Levedahl
  2007-02-14 14:10   ` [PATCH] git-unbundle - unbundle " Mark Levedahl
                     ` (2 more replies)
  2007-02-14 14:13 ` Scripts to use "bundles" for moving data between repositories Matthieu Moy
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 32+ messages in thread
From: Mark Levedahl @ 2007-02-14 14:10 UTC (permalink / raw)
  To: git; +Cc: Mark Levedahl

Some workflows require coordinated development between repositories on
machines that can never be connected. This utility creates a bundle
containing a pack of objects and associated references (heads or tags)
that can be independently transferred to another machine, effectively
supporting git-push like operations between disconnected systems.

Signed-off-by: Mark Levedahl <mdl123@verizon.net>
---
 git-bundle |   85 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 85 insertions(+), 0 deletions(-)
 create mode 100755 git-bundle

diff --git a/git-bundle b/git-bundle
new file mode 100755
index 0000000..1341885
--- /dev/null
+++ b/git-bundle
@@ -0,0 +1,85 @@
+#!/bin/sh
+# Create a bundle to carry from one git repo to another (e.g., "sneaker-net" based push)
+# git-bundle <git-rev-list args>
+# git-bundle --bare <git-rev-list args>
+# creates bundle.zip in current directory (can rename of course)
+#
+# The bundle includes all refs given (--all selects every ref in the repo)
+# and all of the commit objects needed subject to the list given.
+#
+# Objects to be packed are limited by specifying one or more of
+#   ^commit-ish    - indicates commits already at the target (can have more than one ^commit-ish)
+#   --since=xxx    - Assume target repo has all relevant commits earlier than xxx
+
+USAGE='git-bundle <options> <git-rev-list arguments>
+
+Creates a bundle of objects to be carried to a disconnected repository, bringing the target
+repository''s definition of one or more references up to date as selected by the
+<git-rev-list arguments>
+
+Options:
+    -h, --help        Print this help screen
+        --bare        Work in a bare repository
+    --output=f        Output to file f (default is bundle.zip)
+
+    examples
+        git-bundle master~10..master
+        git-bundle master next pu ^master~30 --output=mybundle.zip --bare'
+
+die() {
+	echo >&2 "$@"
+	exit 1
+}
+
+# pull out rev-list args vs program args, parse the latter
+gitrevargs=$(git-rev-parse --symbolic --revs-only $*) || exit 1
+myargs=$(git-rev-parse --no-revs $*) || exit 1
+
+bfile=bundle.zip
+for arg in $myargs ; do
+	case "$arg" in
+		--bare)
+			export GIT_DIR=.;;
+		-h|--h|--he|--hel|--help)
+			echo "$USAGE"
+			exit;;
+		--output=*)
+			bfile=${arg##--output=};;
+		-*)
+			die "unknown option: $arg";;
+		*)
+	esac
+done
+
+GIT_DIR=$(git-rev-parse --git-dir) || die "Not in a git directory"
+
+# find the refs to carry along and get sha1s for each.
+refs=
+for arg in $gitrevargs ; do
+	#ignore options and basis refs
+	case "$arg" in
+		^*) ;;
+		-*) ;;
+		*)
+			n=$(git-show-ref "$arg" | wc -l)
+			[ $n -eq 1 ] || die "ambiguous reference: $arg"
+			refs="$refs $arg"
+			;;
+	esac
+done
+[ -z "$refs" ] && die "No references specified, I don't know what to bundle."
+
+# put the refs into the bundle file
+[ -e "$bfile" ] && rm -f "$bfile" 2>/dev/null
+git-show-ref $refs > .gitBundleReferences
+zip -m "$bfile" .gitBundleReferences
+
+# add the pack file
+(git-rev-list --objects $gitrevargs | \
+	cut -b -40 | \
+	git pack-objects --all-progress --progress --stdout >.gitBundlePack) \
+	|| { rm -f "$bfile" ; exit 1 ; }
+zip -m "$bfile" .gitBundlePack
+
+# done
+echo "Created $bfile"
-- 
1.5.0.rc3.24.g0c5e

* [PATCH] git-unbundle - unbundle objects and references for disconnected transfer.
  2007-02-14 14:10 ` [PATCH] git-bundle - bundle objects and references for disconnected transfer Mark Levedahl
@ 2007-02-14 14:10   ` Mark Levedahl
  2007-02-14 14:10     ` [PATCH] Create a man page for git-bundle Mark Levedahl
  2007-02-14 19:45     ` [PATCH] git-unbundle - unbundle objects and references for disconnected transfer Shawn O. Pearce
  2007-02-14 19:42   ` [PATCH] git-bundle - bundle " Shawn O. Pearce
  2007-02-14 21:58   ` Johannes Schindelin
  2 siblings, 2 replies; 32+ messages in thread
From: Mark Levedahl @ 2007-02-14 14:10 UTC (permalink / raw)
  To: git; +Cc: Mark Levedahl

Some workflows require coordinated development between repositories on
machines that can never be connected. This utility unpacks a bundle
containing objects and associated references (heads or tags) into the
current repository, effectively supporting git-push like operations
between disconnected systems.

Signed-off-by: Mark Levedahl <mdl123@verizon.net>
---
 git-unbundle |   75 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 75 insertions(+), 0 deletions(-)
 create mode 100755 git-unbundle

diff --git a/git-unbundle b/git-unbundle
new file mode 100755
index 0000000..5ea4ae6
--- /dev/null
+++ b/git-unbundle
@@ -0,0 +1,75 @@
+#!/bin/sh
+# unpack a git-bundle file into current repository
+#
+# See git-bundle.
+
+die() {
+	echo >&2 "$@"
+	exit 1
+}
+
+bfile=bundle.zip
+force=
+shallow=
+while case "$#" in 0) break ;; esac
+do
+	case "$1" in
+	--bare)
+		export GIT_DIR=.;;
+	-f|--f|--fo|--for|--forc|--force)
+		force=1;;
+	-h|--h|--he|--hel|--help)
+		echo "usage: git-unbundle [--bare] [-f|--force] [--shallow] [bundle (default is bundle.zip)]"
+		exit;;
+	-s|--s|--sh|--sha|--shal|--shall|--shallo|--shallow)
+		shallow=1;;
+	-*)
+		die "unknown option: $1";;
+	*)
+		bfile="$1";;
+	esac
+	shift
+done
+
+[ -e "$bfile" ] || die "cannot find $bfile"
+GIT_DIR=$(git-rev-parse --git-dir) || die "Not in a git directory"
+
+# get the objects
+unzip -p "$bfile" .gitBundlePack | git-unpack-objects
+
+# check each reference, assure that the result would be valid before updating local ref
+unzip -p "$bfile" .gitBundleReferences | while read sha1 ref ; do
+	if [ -z "$shallow" ] ; then
+		result=$(git fsck $sha1)
+		havemissing=$(echo "$result" | grep '^missing')
+	else
+		# accept a shallow transfer
+		havemissing=
+	fi
+	ok=
+	if [ ! -z "$havemissing" ] ; then
+		echo "Not updating: $ref to $sha1"
+		echo "Bundle does not contain all required objects. (possibly partial) errors:"
+		echo "$result" | head
+	elif [ -z "$force" ] ; then
+		# update only if fast-forward
+		local=$(git-rev-parse --verify "$ref^0" 2>/dev/null)
+		if [ ! -z "$local" ] ; then
+			mb=$(git-merge-base $local $sha1)
+			if [ "$mb" != "$local" ] ; then
+				echo "Not applying non-fast forward update: $ref"
+			else
+				ok=1
+			fi
+		else
+			ok=1
+		fi
+	else
+		#forced, accept non-fast forward update
+		ok=1
+	fi
+	if [ ! -z "$ok" ] ; then
+		echo "updating: $ref to $sha1"
+		git-update-ref -m "git-unbundle update" $ref $sha1
+	fi
+done
-- 
1.5.0.rc3.24.g0c5e

* [PATCH] Create a man page for git-bundle.
  2007-02-14 14:10   ` [PATCH] git-unbundle - unbundle " Mark Levedahl
@ 2007-02-14 14:10     ` Mark Levedahl
  2007-02-14 14:10       ` [PATCH] Create a man page for git-unbundle Mark Levedahl
  2007-02-14 19:45     ` [PATCH] git-unbundle - unbundle objects and references for disconnected transfer Shawn O. Pearce
  1 sibling, 1 reply; 32+ messages in thread
From: Mark Levedahl @ 2007-02-14 14:10 UTC (permalink / raw)
  To: git; +Cc: Mark Levedahl

Signed-off-by: Mark Levedahl <mdl123@verizon.net>
---
 Documentation/git-bundle.txt |   92 ++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 92 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/git-bundle.txt

diff --git a/Documentation/git-bundle.txt b/Documentation/git-bundle.txt
new file mode 100644
index 0000000..2cec0f9
--- /dev/null
+++ b/Documentation/git-bundle.txt
@@ -0,0 +1,92 @@
+git-bundle(1)
+=============
+
+NAME
+----
+git-bundle - Package objects and refs to update a disconnected repository
+
+
+SYNOPSIS
+--------
+'git-bundle' [--bare] [--output=file] <git-rev-list args>
+
+DESCRIPTION
+-----------
+
+Some workflows require that one or more branches of development on one machine
+be replicated on another machine, but the two machines cannot be directly
+connected so the git-fetch protocol cannot be used.  This command creates a
+bundle file containing objects and references that can be used to update another
+repository (using gitlink:git-unbundle[1]) without physically connecting the
+two.  As no direct connection exists, the user must specify a basis for the
+bundle that is held by the destination repository: the bundle assumes that all
+objects in the basis are already in the destination repository.
+
+OPTIONS
+-------
+
+--bare::
+	Assume operation in a bare repository.
+
+--output=file::
+	Specifies the name of the bundle file. Default is "bundle.zip" in the
+	current directory.
+
+
+<git-rev-list args>::
+
+        A list of arguments, acceptable to git-rev-parse and git-rev-list, that
+        specify the specific objects and references to transport. For example,
+        "master~10..master" causes the current master reference to be packaged
+        along with all objects added since its 10th ancestor commit. There is no
+        explicit limit to the number of references and objects that may be
+        packaged.
+
+
+SPECIFYING REFERENCES
+--------------------
+
+git-bundle will only package references that are shown by git-show-ref: this
+includes heads, tags, and remote heads. References such as master~1 cannot be
+packaged, but are perfectly suitable for defining the basis. More than one
+reference may be packaged, and more than one basis can be specified. The objects
+packaged are those not contained in the union of the given bases. Each basis can
+be specified explicitly (e.g., ^master~10), or implicitly (e.g.,
+master~10..master).
+
+In general, it is very important that the basis used be held by the destination.
+It is ok to err on the side of conservatism, causing the bundle file to contain
+objects already in the destination as these are ignored when unpacking at the
+destination.
+
+A shallow copy or clone can be done which has fewer than all required objects.
+Typically, this would use "git-bundle --since=<some date> master", and use the
+--shallow option to git-unbundle at the receiving end.
+
+EXAMPLE
+-------
+
+Assume two repositories exist as R1 on machine A, and R2 on machine B.  For
+whatever reason, direct connection between A and B is not allowed, but we can
+move data from A to B via some mechanism (CD, email, etc).  We want to update R2
+with developments made on branch master in R1.  We set a tag in R1
+(lastR2bundle) after the previous such transport, and move it afterwards to help
+build the bundle.
+
+in R1 on A:
+git-bundle master ^lastR2bundle
+git tag -f lastR2bundle master
+
+[move bundle.zip from A to B by some mechanism]
+
+in R2 on B:
+git-unbundle bundle.zip
+
+
+Author
+------
+Written by Mark Levedahl <mdl123@verizon.net>
+
+GIT
+---
+Part of the gitlink:git[7] suite
-- 
1.5.0.rc3.24.g0c5e

* [PATCH] Create a man page for git-unbundle.
  2007-02-14 14:10     ` [PATCH] Create a man page for git-bundle Mark Levedahl
@ 2007-02-14 14:10       ` Mark Levedahl
  0 siblings, 0 replies; 32+ messages in thread
From: Mark Levedahl @ 2007-02-14 14:10 UTC (permalink / raw)
  To: git; +Cc: Mark Levedahl

Signed-off-by: Mark Levedahl <mdl123@verizon.net>
---
 Documentation/git-unbundle.txt |   55 ++++++++++++++++++++++++++++++++++++++++
 1 files changed, 55 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/git-unbundle.txt

diff --git a/Documentation/git-unbundle.txt b/Documentation/git-unbundle.txt
new file mode 100644
index 0000000..8dbfccb
--- /dev/null
+++ b/Documentation/git-unbundle.txt
@@ -0,0 +1,55 @@
+git-unbundle(1)
+================
+
+NAME
+----
+git-unbundle - Unpackage objects and refs to update a disconnected repository
+
+
+SYNOPSIS
+--------
+'git-unbundle' [--bare] [--force] [--shallow] [file]
+
+DESCRIPTION
+-----------
+
+Some workflows require that one or more branches of development on one machine
+be replicated on another machine, but the two machines cannot be directly
+connected so the gitlink:git-fetch[1] protocol cannot be used.  This command
+unpacks a bundle file created by gitlink:git-bundle[1] on another repository,
+adding the objects and updating references as defined by the donor repository.
+
+OPTIONS
+-------
+
+--bare::
+	Assume operation in a bare repository.
+
+--force::
+        Normally only fast-forward reference updates are performed. Specifying
+        this option allows non-fast forward updates.
+
+--shallow::
+        Normally, git-fsck is invoked on each reference to assure there are no
+        missing objects. This option bypasses that checking, allowing shallow
+        copies. Use with caution: many git operations are not supported on
+        shallow repositories.
+
+file::
+        Bundle file created by gitlink:git-bundle[1]. Default is bundle.zip.
+
+ERROR CHECKING
+--------------
+
+In addition to the checks mentioned under --force and --shallow above,
+git-unbundle uses gitlink:git-unpack-objects[1] to update objects, and
+gitlink:git-update-ref[1] to update all references, and thus all the inherent
+safety checks provided by those functions are in force.
+
+Author
+------
+Written by Mark Levedahl <mdl123@verizon.net>
+
+GIT
+---
+Part of the gitlink:git[7] suite
-- 
1.5.0.rc3.24.g0c5e

* Re: Scripts to use "bundles" for moving data between repositories
  2007-02-14 14:10 Scripts to use "bundles" for moving data between repositories Mark Levedahl
  2007-02-14 14:10 ` [PATCH] git-bundle - bundle objects and references for disconnected transfer Mark Levedahl
@ 2007-02-14 14:13 ` Matthieu Moy
  2007-02-14 14:37   ` Mark Levedahl
  2007-02-14 15:46 ` Johannes Schindelin
  2007-02-14 17:11 ` Junio C Hamano
  3 siblings, 1 reply; 32+ messages in thread
From: Matthieu Moy @ 2007-02-14 14:13 UTC (permalink / raw)
  To: git

Mark Levedahl <mdl123@verizon.net> writes:

> I offer this for inclusion in the main distribution; comments and suggestions
> for improvement are welcome regardless. The scripts are working for me today
> and I find them very useful.

Did you also have a look at

http://kernel.org/git/?p=cogito/cogito-bundle.git

-- 
Matthieu

* Re: Scripts to use "bundles" for moving data between repositories
  2007-02-14 14:13 ` Scripts to use "bundles" for moving data between repositories Matthieu Moy
@ 2007-02-14 14:37   ` Mark Levedahl
  0 siblings, 0 replies; 32+ messages in thread
From: Mark Levedahl @ 2007-02-14 14:37 UTC (permalink / raw)
  To: git

Matthieu Moy wrote:
>> Did you also have a look at
>>
>> http://kernel.org/git/?p=cogito/cogito-bundle.git
>>
>>     
Yes, I did. I rejected that for my use as it seemed much too 
restrictive: 1 branch at a time, no tags. What I wrote can pack up 
everything in a repository in one go if so desired, or any subpiece. 
Also, this requires nothing beyond git-core (no dependency upon cg), and 
cg and git-core do not interoperate well regarding remote branch 
definitions.

Mark

* Re: Scripts to use "bundles" for moving data between repositories
  2007-02-14 14:10 Scripts to use "bundles" for moving data between repositories Mark Levedahl
  2007-02-14 14:10 ` [PATCH] git-bundle - bundle objects and references for disconnected transfer Mark Levedahl
  2007-02-14 14:13 ` Scripts to use "bundles" for moving data between repositories Matthieu Moy
@ 2007-02-14 15:46 ` Johannes Schindelin
  2007-02-14 17:11 ` Junio C Hamano
  3 siblings, 0 replies; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-14 15:46 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git

Hi,

On Wed, 14 Feb 2007, Mark Levedahl wrote:

> I am working on a project using git where we have many repositories on 
> machines that can never be directly connected, but which need to have 
> the same objects and development history.

I had the same problem some time ago. My network is a sneaker 
net, and the transport medium is a USB stick.

Then, I just push onto the stick and pull from the stick as needed.

Ciao,
Dscho

* Re: Scripts to use "bundles" for moving data between repositories
  2007-02-14 14:10 Scripts to use "bundles" for moving data between repositories Mark Levedahl
                   ` (2 preceding siblings ...)
  2007-02-14 15:46 ` Johannes Schindelin
@ 2007-02-14 17:11 ` Junio C Hamano
  2007-02-14 17:56   ` Mark Levedahl
  3 siblings, 1 reply; 32+ messages in thread
From: Junio C Hamano @ 2007-02-14 17:11 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git

I think something like this is a good addition but I do not want
us to require zip/unzip to use git.  I think a saner alternative
would be to use tar.

* Re: Scripts to use "bundles" for moving data between repositories
  2007-02-14 17:11 ` Junio C Hamano
@ 2007-02-14 17:56   ` Mark Levedahl
  2007-02-14 18:00     ` Junio C Hamano
  2007-02-14 18:20     ` Junio C Hamano
  0 siblings, 2 replies; 32+ messages in thread
From: Mark Levedahl @ 2007-02-14 17:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
> I think something like this is a good addition but I do not want
> us to require zip/unzip to use git.  I think a saner alternative
> would be to use tar.
>
>
>   
That is an easy change. As the dominant content is an already 
compressed pack file, is tar sufficient or should it be a gzip or bzip tar?

Mark

* Re: Scripts to use "bundles" for moving data between repositories
  2007-02-14 17:56   ` Mark Levedahl
@ 2007-02-14 18:00     ` Junio C Hamano
  2007-02-14 21:24       ` Mark Levedahl
  2007-02-14 18:20     ` Junio C Hamano
  1 sibling, 1 reply; 32+ messages in thread
From: Junio C Hamano @ 2007-02-14 18:00 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git

Mark Levedahl <mdl123@verizon.net> writes:

> That is an easy change. As the dominant content is an already
> compressed pack file, is tar sufficient or should it be a gzip or bzip
> tar?

Plain vanilla would do.  Have you noticed how well they deflate
with your implementation that uses zip?

Also we _might_ want to uuencode (or base85) so that you can
even e-mail a bundle easily.

I am 75% kidding ;-).

* Re: Scripts to use "bundles" for moving data between repositories
  2007-02-14 17:56   ` Mark Levedahl
  2007-02-14 18:00     ` Junio C Hamano
@ 2007-02-14 18:20     ` Junio C Hamano
  1 sibling, 0 replies; 32+ messages in thread
From: Junio C Hamano @ 2007-02-14 18:20 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git

Mark Levedahl <mdl123@verizon.net> writes:

> Junio C Hamano wrote:
>> I think something like this is a good addition but I do not want
>> us to require zip/unzip to use git.  I think a saner alternative
>> would be to use tar.
>>
>>
>>
> That is an easy change. As the dominant content is an already
> compressed pack file, is tar sufficient or should it be a gzip or bzip
> tar?

If you are re-spinning the patch, please do not forget that you
would want to link it in the main Makefile and link the docs to
git.7 by adding them to Documentation/cmd-list.perl.

The USAGE string should fit comfortably on 80-column terminal.
The same goes for AsciiDoc text documentation.  Your lines are
too long.
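
A quick way to find the offending lines before re-sending, as a minimal
sketch; long_lines is a hypothetical helper name, and awk counts a tab as a
single character, so tab-indented lines may be wider on screen than reported.

```shell
#!/bin/sh
# long_lines <file>...  (hypothetical helper)
# Print "file:line: length" for every line longer than 80 characters.
long_lines() {
	awk 'length($0) > 80 {
		printf "%s:%d: %d\n", FILENAME, FNR, length($0)
	}' "$@"
}

# Possible use:
#   long_lines git-bundle Documentation/git-bundle.txt
```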

* Re: [PATCH] git-bundle - bundle objects and references for disconnected transfer.
  2007-02-14 14:10 ` [PATCH] git-bundle - bundle objects and references for disconnected transfer Mark Levedahl
  2007-02-14 14:10   ` [PATCH] git-unbundle - unbundle " Mark Levedahl
@ 2007-02-14 19:42   ` Shawn O. Pearce
  2007-02-14 21:58   ` Johannes Schindelin
  2 siblings, 0 replies; 32+ messages in thread
From: Shawn O. Pearce @ 2007-02-14 19:42 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git

Mark Levedahl <mdl123@verizon.net> wrote:
> +# add the pack file
> +(git-rev-list --objects $gitrevargs | \
> +	cut -b -40 | \
> +	git pack-objects --all-progress --progress --stdout >.gitBundlePack) \
> +	|| (rm -f "$bfile" ; exit)

pack-objects can run a rev-list internally; which means this
can be written as:

 echo $gitrevargs | \
 git pack-objects --all-progress --progress --stdout --revs >.gitBundlePack \
 || (rm -f "$bfile" ; exit)
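
One detail worth hedging on: as far as the git-pack-objects documentation
describes it, --revs reads the revision arguments from standard input one per
line, so a list held in a shell variable probably has to be split onto
separate lines first. A minimal sketch under that assumption, using the
current "git <cmd>" spelling; make_pack is a hypothetical helper name.

```shell
#!/bin/sh
# make_pack <outfile> <rev-list args>...  (hypothetical helper)
# Feed each argument to pack-objects on its own line and write the pack to
# <outfile>. On failure, remove the partial output and return non-zero.
make_pack() {
	out=$1
	shift
	if ! printf '%s\n' "$@" |
		git pack-objects --revs --stdout -q >"$out"
	then
		rm -f "$out"
		return 1
	fi
}

# Possible use:
#   make_pack .gitBundlePack master ^master~10 || exit 1
```

Returning a status from a function, rather than calling exit inside a
parenthesized subshell, lets the caller actually stop when packing fails.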

-- 
Shawn.

* Re: [PATCH] git-unbundle - unbundle objects and references for disconnected transfer.
  2007-02-14 14:10   ` [PATCH] git-unbundle - unbundle " Mark Levedahl
  2007-02-14 14:10     ` [PATCH] Create a man page for git-bundle Mark Levedahl
@ 2007-02-14 19:45     ` Shawn O. Pearce
  2007-02-14 20:57       ` Mark Levedahl
  1 sibling, 1 reply; 32+ messages in thread
From: Shawn O. Pearce @ 2007-02-14 19:45 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git

Mark Levedahl <mdl123@verizon.net> wrote:
> +# get the objects
> +unzip -p "$bfile" .gitBundlePack | git-unpack-objects

Since you are transporting a packfile by sneakernet it might
be reasonable to assume this transfer happens infrequently.
Consequently we might assume its object count exceeds
transfer.unpackLimit, which means a standard fetch or push would
have kept the packfile rather than unpacking it to loose objects.

So maybe use git-index-pack here to index the packfile and
retain it as-is, rather than unpacking it?
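
A minimal sketch of that alternative, assuming the pack has already been
extracted from the bundle into a file and is self-contained (a thin pack
would additionally need --fix-thin). store_pack is a hypothetical helper
name; it must be run from inside the receiving repository.

```shell
#!/bin/sh
# store_pack <packfile>  (hypothetical helper)
# Index the pack and keep it in the object store of the current repository
# instead of exploding it into loose objects.
store_pack() {
	git index-pack --stdin <"$1" >/dev/null
}

# Possible use in the receiving repository:
#   store_pack /path/to/extracted.pack || echo "could not index pack" >&2
```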

-- 
Shawn.

* Re: [PATCH] git-unbundle - unbundle objects and references for disconnected transfer.
  2007-02-14 19:45     ` [PATCH] git-unbundle - unbundle objects and references for disconnected transfer Shawn O. Pearce
@ 2007-02-14 20:57       ` Mark Levedahl
  2007-02-14 21:03         ` Shawn O. Pearce
  2007-02-14 21:18         ` Nicolas Pitre
  0 siblings, 2 replies; 32+ messages in thread
From: Mark Levedahl @ 2007-02-14 20:57 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git

Shawn O. Pearce wrote:
> Mark Levedahl <mdl123@verizon.net> wrote:
>   
>> +# get the objects
>> +unzip -p "$bfile" .gitBundlePack | git-unpack-objects
>>     
>
> Since you are transporting a packfile by sneakernet it might
> be reasonable to assume this transfer happens infrequently.
> Consequently we might assume its object count exceeds
> transfer.unpackLimit, which means a standard fetch or push would
> have kept the packfile rather than unpacking it to loose objects.
>
> So maybe use git-index-pack here to index the packfile and
> retain it as-is, rather than unpacking it?
>
>   
Many of my uses of this result in 10-20 objects being transferred, so 
I'm not sure keeping each pack is a real benefit. In particular, one use 
is for daily updates between two sites via email where we tend to have a 
lot of extra objects in the packs as we assume that not every bundle 
actually gets applied, while the number of real new objects tends to be 
small. On the other hand, given the manual nature of this operation, we 
could always just follow up with repack -a -d, possibly guarded by a git 
count. Thoughts?
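
A minimal sketch of such a guarded follow-up, assuming "a git count" means
git count-objects, whose default output has the form
"<n> objects, <k> kilobytes". The helper names and the default threshold of
100 are hypothetical choices for illustration.

```shell
#!/bin/sh
# loose_count <count-objects output>  (hypothetical helper)
# Print the leading object count from "<n> objects, <k> kilobytes".
loose_count() {
	echo "${1%% *}"
}

# maybe_repack [limit]  (hypothetical helper)
# Repack the current repository only when the number of loose objects
# exceeds limit (default 100, an arbitrary value).
maybe_repack() {
	limit=${1:-100}
	out=$(git count-objects) || return 1
	n=$(loose_count "$out")
	if [ "$n" -gt "$limit" ]
	then
		git repack -a -d -q
	fi
}
```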

Mark

* Re: [PATCH] git-unbundle - unbundle objects and references for disconnected transfer.
  2007-02-14 20:57       ` Mark Levedahl
@ 2007-02-14 21:03         ` Shawn O. Pearce
  2007-02-14 22:43           ` Mark Levedahl
  2007-02-14 21:18         ` Nicolas Pitre
  1 sibling, 1 reply; 32+ messages in thread
From: Shawn O. Pearce @ 2007-02-14 21:03 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git

Mark Levedahl <mdl123@verizon.net> wrote:
> Many of my uses of this result in 10-20 objects being transferred, so 
> I'm not sure keeping each pack is a real benefit. In particular, one use 
> is for daily updates between two sites via email where we tend to have a 
> lot of extra objects in the packs as we assume that not every bundle 
> actually gets applied, while the number of real new objects tends to be 
> small. On the other hand, given the manual nature of this operation, we 
> could always just follow up with repack -a -d, possibly guarded by a git 
> count. Thoughts?

I don't really have an opinion here, as I'm fortunate enough that
I can use an SSH or an anonymous git connection between all of my
repositories, and thus don't really have a need for bundle/unbundle.

It's just one of those operations which I thought would not happen
often, and when it did, probably would be big.  In which case keeping
the packfile would make the unbundle run faster, as you don't need
to create a huge mess of loose objects.

-- 
Shawn.

* Re: [PATCH] git-unbundle - unbundle objects and references for disconnected transfer.
  2007-02-14 20:57       ` Mark Levedahl
  2007-02-14 21:03         ` Shawn O. Pearce
@ 2007-02-14 21:18         ` Nicolas Pitre
  1 sibling, 0 replies; 32+ messages in thread
From: Nicolas Pitre @ 2007-02-14 21:18 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: Shawn O. Pearce, git

On Wed, 14 Feb 2007, Mark Levedahl wrote:

> Shawn O. Pearce wrote:
> > Mark Levedahl <mdl123@verizon.net> wrote:
> >   
> > > +# get the objects
> > > +unzip -p "$bfile" .gitBundlePack | git-unpack-objects
> > >     
> >
> > Since you are transporting a packfile by sneakernet it might
> > be reasonable to assume this transfer happens infrequently.
> > Consequently we might assume its object count exceeds
> > transfer.unpackLimit, which means a standard fetch or push would
> > have kept the packfile rather than unpacking it to loose objects.
> >
> > So maybe use git-index-pack here to index the packfile and
> > retain it as-is, rather than unpacking it?
> >
> >   
> Many of my uses of this result in 10-20 objects being transferred, so I'm not
> sure keeping each pack is a real benefit. In particular, one use is for daily
> updates between two sites via email where we tend to have a lot of extra
> objects in the packs as we assume that not every bundle actually gets applied,
> while the number of real new objects tends to be small. On the other hand,
> given the manual nature of this operation, we could always just follow up with
> repack -a -d, possibly guarded by a git count. Thoughts?

Since this is meant for manual operation and therefore is not meant to 
happen multiple times per minute, I'd suggest you still use index-pack 
unconditionally instead of unpack-objects despite having a small number 
of objects.


Nicolas

* Re: Scripts to use "bundles" for moving data between repositories
  2007-02-14 18:00     ` Junio C Hamano
@ 2007-02-14 21:24       ` Mark Levedahl
  2007-02-14 21:26         ` Junio C Hamano
  0 siblings, 1 reply; 32+ messages in thread
From: Mark Levedahl @ 2007-02-14 21:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
> Also we _might_ want to uuencode (or base85) so that you can
> even e-mail a bundle easily.
>
> I am 75% kidding ;-).
>   
Wouldn't such encoding more logically be a part of whatever is used for 
the transport? My experience (other than emailing patches to the git 
list :-[ ) in this area is that encode / decode of attachments is 
handled transparently by email clients. My email programs (including 
command line scripts) all know how to mime encode / decode arbitrary 
attachments, so I at least would gain nothing by such encoding.

Mark

* Re: Scripts to use "bundles" for moving data between repositories
  2007-02-14 21:24       ` Mark Levedahl
@ 2007-02-14 21:26         ` Junio C Hamano
  0 siblings, 0 replies; 32+ messages in thread
From: Junio C Hamano @ 2007-02-14 21:26 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git

Exactly, that is why I said I was 75% kidding.

* Re: [PATCH] git-bundle - bundle objects and references for disconnected transfer.
  2007-02-14 14:10 ` [PATCH] git-bundle - bundle objects and references for disconnected transfer Mark Levedahl
  2007-02-14 14:10   ` [PATCH] git-unbundle - unbundle " Mark Levedahl
  2007-02-14 19:42   ` [PATCH] git-bundle - bundle " Shawn O. Pearce
@ 2007-02-14 21:58   ` Johannes Schindelin
  2007-02-14 23:19     ` Mark Levedahl
  2 siblings, 1 reply; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-14 21:58 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git

Hi,

On Wed, 14 Feb 2007, Mark Levedahl wrote:

> +bfile=bundle.zip
> +for arg in $myargs ; do
> +	case "$arg" in
> +		--bare)
> +			export GIT_DIR=.;;

This is not necessary. You should do this instead:

	. git-sh-setup

It should autodetect if you are running in a bare repo. Also, it gives you 
the nice die and help functions.

> +		-h|--h|--he|--hel|--help)
> +			echo "$USAGE"
> +			exit;;
> +		--output=*)
> +			bfile=${arg##--output=};;

Throughout git, we seem to do both "--output=<bla>" _and_ "--output <bla>" 
forms, or just the latter.
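
A minimal sketch of an option loop that accepts both spellings; parse_args is
a hypothetical function name and bfile is the variable used in the posted
script. Other options and the rev-list arguments are simply skipped here.

```shell
#!/bin/sh
bfile=bundle.zip

# parse_args "$@"  (hypothetical helper)
# Set bfile from either "--output=<file>" or "--output <file>".
parse_args() {
	while [ $# -gt 0 ]
	do
		case "$1" in
		--output=*)
			bfile=${1#--output=} ;;
		--output)
			[ $# -ge 2 ] || {
				echo >&2 "--output requires a file name"
				return 1
			}
			bfile=$2
			shift ;;
		esac
		shift
	done
}
```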

> +GIT_DIR=$(git-rev-parse --git-dir) || die "Not in a git directory"

Again, this is done by git-sh-setup

> +git-show-ref $refs > .gitBundleReferences

Would it not be better to say explicitly which refs are expected to be 
present already (they start with "^" in the output of `git-rev-parse`, but 
you would need to do a bit more work, since you cannot just take the 
symbolic names)?

Some general remarks:

It would be so much nicer if you worked without temporary files (you could 
do that by starting the file with the refs, then have an empty line, and 
then just pipe the pack after that).

IMHO reliance on $(git fsck | grep ^missing) is not good. The file check 
might take very, very long, or use much memory. And you _can_ do better 
[*1*].

Also, your use of shallow is incorrect. If the boundary commits are 
present, you might just leave them as-are, but if they are not present, 
you have to mark them as shallow. Otherwise, you end up with a corrupt 
(not shallow) repository.

Ciao,
Dscho

[*1*] Instead of providing a list "<hash> <refname>" with just the refs to 
be updated, append a list "<hash> ^<refname>" with the refs which _have_ 
to be present in order to succeed. You get this list by

	gitrevnotargs=$(git-rev-parse --symbolic --revs-only --not $*)
	git show-ref $gitrevnotargs | sed 's/^\(.\{41\}\)/&^/'

* Re: [PATCH] git-unbundle - unbundle objects and references for disconnected transfer.
  2007-02-14 21:03         ` Shawn O. Pearce
@ 2007-02-14 22:43           ` Mark Levedahl
  0 siblings, 0 replies; 32+ messages in thread
From: Mark Levedahl @ 2007-02-14 22:43 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git

Shawn O. Pearce wrote:
> I don't really have an opinion here, as I'm fortunate enough that
> I can use an SSH or an anonymous git connection between all of my
> repositories, and thus don't really have a need for bundle/unbundle.
>
> It's just one of those operations which I thought would not happen
> often, and when it did, probably would be big.  In which case keeping
> the packfile would make the unbundle run faster, as you don't need
> to create a huge mess of loose objects.
Fair enough - I've made that change, also your other suggestion about 
piping refs directly to pack-objects. Thanks for both.

Mark

* Re: [PATCH] git-bundle - bundle objects and references for disconnected transfer.
  2007-02-14 21:58   ` Johannes Schindelin
@ 2007-02-14 23:19     ` Mark Levedahl
  2007-02-14 23:55       ` Mark Levedahl
  2007-02-15  0:07       ` Johannes Schindelin
  0 siblings, 2 replies; 32+ messages in thread
From: Mark Levedahl @ 2007-02-14 23:19 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin wrote:
> This is not necessary. You should do this instead:
>
> 	. git-sh-setup
>   
I debated that, it seemed a wash but can do.
> Throughout git, we seem to do both "--output=<bla>" _and_ "--output <bla>" 
> forms, or just the latter.
>   
Patches gratefully accepted for that. This exceeds my skills in bash: I 
can do that in python, C, or other languages, but in bash I am working 
through a list that is a part of $* an arg at a time with no ability to 
look at the next, which is what this needs. Unless of course bash arrays 
are part of portable shell (not sure on that).
>> +git-show-ref $refs > .gitBundleReferences
>>     
>
> Would it not be better to say explicitly which refs are expected to be 
> present already (they start with "^" in the output of `git-rev-parse`, but 
> you would need to do a bit more work, since you cannot just take the 
> symbolic names).
>
> Some general remarks:
>
> It would be so much nicer if you worked without temporary files (you could 
> do that by starting the file with the refs, then have an empty line, and 
> then just pipe the pack after that).
>   
Originally, this was in python with zip file built in memory (no 
temporaries). Sticking to portable shell makes many easy things really 
hard. I'll think about this.
> IMHO reliance on $(git fsck | grep ^missing) is not good. The file check 
> might take very, very long, or use much memory. And you _can_ do better 
> [*1*].
>   
Good idea, but I think it is simpler to just keep the ^... output from 
git-rev-parse and check that those exist. What you suggest below seems 
to presume all bases are themselves references, which is not the case 
when doing, for example, master~10..master.
> Also, your use of shallow is incorrect. If the boundary commits are 
> present, you might just leave them as-are, but if they are not present, 
> you have to mark them as shallow. Otherwise, you end up with a corrupt 
> (not shallow) repository.
>   
I have to say I do not understand what "mark them as shallow" means: can 
you please enlighten me further?


Thanks,
Mark

* Re: [PATCH] git-bundle - bundle objects and references for disconnected transfer.
  2007-02-14 23:19     ` Mark Levedahl
@ 2007-02-14 23:55       ` Mark Levedahl
  2007-02-15  0:15         ` Johannes Schindelin
  2007-02-15  0:07       ` Johannes Schindelin
  1 sibling, 1 reply; 32+ messages in thread
From: Mark Levedahl @ 2007-02-14 23:55 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: Johannes Schindelin, git

Mark Levedahl wrote:
> Johannes Schindelin wrote:
>
>>
>> Would it not be better to say explicitly which refs are expected to 
>> be present already (they start with "^" in the output of 
>> `git-rev-parse`, but you would need to do a bit more work, since you 
>> cannot just take the symbolic names).
>>
>> IMHO reliance on $(git fsck | grep ^missing) is not good. The file 
>> check might take very, very long, or use much memory. And you _can_ 
>> do better [*1*].
>>   
> Good idea, but I think it is simpler to just keep the ^... output from 
> git-rev-parse and check that those exist. What you suggest below seems 
> to presume all bases are themselves references, which is not the case 
> when doing, for example, master~10..master.
Examining further, I just don't know how to do this in shell. Basically, 
what I want is the list of parents of all bases, but those bases might 
not be explicitly mentioned, e.g., master --since=10.days.ago and I 
don't understand any direct plumbing call that will give me the list of 
parents in the general case. At the expense of extreme slowness I can do 
some of this invoking sort and uniq with long lists of objects. The pain 
in doing that on the sender side is definitely worth the potential gain 
on the receiver side (I now remember I tried that a while back, was able 
to do something reasonable in Python using sets, it died in bash).

Mark

* Re: [PATCH] git-bundle - bundle objects and references for disconnected transfer.
  2007-02-14 23:19     ` Mark Levedahl
  2007-02-14 23:55       ` Mark Levedahl
@ 2007-02-15  0:07       ` Johannes Schindelin
  2007-02-15  2:32         ` Mark Levedahl
  1 sibling, 1 reply; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-15  0:07 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git

Hi,

On Wed, 14 Feb 2007, Mark Levedahl wrote:

> Johannes Schindelin wrote:
> > This is not necessary. You should do this instead:
> > 
> > 	. git-sh-setup
> >   
> I debated that, it seemed a wash but can do.

It makes things easier, doesn't it?

> > Throughout git, we seem to do both "--output=<bla>" _and_ "--output <bla>"
> > forms, or just the latter.
> >   
> Patches gratefully accepted for that. This exceeds my skills in bash: I can
> do that in python, C, or other languages, but in bash I am working through a
> list that is a part of $* an arg at a time with no ability to look at the
> next, which is what this needs. Unless of course bash arrays are part of
> portable shell (not sure on that).

Ah, I just realized that you do not shift. This is wrong. For example,

	git bundle --output=a1 a..b

would pass "--output=a1 a..b" to git-rev-parse. While you say 
"--revs-only", this would work, but so would "these are no refs". You lose 
valuable information that way (namely invalid parameters). The standard 
shell way is nicely visible in git-tag.sh (see the while loop). It is 
basically

while case "$#" in 0) break ;; esac
do
	case "$1" in
	--output)
		shift
		# handle $1 (and check that you can write to it).
		;;
	-*)
		usage
		;;
	*)
		break
		;;
	esac
	shift
done

> > > +git-show-ref $refs > .gitBundleReferences
> > >     
> > 
> > Would it not be better to say explicitly which refs are expected to be
> > present already (they start with "^" in the output of `git-rev-parse`, but
> > you would need to do a bit more work, since you cannot just take the
> > symbolic names).
> > 
> > Some general remarks:
> > 
> > It would be so much nicer if you worked without temporary files (you 
> > could do that by starting the file with the refs, then have an empty 
> > line, and then just pipe the pack after that).
> >   
> Originally, this was in python with zip file built in memory (no 
> temporaries). Sticking to portable shell makes many easy things really 
> hard.

Not if you just pipe the two parts (refs & pack) into the output. Piping 
also allows for "--output -" meaning stdout...

> > IMHO reliance on $(git fsck | grep ^missing) is not good. The file 
> > check might take very, very long, or use much memory. And you _can_ do 
> > better [*1*].
>   
> Good idea, but I think it is simpler to just keep the ^... output from
> git-rev-parse and check that those exist. What you suggest below seems to
> presume all bases are themselves references, which is not the case when
> doing, for example, master~10..master.

Not at all. I meant to verify that these _hashes_ exist as commits. Not 
necessarily refs.
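
A minimal sketch of such a check on the receiving side, assuming the bundle 
header carries "<hash> ^<refname>" lines for the prerequisites. The function 
names have_commit and missing_prereqs are hypothetical; the only git call 
used is git cat-file -t.

```shell
#!/bin/sh

# have_commit HASH: succeed if HASH names a commit object in this repository.
have_commit () {
	test "$(git cat-file -t "$1" 2>/dev/null)" = commit
}

# missing_prereqs: read "<hash> <refname>" and "<hash> ^<refname>" lines on
# stdin and print the hash of every "^" entry that is not present as a commit.
missing_prereqs () {
	while read -r hash name
	do
		case "$name" in
		^*)
			have_commit "$hash" || echo "$hash"
			;;
		esac
	done
}
```

An empty result would mean every prerequisite is already present, so the 
refs could be updated without running a full fsck.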

> > Also, your use of shallow is incorrect. If the boundary commits are 
> > present, you might just leave them as-are, but if they are not 
> > present, you have to mark them as shallow. Otherwise, you end up with 
> > a corrupt (not shallow) repository.
> >   
> I have to say I do not understand what "mark them as shallow" means: can 
> you please enlighten me further?

We have shallow clones. This means that you can mark commits as "fake 
root" commits, i.e. even if they have parents, they are treated as if they 
had no parents. You do this by adding the hashes of the shallow commits to 
.git/shallow. For a short description, search for "shallow" in 
Documentation/glossary.txt.
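
As a rough sketch of what "adding the hashes" could look like in a script: 
the helper below appends hashes to the shallow file, one per line, skipping 
any already listed. The name add_shallow is hypothetical, and deciding 
which commits to pass in (those whose parents are absent on the receiving 
side) is left to the caller.

```shell
#!/bin/sh

# add_shallow GIT_DIR HASH...: record each HASH in GIT_DIR/shallow,
# one hash per line, without creating duplicate entries.
add_shallow () {
	shallow_file=$1/shallow
	shift
	touch "$shallow_file" || return 1
	for hash
	do
		# -x: whole-line match, -F: fixed string, -q: no output
		grep -q -x -F "$hash" "$shallow_file" ||
		echo "$hash" >>"$shallow_file"
	done
}
```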

Ciao,
Dscho

* Re: [PATCH] git-bundle - bundle objects and references for disconnected transfer.
  2007-02-14 23:55       ` Mark Levedahl
@ 2007-02-15  0:15         ` Johannes Schindelin
  2007-02-15  2:13           ` Mark Levedahl
  0 siblings, 1 reply; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-15  0:15 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git

Hi,

On Wed, 14 Feb 2007, Mark Levedahl wrote:

> Mark Levedahl wrote:
> > Johannes Schindelin wrote:
> > 
> > > 
> > > Would it not be better to say explicitly which refs are expected to be
> > > present already (they start with "^" in the output of `git-rev-parse`,
> > > but you would need to do a bit more work, since you cannot just take the
> > > symbolic names).
> > > 
> > > IMHO reliance on $(git fsck | grep ^missing) is not good. The file check
> > > might take very, very long, or use much memory. And you _can_ do better
> > > [*1*].
> > >   
> > Good idea, but I think it is simpler to just keep the ^... output from
> > git-rev-parse and check that those exist. What you suggest below seems to
> > presume all bases are themselves references, which is not the case when
> > doing, for example, master~10..master.
>
> Examining further, I just don't know how to do this in shell. Basically, 
> what I want is the list of parents of all bases,

I don't think you need the bases. If you say "master~10..master" on the 
sender side, you want to update master on the receiving side, _after_ you 
verified that receiver already has "master~10".

Ciao,
Dscho

* Re: [PATCH] git-bundle - bundle objects and references for disconnected transfer.
  2007-02-15  0:15         ` Johannes Schindelin
@ 2007-02-15  2:13           ` Mark Levedahl
  2007-02-15 15:35             ` Johannes Schindelin
  0 siblings, 1 reply; 32+ messages in thread
From: Mark Levedahl @ 2007-02-15  2:13 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin wrote:
> Hi,
> 
> On Wed, 14 Feb 2007, Mark Levedahl wrote:
> 
>> Mark Levedahl wrote:
>>> Johannes Schindelin wrote:
>>>
> 
> I don't think you need the bases. If you say "master~10..master" on the 
> sender side, you want to update master on the receiving side, _after_ you 
> verified that receiver already has "master~10".
> 
> Ciao,
> Dscho
> 
git>git-rev-parse master~10..master
dc0f74905bd94b88d3b1d477e79faef7e0308fbf
^602598fd5d8f64028f84d2772725c5e3414a112f

Which shows the new head and the commit that the destination needs. That 
is fine. But:

git>git-rev-parse master --since=10.days.ago
dc0f74905bd94b88d3b1d477e79faef7e0308fbf
--max-age=1170641182

is not helpful: it does not tell what is expected to be on the other 
end. And I find both forms absolutely useful in the ways I use 
git-bundle. The latter one does not tell me what is needed. The only way 
I solved that was to walk all the commits from git-rev-list, one at a 
time, to find the parents, and keep the results not otherwise in the 
list. I found that so terribly slow in bash I gave up on it as 
unworkable: I have found in practice my current solution of git-fsck to 
be much faster.
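
For illustration only, the parent walk described here can also be done in 
a single pass instead of one commit at a time. This is a sketch, not the 
code from the patch: it assumes the output of git rev-list --parents is 
piped in, and external_parents is a hypothetical name.

```shell
#!/bin/sh

# external_parents: stdin is `git rev-list --parents ...` output, one
# "<commit> <parent>..." line per commit. Prints, sorted, every parent
# that is not itself listed as a commit, i.e. the commits the receiving
# side is expected to have already.
external_parents () {
	awk '
		{ listed[$1] = 1; for (i = 2; i <= NF; i++) parent[$i] = 1 }
		END { for (p in parent) if (!(p in listed)) print p }
	' | sort
}

# Hypothetical use:
#	git rev-list --parents master --since=10.days.ago | external_parents
```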

Mark

* Re: [PATCH] git-bundle - bundle objects and references for disconnected transfer.
  2007-02-15  0:07       ` Johannes Schindelin
@ 2007-02-15  2:32         ` Mark Levedahl
  2007-02-15 15:32           ` Johannes Schindelin
  0 siblings, 1 reply; 32+ messages in thread
From: Mark Levedahl @ 2007-02-15  2:32 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin wrote:
> Hi,
> 
> On Wed, 14 Feb 2007, Mark Levedahl wrote:
> 
> 
> Ah, I just realized that you do not shift. This is wrong. For example,
> 
> 	git bundle --output=a1 a..b
> 
> would pass "--output=a1 a..b" to git-rev-parse. While you say 
> "--revs-only", this would work, but so would "these are no refs". You lose 
> valuable information that way (namely invalid parameters). The standard 
> shell way is nicely visible in git-tag.sh (see the while loop). It is 
> basically
> 
> while case "$#" in 0) break ;; esac
> do
> 	case "$1" in
> 	--output)
> 		# handle $1 (and check that you can write to it).
> 		;;
> 	-*)
> 		usage
> 		;;
> 	*)
> 		break
> 	esac
> done

And that loop would always abort on things meant for git-rev-list. I 
want to avoid making git-bundle have to understand everything that is 
legal to git-rev-list. The current construct does this: it lets 
git-rev-parse remove what that function knows, aborting if something is 
amiss (or aborting later in git-rev-list), leaving git-bundle's parser 
to chew on the rest. I really don't see a way out of the dilemma: either 
allow --output foo but don't barf on bad arguments, or only accept 
--output=foo and be able to trap errors, or teach git-bundle everything 
that is valid for the other two.  (Let me write this in python, the 
dilemma is gone).

>>>   
>> Originally, this was in python with zip file built in memory (no 
>> temporaries). Sticking to portable shell makes many easy things really 
>> hard.
> 
> Not if you just pipe the two parts (refs & pack) into the output. Piping 
> also allows for "--output -" meaning stdout...

git-unbundle uses no temporary files: it pipes directly from tar (was 
zip, but I've changed to tar per Junio's request).

The problem is creating the tar: I know of no way to create a tar file 
with two separately addressable items, both created by piping in to 
stdin. If there are not two streams, I don't know how to split the data 
in sh without mangling the pack file due to sh variable substitution 
rules. So, I think the temporary file solution is a reasonable compromise.
> 
> Not at all. I meant to verify that these _hashes_ exist as commits. Not 
> necessarily refs.

See my other note.

>
> 
> We have shallow clones. This means that you can mark commits as "fake 
> root" commits, i.e. even if they have parents, they are treated as if they 
> had no parents. You do this by adding the hashes of the shallow commits to 
> .git/shallow. For a short description, search for "shallow" in 
> Documentation/glossary.txt.

Thanks.

> 
> Ciao,
> Dscho
> 

Mark

* Re: [PATCH] git-bundle - bundle objects and references for disconnected transfer.
  2007-02-15  2:32         ` Mark Levedahl
@ 2007-02-15 15:32           ` Johannes Schindelin
  2007-02-16  0:12             ` Mark Levedahl
  0 siblings, 1 reply; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-15 15:32 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git

Hi,

On Wed, 14 Feb 2007, Mark Levedahl wrote:

> Johannes Schindelin wrote:
> > 
> > On Wed, 14 Feb 2007, Mark Levedahl wrote:
> > 
> > Ah, I just realized that you do not shift. This is wrong. For example,
> > 
> > 	git bundle --output=a1 a..b
> > 
> > would pass "--output=a1 a..b" to git-rev-parse. While you say "--revs-only",
> > this would work, but so would "these are no refs". You lose valuable
> > information that way (namely invalid parameters). The standard shell way is
> > nicely visible in git-tag.sh (see the while loop). It is basically
> > 
> > while case "$#" in 0) break ;; esac
> > do
> > 	case "$1" in
> > 	--output)
> > 		# handle $1 (and check that you can write to it).
> > 		;;
> > 	-*)
> > 		usage
> > 		;;
> > 	*)
> > 		break
> > 	esac
> > done
> 
> And that loop would always abort on things meant for git-rev-list. I 
> want to avoid making git-bundle have to understand everything that is 
> legal to git-rev-list. The current construct does this: it lets 
> git-rev-parse remove what that function knows, aborting if something is 
> amiss (or aborting later in git-rev-list), leaving git-bundle's parser 
> to chew on the rest.

Why not force unmixing? I.e. first the options for git-bundle, _then_ the 
rest? (In that case, you would leave out the "-*)" clause).
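
One way such unmixing could look, as a sketch only: consume the leading 
git-bundle options, accept both "--output=<file>" and "--output <file>", 
and stop at the first argument that is not ours so the remainder can go to 
git-rev-parse untouched. The names parse_bundle_opts and consumed are 
hypothetical; bfile and its default come from the patch under review.

```shell
#!/bin/sh

bfile=bundle.zip	# default used by the patch under review

# parse_bundle_opts ARGS...: consume leading git-bundle options.
# Sets $bfile and $consumed (number of arguments the caller should shift).
parse_bundle_opts () {
	consumed=0
	while test $# -gt 0
	do
		case "$1" in
		--output=*)
			bfile=${1#--output=}
			;;
		--output)
			test $# -gt 1 || {
				echo "--output needs a file name" >&2
				return 1
			}
			shift
			consumed=$((consumed + 1))
			bfile=$1
			;;
		*)
			# not a git-bundle option: leave it for git-rev-parse
			break
			;;
		esac
		shift
		consumed=$((consumed + 1))
	done
}

# Hypothetical use at the top of the script:
#	parse_bundle_opts "$@" && shift "$consumed"
```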

> > > Originally, this was in python with zip file built in memory (no 
> > > temporaries). Sticking to portable shell makes many easy things 
> > > really hard.
> > 
> > Not if you just pipe the two parts (refs & pack) into the output. 
> > Piping also allows for "--output -" meaning stdout...
> 
> git-unbundle uses no temporary files: it pipes directly from tar (was 
> zip, but I've changed to tar per Junio's request).

It does not have to be tar. There is no good reason that the parts you put 
into the bundle have to be files, rather than header and body.

Ciao,
Dscho

* Re: [PATCH] git-bundle - bundle objects and references for disconnected transfer.
  2007-02-15  2:13           ` Mark Levedahl
@ 2007-02-15 15:35             ` Johannes Schindelin
  0 siblings, 0 replies; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-15 15:35 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git

Hi,

On Wed, 14 Feb 2007, Mark Levedahl wrote:

> Johannes Schindelin wrote:
> > 
> > On Wed, 14 Feb 2007, Mark Levedahl wrote:
> > 
> > > Mark Levedahl wrote:
> > > > Johannes Schindelin wrote:
> > > > 
> > 
> > I don't think you need the bases. If you say "master~10..master" on 
> > the sender side, you want to update master on the receiving side, 
> > _after_ you verified that receiver already has "master~10".
> > 
> git>git-rev-parse master~10..master
> dc0f74905bd94b88d3b1d477e79faef7e0308fbf
> ^602598fd5d8f64028f84d2772725c5e3414a112f
> 
> Which shows the new head and the commit that the destination needs. That 
> is fine. But:
> 
> git>git-rev-parse master --since=10.days.ago
> dc0f74905bd94b88d3b1d477e79faef7e0308fbf
> --max-age=1170641182
> 
> is not helpful: it does not tell what is expected to be on the other 
> end. And I find both forms absolutely useful in the ways I use 
> git-bundle.

You're right.

But instead of doing this with Python or by hand, why not make the 
"--boundary" option useful in that case?
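
A sketch of how that might be consumed, assuming a git whose rev-list 
--boundary reports boundary commits for the arguments in question. In 
rev-list output, boundary commits are prefixed with "-". The function name 
boundary_hashes is hypothetical.

```shell
#!/bin/sh

# boundary_hashes: read `git rev-list --boundary ...` output on stdin and
# print only the boundary commits, with the leading "-" removed.
boundary_hashes () {
	sed -n 's/^-//p'
}

# Hypothetical use on the sending side:
#	git rev-list --boundary master --since=10.days.ago | boundary_hashes
```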

> I have found in practice my current solution of git-fsck to be much 
> faster.

It is only faster since you unpack the objects. Which makes almost every 
other operation slow.

Ciao,
Dscho

* Re: [PATCH] git-bundle - bundle objects and references for disconnected transfer.
  2007-02-15 15:32           ` Johannes Schindelin
@ 2007-02-16  0:12             ` Mark Levedahl
  2007-02-16  0:40               ` Johannes Schindelin
  0 siblings, 1 reply; 32+ messages in thread
From: Mark Levedahl @ 2007-02-16  0:12 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin wrote:
> Why not force unmixing? I.e. first the options for git-bundle, _then_ the 
> rest? (In that case, you would leave out the "-*)" clause).
>   
This would just trade one usability issue for another.
> It does not have to be tar. There is no good reason that the parts you put 
> into the bundle have to be files, rather than header and body.
>   
sh does not handle binary files: there is no way to split header from 
binary payload.

Mark

* Re: [PATCH] git-bundle - bundle objects and references for disconnected transfer.
  2007-02-16  0:12             ` Mark Levedahl
@ 2007-02-16  0:40               ` Johannes Schindelin
  2007-02-16  3:23                 ` Mark Levedahl
  0 siblings, 1 reply; 32+ messages in thread
From: Johannes Schindelin @ 2007-02-16  0:40 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git

Hi,

On Thu, 15 Feb 2007, Mark Levedahl wrote:

> Johannes Schindelin wrote:
> > Why not force unmixing? I.e. first the options for git-bundle, _then_ the
> > rest? (In that case, you would leave out the "-*)" clause).
> >   
> This would just trade one usability issue for another.

It is not a usability issue if you are cleanly separating things which do 
not belong together.

> > It does not have to be tar. There is no good reason that the parts you 
> > put into the bundle have to be files, rather than header and body.
> >   
> sh does not handle binary files: there is no way to split header from 
> binary payload.

Example:

#!/bin/sh

(echo Hallo; echo Bello; echo; echo blabla) | \
(
	while read line; do
		echo "$line"
		if [ -z "$line" ]; then
			break
		fi
	done
	echo "xxx"
	cat
)

In this case, shell reads the header until an empty line is encountered. 
The rest is piped through cat. And it does not matter if "blabla" is text 
or binary.
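
Applied to a bundle, the same idea might look like the sketch below: a 
text header, one empty line, then the payload copied through untouched. 
The function names write_bundle and read_bundle are hypothetical, and this 
is not the format of the patch under review.

```shell
#!/bin/sh

# write_bundle REFS_FILE PAYLOAD_FILE: emit the refs header, an empty
# line, then the payload bytes unchanged.
write_bundle () {
	cat "$1" &&
	echo &&
	cat "$2"
}

# read_bundle HEADER_OUT PAYLOAD_OUT: split stdin at the first empty line.
# Header lines go to HEADER_OUT; everything after the empty line is copied
# byte for byte to PAYLOAD_OUT by cat.
read_bundle () {
	: >"$1"
	while IFS= read -r line
	do
		test -z "$line" && break
		printf '%s\n' "$line" >>"$1"
	done
	cat >"$2"
}
```

Only the header passes through shell variables, so the variable 
substitution rules never touch the pack data.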

Ciao,
Dscho

* Re: [PATCH] git-bundle - bundle objects and references for disconnected transfer.
  2007-02-16  0:40               ` Johannes Schindelin
@ 2007-02-16  3:23                 ` Mark Levedahl
  0 siblings, 0 replies; 32+ messages in thread
From: Mark Levedahl @ 2007-02-16  3:23 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin wrote:
> It is not a usability issue if you are cleanly separating things which do 
> not belong together.
>
>   
This introduces order dependency that is otherwise not there. The order 
dependency makes perfect sense to one who understands the details, but 
otherwise seems arbitrary. (..still pondering what to do here).
> Example:
>
> #!/bin/sh
>
> (echo Hallo; echo Bello; echo; echo blabla) | \
> (
> 	while read line; do
> 		echo "$line"
> 		if [ -z "$line" ]; then
> 			break
> 		fi
> 	done
> 	echo "xxx"
> 	cat
> )
>
> In this case, shell reads the header until an empty line is encountered. 
> The rest is piped through cat. And it does not matter if "blabla" is text 
> or binary.
>   
Doh! (sometimes you just have to whack people over the head with a 2x4). 
Thanks.

Mark

end of thread, other threads:[~2007-02-16  3:24 UTC | newest]

Thread overview: 32+ messages
2007-02-14 14:10 Scripts to use "bundles" for moving data between repositories Mark Levedahl
2007-02-14 14:10 ` [PATCH] git-bundle - bundle objects and references for disconnected transfer Mark Levedahl
2007-02-14 14:10   ` [PATCH] git-unbundle - unbundle " Mark Levedahl
2007-02-14 14:10     ` [PATCH] Create a man page for git-bundle Mark Levedahl
2007-02-14 14:10       ` [PATCH] Create a man page for git-unbundle Mark Levedahl
2007-02-14 19:45     ` [PATCH] git-unbundle - unbundle objects and references for disconnected transfer Shawn O. Pearce
2007-02-14 20:57       ` Mark Levedahl
2007-02-14 21:03         ` Shawn O. Pearce
2007-02-14 22:43           ` Mark Levedahl
2007-02-14 21:18         ` Nicolas Pitre
2007-02-14 19:42   ` [PATCH] git-bundle - bundle " Shawn O. Pearce
2007-02-14 21:58   ` Johannes Schindelin
2007-02-14 23:19     ` Mark Levedahl
2007-02-14 23:55       ` Mark Levedahl
2007-02-15  0:15         ` Johannes Schindelin
2007-02-15  2:13           ` Mark Levedahl
2007-02-15 15:35             ` Johannes Schindelin
2007-02-15  0:07       ` Johannes Schindelin
2007-02-15  2:32         ` Mark Levedahl
2007-02-15 15:32           ` Johannes Schindelin
2007-02-16  0:12             ` Mark Levedahl
2007-02-16  0:40               ` Johannes Schindelin
2007-02-16  3:23                 ` Mark Levedahl
2007-02-14 14:13 ` Scripts to use "bundles" for moving data between repositories Matthieu Moy
2007-02-14 14:37   ` Mark Levedahl
2007-02-14 15:46 ` Johannes Schindelin
2007-02-14 17:11 ` Junio C Hamano
2007-02-14 17:56   ` Mark Levedahl
2007-02-14 18:00     ` Junio C Hamano
2007-02-14 21:24       ` Mark Levedahl
2007-02-14 21:26         ` Junio C Hamano
2007-02-14 18:20     ` Junio C Hamano
