All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] archive: factor out write phase of tar format
@ 2011-06-14 18:17 Jeff King
  2011-06-14 18:18 ` [PATCH 2/2] archive: support gzipped tar files Jeff King
  0 siblings, 1 reply; 56+ messages in thread
From: Jeff King @ 2011-06-14 18:17 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, git-dev

The code to output the tar format for git-archive always
assumes we are writing directly to stdout. Let's factor out
that bit of code so that we can put an in-process gzip
filter in place.

Signed-off-by: Jeff King <peff@peff.net>
---
 archive-tar.c |   22 +++++++++++++++++-----
 1 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index cee06ce..b1aea87 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -10,6 +10,7 @@
 
 static char block[BLOCKSIZE];
 static unsigned long offset;
+static void (*output)(const char *buf, unsigned long size);
 
 static int tar_umask = 002;
 
@@ -17,7 +18,7 @@ static int tar_umask = 002;
 static void write_if_needed(void)
 {
 	if (offset == BLOCKSIZE) {
-		write_or_die(1, block, BLOCKSIZE);
+		output(block, BLOCKSIZE);
 		offset = 0;
 	}
 }
@@ -42,7 +43,7 @@ static void write_blocked(const void *data, unsigned long size)
 		write_if_needed();
 	}
 	while (size >= BLOCKSIZE) {
-		write_or_die(1, buf, BLOCKSIZE);
+		output(buf, BLOCKSIZE);
 		size -= BLOCKSIZE;
 		buf += BLOCKSIZE;
 	}
@@ -66,10 +67,10 @@ static void write_trailer(void)
 {
 	int tail = BLOCKSIZE - offset;
 	memset(block + offset, 0, tail);
-	write_or_die(1, block, BLOCKSIZE);
+	output(block, BLOCKSIZE);
 	if (tail < 2 * RECORDSIZE) {
 		memset(block, 0, offset);
-		write_or_die(1, block, BLOCKSIZE);
+		output(block, BLOCKSIZE);
 	}
 }
 
@@ -234,7 +235,7 @@ static int git_tar_config(const char *var, const char *value, void *cb)
 	return git_default_config(var, value, cb);
 }
 
-int write_tar_archive(struct archiver_args *args)
+static int write_tar_archive_internal(struct archiver_args *args)
 {
 	int err = 0;
 
@@ -248,3 +249,14 @@ int write_tar_archive(struct archiver_args *args)
 		write_trailer();
 	return err;
 }
+
+static void output_write(const char *buf, unsigned long len)
+{
+	write_or_die(1, buf, len);
+}
+
+int write_tar_archive(struct archiver_args *args)
+{
+	output = output_write;
+	return write_tar_archive_internal(args);
+}
-- 
1.7.6.rc1.37.g6d4ed.dirty

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 2/2] archive: support gzipped tar files
  2011-06-14 18:17 [PATCH 1/2] archive: factor out write phase of tar format Jeff King
@ 2011-06-14 18:18 ` Jeff King
  2011-06-14 19:25   ` J.H.
                     ` (2 more replies)
  0 siblings, 3 replies; 56+ messages in thread
From: Jeff King @ 2011-06-14 18:18 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, git-dev

git-archive already supports the creation of tar files. For
local cases, one can simply pipe the output to gzip, and
having git-archive do the gzip is a minor convenience.

However, when running git-archive against a remote site,
having the remote side do the compression can save
considerable bandwidth. Service providers could always wrap
git-archive to provide that functionality, but this makes it
much simpler.

Creating gzipped archives is of course more expensive than
regular tar archives; however, the amount of work should be
comparable to that of creating a zip file, which is already
possible. So there should be no new security implications
with respect to creating load on a remote server.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/git-archive.txt |   17 +++++++++++++++--
 archive-tar.c                 |   27 +++++++++++++++++++++++++++
 archive.c                     |    1 +
 archive.h                     |    1 +
 builtin/archive.c             |    6 ++++++
 t/t5000-tar-tree.sh           |   26 ++++++++++++++++++++++++++
 6 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index 9c750e2..963bec4 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -34,10 +34,11 @@ OPTIONS
 -------
 
 --format=<fmt>::
-	Format of the resulting archive: 'tar' or 'zip'. If this option
+	Format of the resulting archive: 'tar', 'tgz', or 'zip'. If this option
 	is not given, and the output file is specified, the format is
 	inferred from the filename if possible (e.g. writing to "foo.zip"
-	makes the output to be in the zip format). Otherwise the output
+	creates the output in the zip format; "foo.tgz" or "foo.tar.gz"
+	creates the output in the tgz format). Otherwise the output
 	format is `tar`.
 
 -l::
@@ -89,6 +90,12 @@ zip
 	Highest and slowest compression level.  You can specify any
 	number from 1 to 9 to adjust compression speed and ratio.
 
+tgz
+~~~
+-9::
+	Highest and slowest compression level. You can specify any
+	number from 1 to 9 to adjust compression speed and ratio.
+
 
 CONFIGURATION
 -------------
@@ -133,6 +140,12 @@ git archive --format=tar --prefix=git-1.4.0/ v1.4.0 | gzip >git-1.4.0.tar.gz::
 
 	Create a compressed tarball for v1.4.0 release.
 
+git archive --prefix=git-1.4.0/ -o git-1.4.0.tar.gz v1.4.0
+
+	Same as above, except that we use the internal gzip. Note that
+	the output format is inferred by the extension of the output
+	file.
+
 git archive --format=tar --prefix=git-1.4.0/ v1.4.0{caret}\{tree\} | gzip >git-1.4.0.tar.gz::
 
 	Create a compressed tarball for v1.4.0 release, but without a
diff --git a/archive-tar.c b/archive-tar.c
index b1aea87..86c8aa9 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -260,3 +260,30 @@ int write_tar_archive(struct archiver_args *args)
 	output = output_write;
 	return write_tar_archive_internal(args);
 }
+
+static gzFile gz_file;
+static void output_gz(const char *buf, unsigned long len)
+{
+	if (!gzwrite(gz_file, buf, len))
+		die("unable to write compressed stream: %s",
+		    gzerror(gz_file, NULL));
+}
+
+int write_tgz_archive(struct archiver_args *args)
+{
+	int r;
+
+	gz_file = gzdopen(1, "w");
+	if (!gz_file)
+		die_errno("unable to open compressed stream");
+	gzsetparams(gz_file, args->compression_level, Z_DEFAULT_STRATEGY);
+
+	output = output_gz;
+	r = write_tar_archive_internal(args);
+	if (r == 0) {
+		int zerr = gzclose(gz_file);
+		if (zerr != Z_OK)
+			die("unable to write compressed stream (err=%d)", zerr);
+	}
+	return r;
+}
diff --git a/archive.c b/archive.c
index 42f2d2f..6073a8d 100644
--- a/archive.c
+++ b/archive.c
@@ -23,6 +23,7 @@ static const struct archiver {
 } archivers[] = {
 	{ "tar", write_tar_archive },
 	{ "zip", write_zip_archive, USES_ZLIB_COMPRESSION },
+	{ "tgz", write_tgz_archive, USES_ZLIB_COMPRESSION },
 };
 
 static void format_subst(const struct commit *commit,
diff --git a/archive.h b/archive.h
index 038ac35..c1bf72e 100644
--- a/archive.h
+++ b/archive.h
@@ -23,6 +23,7 @@ typedef int (*write_archive_entry_fn_t)(struct archiver_args *args, const unsign
  */
 extern int write_tar_archive(struct archiver_args *);
 extern int write_zip_archive(struct archiver_args *);
+extern int write_tgz_archive(struct archiver_args *);
 
 extern int write_archive_entries(struct archiver_args *args, write_archive_entry_fn_t write_entry);
 extern int write_archive(int argc, const char **argv, const char *prefix, int setup_prefix);
diff --git a/builtin/archive.c b/builtin/archive.c
index b14eaba..4f60af5 100644
--- a/builtin/archive.c
+++ b/builtin/archive.c
@@ -71,6 +71,12 @@ static const char *format_from_name(const char *filename)
 	ext++;
 	if (!strcasecmp(ext, "zip"))
 		return "--format=zip";
+	if (!strcasecmp(ext, "tgz"))
+		return "--format=tgz";
+	if (!strcasecmp(ext, "gz") &&
+	    ext - 4 >= filename &&
+	    !strcasecmp(ext - 4, "tar.gz"))
+		return "--format=tgz";
 	return NULL;
 }
 
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index cff1b3e..faf2784 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -26,6 +26,7 @@ commit id embedding:
 
 . ./test-lib.sh
 UNZIP=${UNZIP:-unzip}
+GUNZIP=${GUNZIP:-gunzip}
 
 SUBSTFORMAT=%H%n
 
@@ -252,4 +253,29 @@ test_expect_success 'git-archive --prefix=olde-' '
 	test -f h/olde-a/bin/sh
 '
 
+test_expect_success 'git archive --format=tgz' '
+	git archive --format=tgz HEAD >e.tgz
+'
+
+test_expect_success 'infer tgz from .tgz filename' '
+	git archive --output=e1.tgz HEAD &&
+	test_cmp e.tgz e1.tgz
+'
+
+test_expect_success 'infer tgz from .tar.gz filename' '
+	git archive --output=e2.tar.gz HEAD &&
+	test_cmp e.tgz e2.tar.gz
+'
+
+if $GUNZIP --version >/dev/null 2>&1; then
+	test_set_prereq GUNZIP
+else
+	say "Skipping tgz tests because gunzip was not found"
+fi
+
+test_expect_success GUNZIP 'extract tgz file' '
+	gunzip -c <e.tgz >e.tar &&
+	test_cmp b.tar e.tar
+'
+
 test_done
-- 
1.7.6.rc1.37.g6d4ed.dirty

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH 2/2] archive: support gzipped tar files
  2011-06-14 18:18 ` [PATCH 2/2] archive: support gzipped tar files Jeff King
@ 2011-06-14 19:25   ` J.H.
  2011-06-14 19:30     ` Jeff King
  2011-06-14 19:39   ` René Scharfe
  2011-06-14 20:30   ` [PATCH 2/2] archive: support gzipped tar files Junio C Hamano
  2 siblings, 1 reply; 56+ messages in thread
From: J.H. @ 2011-06-14 19:25 UTC (permalink / raw)
  To: Jeff King; +Cc: git, René Scharfe, git-dev

On 06/14/2011 11:18 AM, Jeff King wrote:
> git-archive already supports the creation of tar files. For
> local cases, one can simply pipe the output to gzip, and
> having git-archive do the gzip is a minor convenience.
> 
> However, when running git-archive against a remote site,
> having the remote side do the compression can save
> considerable bandwidth. Service providers could always wrap
> git-archive to provide that functionality, but this makes it
> much simpler.
> 
> Creating gzipped archives is of course more expensive than
> regular tar archives; however, the amount of work should be
> comparable to that of creating a zip file, which is already
> possible. So there should be no new security implications
> with respect to creating load on a remote server.

Would it make sense to make this a little more generic and support bz2
and xz as well?

- John 'Warthog9' Hawley

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 2/2] archive: support gzipped tar files
  2011-06-14 19:25   ` J.H.
@ 2011-06-14 19:30     ` Jeff King
  0 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-14 19:30 UTC (permalink / raw)
  To: J.H.; +Cc: git, René Scharfe, git-dev

On Tue, Jun 14, 2011 at 12:25:17PM -0700, J.H. wrote:

> On 06/14/2011 11:18 AM, Jeff King wrote:
> > git-archive already supports the creation of tar files. For
> > local cases, one can simply pipe the output to gzip, and
> > having git-archive do the gzip is a minor convenience.
> > 
> > However, when running git-archive against a remote site,
> > having the remote side do the compression can save
> > considerable bandwidth. Service providers could always wrap
> > git-archive to provide that functionality, but this makes it
> > much simpler.
> > 
> > Creating gzipped archives is of course more expensive than
> > regular tar archives; however, the amount of work should be
> > comparable to that of creating a zip file, which is already
> > possible. So there should be no new security implications
> > with respect to creating load on a remote server.
> 
> Would it make sense to make this a little more generic and support bz2
> and xz as well?

I think it's a great idea if somebody wants to do it on top of my patch.
They should be able to hook into the tar code just like I did in 2/2.
But they will need library support that we don't already have in git.
Doing gz was easy because we already require zlib.

We could also support them by piping to an external compressor, which
wouldn't be too hard (you could do it for gzip, too, but given that we
have zlib, this was much simpler). There is a slight hitch with the
"--list" command, though. Should git-archive advertise these formats,
and if so, how should it know that they are available?

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 2/2] archive: support gzipped tar files
  2011-06-14 18:18 ` [PATCH 2/2] archive: support gzipped tar files Jeff King
  2011-06-14 19:25   ` J.H.
@ 2011-06-14 19:39   ` René Scharfe
  2011-06-14 20:14     ` Jeff King
  2011-06-14 20:30   ` [PATCH 2/2] archive: support gzipped tar files Junio C Hamano
  2 siblings, 1 reply; 56+ messages in thread
From: René Scharfe @ 2011-06-14 19:39 UTC (permalink / raw)
  To: Jeff King; +Cc: git, git-dev

Am 14.06.2011 20:18, schrieb Jeff King:
> git-archive already supports the creation of tar files. For
> local cases, one can simply pipe the output to gzip, and
> having git-archive do the gzip is a minor convenience.
> 
> However, when running git-archive against a remote site,
> having the remote side do the compression can save
> considerable bandwidth. Service providers could always wrap
> git-archive to provide that functionality, but this makes it
> much simpler.

That's a good point and one that was overlooked when this topic came up
earlier (see http://kerneltrap.org/mailarchive/git/2009/9/10/11507 and
http://kerneltrap.org/mailarchive/git/2009/9/11/11577).  That
implementation was ... heavier than yours, but it also avoided an
unnecessary level of buffering.  I wonder if it makes a measurable
difference, though.

> Creating gzipped archives is of course more expensive than
> regular tar archives; however, the amount of work should be
> comparable to that of creating a zip file, which is already
> possible. So there should be no new security implications
> with respect to creating load on a remote server.
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  Documentation/git-archive.txt |   17 +++++++++++++++--
>  archive-tar.c                 |   27 +++++++++++++++++++++++++++
>  archive.c                     |    1 +
>  archive.h                     |    1 +
>  builtin/archive.c             |    6 ++++++
>  t/t5000-tar-tree.sh           |   26 ++++++++++++++++++++++++++
>  6 files changed, 76 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
> index 9c750e2..963bec4 100644
> --- a/Documentation/git-archive.txt
> +++ b/Documentation/git-archive.txt
> @@ -34,10 +34,11 @@ OPTIONS
>  -------
>  
>  --format=<fmt>::
> -	Format of the resulting archive: 'tar' or 'zip'. If this option
> +	Format of the resulting archive: 'tar', 'tgz', or 'zip'. If this option
>  	is not given, and the output file is specified, the format is
>  	inferred from the filename if possible (e.g. writing to "foo.zip"
> -	makes the output to be in the zip format). Otherwise the output
> +	creates the output in the zip format; "foo.tgz" or "foo.tar.gz"
> +	creates the output in the tgz format). Otherwise the output
>  	format is `tar`.
>  
>  -l::
> @@ -89,6 +90,12 @@ zip
>  	Highest and slowest compression level.  You can specify any
>  	number from 1 to 9 to adjust compression speed and ratio.
>  
> +tgz
> +~~~
> +-9::
> +	Highest and slowest compression level. You can specify any
> +	number from 1 to 9 to adjust compression speed and ratio.
> +
>  
>  CONFIGURATION
>  -------------
> @@ -133,6 +140,12 @@ git archive --format=tar --prefix=git-1.4.0/ v1.4.0 | gzip >git-1.4.0.tar.gz::
>  
>  	Create a compressed tarball for v1.4.0 release.
>  
> +git archive --prefix=git-1.4.0/ -o git-1.4.0.tar.gz v1.4.0
> +
> +	Same as above, except that we use the internal gzip. Note that
> +	the output format is inferred by the extension of the output
> +	file.
> +
>  git archive --format=tar --prefix=git-1.4.0/ v1.4.0{caret}\{tree\} | gzip >git-1.4.0.tar.gz::
>  
>  	Create a compressed tarball for v1.4.0 release, but without a
> diff --git a/archive-tar.c b/archive-tar.c
> index b1aea87..86c8aa9 100644
> --- a/archive-tar.c
> +++ b/archive-tar.c
> @@ -260,3 +260,30 @@ int write_tar_archive(struct archiver_args *args)
>  	output = output_write;
>  	return write_tar_archive_internal(args);
>  }
> +
> +static gzFile gz_file;
> +static void output_gz(const char *buf, unsigned long len)
> +{
> +	if (!gzwrite(gz_file, buf, len))
> +		die("unable to write compressed stream: %s",
> +		    gzerror(gz_file, NULL));
> +}

Does this do the right things when faced with interrupted writes or
truncated pipes?  I ask because the earlier attempt had a
gzwrite_or_die() which did that, but I don't know anymore if that is
strictly needed.  Oh, and bridging the gap between unsigned long and int
was certainly another reason for the existence of this function.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 2/2] archive: support gzipped tar files
  2011-06-14 19:39   ` René Scharfe
@ 2011-06-14 20:14     ` Jeff King
  2011-06-14 20:45       ` Jeff King
  0 siblings, 1 reply; 56+ messages in thread
From: Jeff King @ 2011-06-14 20:14 UTC (permalink / raw)
  To: René Scharfe; +Cc: git, git-dev

On Tue, Jun 14, 2011 at 09:39:55PM +0200, René Scharfe wrote:

> > However, when running git-archive against a remote site,
> > having the remote side do the compression can save
> > considerable bandwidth. Service providers could always wrap
> > git-archive to provide that functionality, but this makes it
> > much simpler.
> 
> That's a good point and one that was overlooked when this topic came up
> earlier (see http://kerneltrap.org/mailarchive/git/2009/9/10/11507 and
> http://kerneltrap.org/mailarchive/git/2009/9/11/11577).

Hmph, I should have done my homework better. I totally missed that
thread.

Yeah, I am unsurprised that doing it in a single process is actually
slower. I do think because of the remote issue that we should provide
something like this. But we could implement it by piping to an external
gzip. That would make us just slightly less portable, but would give us
the multi-processor speedup, or even allow using something like pigz.

> > +static void output_gz(const char *buf, unsigned long len)
> > +{
> > +	if (!gzwrite(gz_file, buf, len))
> > +		die("unable to write compressed stream: %s",
> > +		    gzerror(gz_file, NULL));
> > +}
> 
> Does this do the right things when faced with interrupted writes or
> truncated pipes? I ask because the earlier attempt had a
> gzwrite_or_die() which did that, but I don't know anymore if that is
> strictly needed.

No, I blindly assumed that gzwrite was a little bit smart, but looking
at the zlib code, it really is just propagating whatever it got from
fwrite. I need to handle both errors and short writes myself. So we do
need gzwrite_or_die.

> Oh, and bridging the gap between unsigned long and int
> was certainly another reason for the existence of this function.

Ugh. I correctly saw that it took an unsigned long, but it actually
returns the number of bytes written as an int! Nice interface.

All of this can go away, though, if we switch to an external process.
It's tempting.

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 2/2] archive: support gzipped tar files
  2011-06-14 18:18 ` [PATCH 2/2] archive: support gzipped tar files Jeff King
  2011-06-14 19:25   ` J.H.
  2011-06-14 19:39   ` René Scharfe
@ 2011-06-14 20:30   ` Junio C Hamano
  2011-06-14 20:49     ` Jeff King
  2 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2011-06-14 20:30 UTC (permalink / raw)
  To: Jeff King; +Cc: git, René Scharfe, git-dev

Jeff King <peff@peff.net> writes:

I didn't know it was that easy (primarily because I didn't know zlib had a
ready-to-eat interface to do this).

> +	if (!strcasecmp(ext, "tgz"))
> +		return "--format=tgz";
> +	if (!strcasecmp(ext, "gz") &&
> +	    ext - 4 >= filename &&
> +	    !strcasecmp(ext - 4, "tar.gz"))

Shouldn't this be

	if (!strcasecmp(ext, "gz") && filename < ext - 5 &&
            !strcasecmp(ext - 5, ".tar.gz"))

to exclude "hellotar.gz" and possibly ".tar.gz" ("<=" vs "<")?

> diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
> index cff1b3e..faf2784 100755
> --- a/t/t5000-tar-tree.sh
> +++ b/t/t5000-tar-tree.sh
> @@ -26,6 +26,7 @@ commit id embedding:
>  
>  . ./test-lib.sh
>  UNZIP=${UNZIP:-unzip}
> +GUNZIP=${GUNZIP:-gunzip}

Just a personal preference but I find myself using "gzip -d" more often
than "gunzip".

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 2/2] archive: support gzipped tar files
  2011-06-14 20:14     ` Jeff King
@ 2011-06-14 20:45       ` Jeff King
  2011-06-15 22:30         ` [RFC/PATCH 0/7] user-configurable git-archive output formats Jeff King
  0 siblings, 1 reply; 56+ messages in thread
From: Jeff King @ 2011-06-14 20:45 UTC (permalink / raw)
  To: René Scharfe; +Cc: git, git-dev

On Tue, Jun 14, 2011 at 04:14:33PM -0400, Jeff King wrote:

> Yeah, I am unsurprised that doing it in a single process is actually
> slower. I do think because of the remote issue that we should provide
> something like this. But we could implement it by piping to an external
> gzip. That would make us just slightly less portable, but would give us
> the multi-processor speedup, or even allow using something like pigz.

So here's a relatively quick implementation of the pipe idea. It just
handles .tar.gz, but it would be trivial to do bz2 or other formats, as
long as they can act as a stdio filter.

The gzip path is not configurable at all. Probably it should read the
path and arguments from the config file. In fact, we could even allow
arbitrary config like:

  [tarfilter "tgz"]
    command = gzip -c
    extension = tgz
    extension = tar.gz

which also solves the "don't advertise in --list if you don't have it
installed problem".  At the same time, that is a lot to have to
configure for somebody who is not providing remote service and just
wants:

  git archive -o HEAD foo.tar.gz

to work out of the box.

I think we could probably allow arbitrary config, but provide a few
sane, common defaults like gzip and bz2 unless the user specifically
turns them off at build time.

---
 archive-tar.c       |   45 +++++++++++++++++++++++++++++++++++++++++++++
 archive.c           |    1 +
 archive.h           |    1 +
 builtin/archive.c   |    6 ++++++
 t/t5000-tar-tree.sh |   26 ++++++++++++++++++++++++++
 5 files changed, 79 insertions(+), 0 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index cee06ce..a77d605 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -4,6 +4,7 @@
 #include "cache.h"
 #include "tar.h"
 #include "archive.h"
+#include "run-command.h"
 
 #define RECORDSIZE	(512)
 #define BLOCKSIZE	(RECORDSIZE * 20)
@@ -248,3 +249,47 @@ int write_tar_archive(struct archiver_args *args)
 		write_trailer();
 	return err;
 }
+
+static int write_tar_to_filter(struct archiver_args *args, const char **argv)
+{
+	struct child_process filter;
+	int r;
+
+	memset(&filter, 0, sizeof(filter));
+	filter.argv = argv;
+	filter.in = -1;
+
+	if (start_command(&filter) < 0)
+		die_errno("unable to start '%s' filter", argv[0]);
+	close(1);
+	if (dup2(filter.in, 1) < 0)
+		die_errno("unable to redirect descriptor");
+	close(filter.in);
+
+	r = write_tar_archive(args);
+
+	close(1);
+	if (finish_command(&filter) != 0)
+		die("'%s' filter reported error", argv[0]);
+
+	return r;
+}
+
+int write_tgz_archive(struct archiver_args *args)
+{
+	char compression[4];
+	const char *argv[] = {
+		"gzip",
+		"-c",
+		NULL, /* compression level */
+		NULL
+	};
+
+	if (args->compression_level >= 0) {
+		snprintf(compression, sizeof(compression),
+			 "-%d", args->compression_level);
+		argv[2] = compression;
+	}
+
+	return write_tar_to_filter(args, argv);
+}
diff --git a/archive.c b/archive.c
index 42f2d2f..6073a8d 100644
--- a/archive.c
+++ b/archive.c
@@ -23,6 +23,7 @@ static const struct archiver {
 } archivers[] = {
 	{ "tar", write_tar_archive },
 	{ "zip", write_zip_archive, USES_ZLIB_COMPRESSION },
+	{ "tgz", write_tgz_archive, USES_ZLIB_COMPRESSION },
 };
 
 static void format_subst(const struct commit *commit,
diff --git a/archive.h b/archive.h
index 038ac35..c1bf72e 100644
--- a/archive.h
+++ b/archive.h
@@ -23,6 +23,7 @@ typedef int (*write_archive_entry_fn_t)(struct archiver_args *args, const unsign
  */
 extern int write_tar_archive(struct archiver_args *);
 extern int write_zip_archive(struct archiver_args *);
+extern int write_tgz_archive(struct archiver_args *);
 
 extern int write_archive_entries(struct archiver_args *args, write_archive_entry_fn_t write_entry);
 extern int write_archive(int argc, const char **argv, const char *prefix, int setup_prefix);
diff --git a/builtin/archive.c b/builtin/archive.c
index b14eaba..4f60af5 100644
--- a/builtin/archive.c
+++ b/builtin/archive.c
@@ -71,6 +71,12 @@ static const char *format_from_name(const char *filename)
 	ext++;
 	if (!strcasecmp(ext, "zip"))
 		return "--format=zip";
+	if (!strcasecmp(ext, "tgz"))
+		return "--format=tgz";
+	if (!strcasecmp(ext, "gz") &&
+	    ext - 4 >= filename &&
+	    !strcasecmp(ext - 4, "tar.gz"))
+		return "--format=tgz";
 	return NULL;
 }
 
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index cff1b3e..faf2784 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -26,6 +26,7 @@ commit id embedding:
 
 . ./test-lib.sh
 UNZIP=${UNZIP:-unzip}
+GUNZIP=${GUNZIP:-gunzip}
 
 SUBSTFORMAT=%H%n
 
@@ -252,4 +253,29 @@ test_expect_success 'git-archive --prefix=olde-' '
 	test -f h/olde-a/bin/sh
 '
 
+test_expect_success 'git archive --format=tgz' '
+	git archive --format=tgz HEAD >e.tgz
+'
+
+test_expect_success 'infer tgz from .tgz filename' '
+	git archive --output=e1.tgz HEAD &&
+	test_cmp e.tgz e1.tgz
+'
+
+test_expect_success 'infer tgz from .tar.gz filename' '
+	git archive --output=e2.tar.gz HEAD &&
+	test_cmp e.tgz e2.tar.gz
+'
+
+if $GUNZIP --version >/dev/null 2>&1; then
+	test_set_prereq GUNZIP
+else
+	say "Skipping tgz tests because gunzip was not found"
+fi
+
+test_expect_success GUNZIP 'extract tgz file' '
+	gunzip -c <e.tgz >e.tar &&
+	test_cmp b.tar e.tar
+'
+
 test_done
-- 
1.7.6.rc1.4.g49204

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH 2/2] archive: support gzipped tar files
  2011-06-14 20:30   ` [PATCH 2/2] archive: support gzipped tar files Junio C Hamano
@ 2011-06-14 20:49     ` Jeff King
  2011-06-14 23:40       ` Miles Bader
  0 siblings, 1 reply; 56+ messages in thread
From: Jeff King @ 2011-06-14 20:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, René Scharfe, git-dev

On Tue, Jun 14, 2011 at 01:30:47PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> I didn't know it was that easy (primarily because I didn't know zlib had a
> ready-to-eat interface to do this).

Yes, though I think it may be worth doing the more flexible,
external-filters approach. See elsewhere in the thread.

> > +	if (!strcasecmp(ext, "tgz"))
> > +		return "--format=tgz";
> > +	if (!strcasecmp(ext, "gz") &&
> > +	    ext - 4 >= filename &&
> > +	    !strcasecmp(ext - 4, "tar.gz"))
> 
> Shouldn't this be
> 
> 	if (!strcasecmp(ext, "gz") && filename < ext - 5 &&
>             !strcasecmp(ext - 5, ".tar.gz"))
> 
> to exclude "hellotar.gz" and possibly ".tar.gz" ("<=" vs "<")?

Yeah, definitely. I think the right way forward is to make this less
hard-coded, to allow people to configure new filters, so this bit of
code will get rewritten in my next version. But I'll make sure it is a
little more strict on extension matching.

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 2/2] archive: support gzipped tar files
  2011-06-14 20:49     ` Jeff King
@ 2011-06-14 23:40       ` Miles Bader
  2011-06-15 22:46         ` Jeff King
  0 siblings, 1 reply; 56+ messages in thread
From: Miles Bader @ 2011-06-14 23:40 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, René Scharfe, git-dev

Jeff King <peff@github.com> writes:
>> I didn't know it was that easy (primarily because I didn't know zlib had a
>> ready-to-eat interface to do this).
>
> Yes, though I think it may be worth doing the more flexible,
> external-filters approach. See elsewhere in the thread.

Given the relatively trivial code, isn't it worth doing both...?

One method for flexibility/multi-threaded-speed, the other for
portability/robustness (doesn't depend on configuration / setup
details)...

-Miles

-- 
"Though they may have different meanings, the cries of 'Yeeeee-haw!' and
 'Allahu akbar!' are, in spirit, not actually all that different."

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [RFC/PATCH 0/7] user-configurable git-archive output formats
  2011-06-14 20:45       ` Jeff King
@ 2011-06-15 22:30         ` Jeff King
  2011-06-15 22:31           ` [PATCH 1/7] archive: reorder option parsing and config reading Jeff King
                             ` (8 more replies)
  0 siblings, 9 replies; 56+ messages in thread
From: Jeff King @ 2011-06-15 22:30 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, J.H., git, git-dev

On Tue, Jun 14, 2011 at 04:45:21PM -0400, Jeff King wrote:

> The gzip path is not configurable at all. Probably it should read the
> path and arguments from the config file. In fact, we could even allow
> arbitrary config like:
> 
>   [tarfilter "tgz"]
>     command = gzip -c
>     extension = tgz
>     extension = tar.gz

Here's a series implementing that. You can configure whatever you want,
and it includes builtin gzip configuration by default. You can override
to turn it off, or even switch it to run something like pigz instead.

My biggest reservation with the patches as-is is that they are very
tar-centric and not orthogonal. Specifically, they won't handle:

  1. Other streamable archive formats you would want to pipe through
     compressors. Do any of these actually exist? I guess we could offer
     "pax" as a format eventually, and it might be like tar with
     different defaults? I dunno.

     Fixing this would not be too hard. Instead of these being
     "tarfilters", they would be "archive filters", and they would chain
     to some format, defaulting to "tar".  Since there is no other
     format right now, we could even punt on writing most of the code
     until somebody adds one. But we would want to get the naming of the
     config options right, since those are user-facing. Maybe
     "archivefilter" (unfortunately the more readable archive.filter is
     a little awkward with the way we parse config files)?

  2. In theory you might want to plug in external helpers that are not
     just stream filters, but actually their own container formats (like
     zip). I think people who want 7zip would want this.

     But how does git-archive interact with the helper? By definition
     the data it wants is the set of files, not a single stream. So
     either:

       a. We give the helper a temporary exported checkout, and it
          generates the stream from that.

       b. We use tar as the lingua franca of streaming file containers,
          and let the helper deal with converting to its preferred
          output format.

      Option (a) seems horribly inefficient on disk I/O. And if we did
      want to do that, I think it's largely unrelated to this patch
      series.

      You can actually do option (b) with this series. In its worst
      case, you can do the same as (a): just untar into a temporary
      directory and compress from there. But a well-written helper could
      convert tar into the output format on the fly.

The patches are:

  [1/7]: archive: reorder option parsing and config reading
  [2/7]: archive: add user-configurable tar-filter infrastructure
  [3/7]: archive: support user tar-filters via --format
  [4/7]: archive: advertise user tar-filters in --list
  [5/7]: archive: refactor format-guessing from filename
  [6/7]: archive: match extensions from user-configured formats
  [7/7]: archive: provide builtin .tar.gz filter

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH 1/7] archive: reorder option parsing and config reading
  2011-06-15 22:30         ` [RFC/PATCH 0/7] user-configurable git-archive output formats Jeff King
@ 2011-06-15 22:31           ` Jeff King
  2011-06-15 22:33           ` [PATCH 2/7] archive: add user-configurable tar-filter infrastructure Jeff King
                             ` (7 subsequent siblings)
  8 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-15 22:31 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, J.H., git, git-dev

The archive command does three things during its
initialization phase:

  1. parse command-line options

  2. setup the git directory

  3. read config

During phase (1), if we see any options that do not require
a git directory (like "--list"), we handle them immediately
and exit, making it safe to abort step (2) if we are not in
a git directory.

Step (3) must come after step (2), since the git directory
may influence configuration.  However, this leaves no
possibility of configuration from step (3) impacting the
command-line options in step (1) (which is useful, for
example, for supporting user-configurable output formats).

Instead, let's reorder this to:

  1. setup the git directory, if it exists

  2. read config

  3. parse command-line options

  4. if we are not in a git repository, die

This should have the same external behavior, but puts
configuration before command-line parsing.

Signed-off-by: Jeff King <peff@peff.net>
---
 archive.c |   18 ++++++++++++++----
 1 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/archive.c b/archive.c
index 42f2d2f..2616676 100644
--- a/archive.c
+++ b/archive.c
@@ -387,17 +387,27 @@ static int parse_archive_args(int argc, const char **argv,
 int write_archive(int argc, const char **argv, const char *prefix,
 		int setup_prefix)
 {
+	int nongit = 0;
 	const struct archiver *ar = NULL;
 	struct archiver_args args;
 
-	argc = parse_archive_args(argc, argv, &ar, &args);
 	if (setup_prefix && prefix == NULL)
-		prefix = setup_git_directory();
+		prefix = setup_git_directory_gently(&nongit);
+
+	git_config(git_default_config, NULL);
+
+	argc = parse_archive_args(argc, argv, &ar, &args);
+	if (nongit) {
+		/*
+		 * We know this will die() with an error, so we could just
+		 * die ourselves; but its error message will be more specific
+		 * than what we could write here.
+		 */
+		setup_git_directory();
+	}
 
 	parse_treeish_arg(argv, &args, prefix);
 	parse_pathspec_arg(argv + 1, &args);
 
-	git_config(git_default_config, NULL);
-
 	return ar->write_archive(&args);
 }
-- 
1.7.6.rc1.4.g49204

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 2/7] archive: add user-configurable tar-filter infrastructure
  2011-06-15 22:30         ` [RFC/PATCH 0/7] user-configurable git-archive output formats Jeff King
  2011-06-15 22:31           ` [PATCH 1/7] archive: reorder option parsing and config reading Jeff King
@ 2011-06-15 22:33           ` Jeff King
  2011-06-15 23:33             ` Junio C Hamano
  2011-06-15 22:33           ` [PATCH 3/7] archive: support user tar-filters via --format Jeff King
                             ` (6 subsequent siblings)
  8 siblings, 1 reply; 56+ messages in thread
From: Jeff King @ 2011-06-15 22:33 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, J.H., git, git-dev

Archive supports two output formats: tar and zip. The tar
files are totally uncompressed, and it is up to the user to
pipe them through gzip or similar. This is no more than
a minor inconvenience when running archive locally. However,
for remote calls to upload-archive, having the server do the
compression can save a lot of bandwidth.

This patch lays the foundation for tar filters; it parses
user configuration into an internal representation, but
doesn't yet do anything with the result.

It also introduces some tests to document the intended
usage.

Signed-off-by: Jeff King <peff@peff.net>
---
 Makefile             |    1 +
 archive-tar-filter.c |  112 ++++++++++++++++++++++++++++++++++++++++++++++++++
 archive.c            |    1 +
 archive.h            |   15 +++++++
 t/t5000-tar-tree.sh  |   54 ++++++++++++++++++++++++
 5 files changed, 183 insertions(+), 0 deletions(-)
 create mode 100644 archive-tar-filter.c

diff --git a/Makefile b/Makefile
index e40ac0c..bd3002f 100644
--- a/Makefile
+++ b/Makefile
@@ -573,6 +573,7 @@ LIB_OBJS += advice.o
 LIB_OBJS += alias.o
 LIB_OBJS += alloc.o
 LIB_OBJS += archive.o
+LIB_OBJS += archive-tar-filter.o
 LIB_OBJS += archive-tar.o
 LIB_OBJS += archive-zip.o
 LIB_OBJS += attr.o
diff --git a/archive-tar-filter.c b/archive-tar-filter.c
new file mode 100644
index 0000000..211f1df
--- /dev/null
+++ b/archive-tar-filter.c
@@ -0,0 +1,112 @@
+#include "cache.h"
+#include "archive.h"
+
+struct tar_filter *tar_filters;
+static struct tar_filter **tar_filters_tail = &tar_filters;
+
+static struct tar_filter *tar_filter_new(const char *name, int namelen)
+{
+	struct tar_filter *tf;
+	tf = xcalloc(1, sizeof(*tf));
+	tf->name = xmemdupz(name, namelen);
+	tf->extensions.strdup_strings = 1;
+	*tar_filters_tail = tf;
+	tar_filters_tail = &tf->next;
+	return tf;
+}
+
+static void tar_filter_free(struct tar_filter *tf)
+{
+	string_list_clear(&tf->extensions, 0);
+	free(tf->name);
+	free(tf->command);
+	free(tf);
+}
+
+static struct tar_filter *tar_filter_by_namelen(const char *name,
+						int len)
+{
+	struct tar_filter *p;
+	for (p = tar_filters; p; p = p->next)
+		if (!strncmp(p->name, name, len) && !p->name[len])
+			return p;
+	return NULL;
+}
+
+struct tar_filter *tar_filter_by_name(const char *name)
+{
+	return tar_filter_by_namelen(name, strlen(name));
+}
+
+static int tar_filter_config(const char *var, const char *value, void *data)
+{
+	struct tar_filter *tf;
+	const char *dot;
+	const char *name;
+	const char *type;
+	int namelen;
+
+	if (prefixcmp(var, "tarfilter."))
+		return 0;
+	dot = strrchr(var, '.');
+	if (dot == var + 9)
+		return 0;
+
+	name = var + 10;
+	namelen = dot - name;
+	type = dot + 1;
+
+	tf = tar_filter_by_namelen(name, namelen);
+	if (!tf)
+		tf = tar_filter_new(name, namelen);
+
+	if (!strcmp(type, "command")) {
+		if (!value)
+			return config_error_nonbool(var);
+		tf->command = xstrdup(value);
+		return 0;
+	}
+	else if (!strcmp(type, "extension")) {
+		if (!value)
+			return config_error_nonbool(var);
+		string_list_append(&tf->extensions, value);
+		return 0;
+	}
+	else if (!strcmp(type, "compressionlevels")) {
+		tf->use_compression = git_config_bool(var, value);
+		return 0;
+	}
+
+	return 0;
+}
+
+static void remove_filters_without_command(void)
+{
+	struct tar_filter *p = tar_filters;
+	struct tar_filter **last = &tar_filters;
+
+	while (p) {
+		if (p->command && *p->command)
+			last = &p->next;
+		else {
+			*last = p->next;
+			tar_filter_free(p);
+		}
+		p = *last;
+	}
+}
+
+/*
+ * We don't want to load twice, since some of our
+ * values actually append rather than overwrite.
+ */
+static int tar_filter_config_loaded;
+extern void tar_filter_load_config(void)
+{
+	if (tar_filter_config_loaded)
+		return;
+	tar_filter_config_loaded = 1;
+
+	git_config(tar_filter_config, NULL);
+	remove_filters_without_command();
+}
diff --git a/archive.c b/archive.c
index 2616676..2ed9259 100644
--- a/archive.c
+++ b/archive.c
@@ -395,6 +395,7 @@ int write_archive(int argc, const char **argv, const char *prefix,
 		prefix = setup_git_directory_gently(&nongit);
 
 	git_config(git_default_config, NULL);
+	tar_filter_load_config();
 
 	argc = parse_archive_args(argc, argv, &ar, &args);
 	if (nongit) {
diff --git a/archive.h b/archive.h
index 038ac35..8386c46 100644
--- a/archive.h
+++ b/archive.h
@@ -1,6 +1,8 @@
 #ifndef ARCHIVE_H
 #define ARCHIVE_H
 
+#include "string-list.h"
+
 struct archiver_args {
 	const char *base;
 	size_t baselen;
@@ -27,4 +29,17 @@ extern int write_zip_archive(struct archiver_args *);
 extern int write_archive_entries(struct archiver_args *args, write_archive_entry_fn_t write_entry);
 extern int write_archive(int argc, const char **argv, const char *prefix, int setup_prefix);
 
+struct tar_filter {
+	char *name;
+	char *command;
+	struct string_list extensions;
+	unsigned use_compression:1;
+	struct tar_filter *next;
+};
+
+extern struct tar_filter *tar_filters;
+extern struct tar_filter *tar_filter_by_name(const char *name);
+
+extern void tar_filter_load_config(void);
+
 #endif	/* ARCHIVE_H */
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index cff1b3e..c3e1a4e 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -252,4 +252,58 @@ test_expect_success 'git-archive --prefix=olde-' '
 	test -f h/olde-a/bin/sh
 '
 
+test_expect_success 'setup fake tar filter' '
+	git config tarfilter.fake.command "cat >/dev/null; echo args: "
+'
+
+test_expect_failure 'filter does not allow compression levels by default' '
+	test_must_fail git archive --format=fake -9 HEAD >output
+'
+
+test_expect_failure 'filters can allow compression levels' '
+	git config tarfilter.fake.compressionlevels true &&
+	echo "args: -9" >expect &&
+	git archive --format=fake -9 HEAD >output &&
+	test_cmp expect output
+'
+
+test_expect_failure 'archive --list mentions user filter' '
+	git archive --list >output &&
+	grep "^fake\$" output
+'
+
+test_expect_failure 'archive --list shows remote user filters' '
+	git archive --list --remote=. >output &&
+	grep "^fake\$" output
+'
+
+test_expect_success 'setup slightly more useful tar filter' '
+	git config tarfilter.foo.command "tr ab ba" &&
+	git config --add tarfilter.foo.extension tar.foo &&
+	git config --add tarfilter.foo.extension bar
+'
+
+test_expect_failure 'archive outputs in configurable format' '
+	git archive --format=foo HEAD >config.tar.foo &&
+	tr ab ba <config.tar.foo >config.tar &&
+	test_cmp b.tar config.tar
+'
+
+test_expect_failure 'archive selects implicit format by configured extension' '
+	git archive -o config-implicit.tar.foo HEAD &&
+	test_cmp config.tar.foo config-implicit.tar.foo &&
+	git archive -o config-implicit.bar HEAD &&
+	test_cmp config.tar.foo config-implicit.bar
+'
+
+test_expect_success 'default output format remains tar' '
+	git archive -o config-implicit.baz HEAD &&
+	test_cmp b.tar config-implicit.baz
+'
+
+test_expect_failure 'extension matching requires dot' '
+	git archive -o config-implicittar.foo HEAD &&
+	test_cmp b.tar config-implicittar.foo
+'
+
 test_done
-- 
1.7.6.rc1.4.g49204

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 3/7] archive: support user tar-filters via --format
  2011-06-15 22:30         ` [RFC/PATCH 0/7] user-configurable git-archive output formats Jeff King
  2011-06-15 22:31           ` [PATCH 1/7] archive: reorder option parsing and config reading Jeff King
  2011-06-15 22:33           ` [PATCH 2/7] archive: add user-configurable tar-filter infrastructure Jeff King
@ 2011-06-15 22:33           ` Jeff King
  2011-06-15 22:33           ` [PATCH 4/7] archive: advertise user tar-filters in --list Jeff King
                             ` (5 subsequent siblings)
  8 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-15 22:33 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, J.H., git, git-dev

The previous commit set up the infrastructure to read
tar-filter configuration. This commit actually uses it to
pipe the tar output through the specified filter.

Signed-off-by: Jeff King <peff@peff.net>
---
 archive-tar-filter.c |   48 ++++++++++++++++++++++++++++++++++++++++++++++++
 archive.c            |   16 +++++++++++++---
 archive.h            |    2 ++
 t/t5000-tar-tree.sh  |    6 +++---
 4 files changed, 66 insertions(+), 6 deletions(-)

diff --git a/archive-tar-filter.c b/archive-tar-filter.c
index 211f1df..ffe510e 100644
--- a/archive-tar-filter.c
+++ b/archive-tar-filter.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "archive.h"
+#include "run-command.h"
 
 struct tar_filter *tar_filters;
 static struct tar_filter **tar_filters_tail = &tar_filters;
@@ -110,3 +111,50 @@ extern void tar_filter_load_config(void)
 	git_config(tar_filter_config, NULL);
 	remove_filters_without_command();
 }
+
+static int write_tar_to_filter(struct archiver_args *args, const char *cmd)
+{
+	struct child_process filter;
+	const char *argv[2];
+	int r;
+
+	memset(&filter, 0, sizeof(filter));
+	argv[0] = cmd;
+	argv[1] = NULL;
+	filter.argv = argv;
+	filter.use_shell = 1;
+	filter.in = -1;
+
+	if (start_command(&filter) < 0)
+		die_errno("unable to start '%s' filter", argv[0]);
+	close(1);
+	if (dup2(filter.in, 1) < 0)
+		die_errno("unable to redirect descriptor");
+	close(filter.in);
+
+	r = write_tar_archive(args);
+
+	close(1);
+	if (finish_command(&filter) != 0)
+		die("'%s' filter reported error", argv[0]);
+
+	return r;
+}
+
+int write_tar_filter_archive(struct archiver_args *args)
+{
+	struct strbuf cmd = STRBUF_INIT;
+	int r;
+
+	if (!args->tar_filter)
+		die("BUG: tar-filter archiver called with no filter defined");
+
+	strbuf_addstr(&cmd, args->tar_filter->command);
+	if (args->tar_filter->use_compression && args->compression_level >= 0)
+		strbuf_addf(&cmd, " -%d", args->compression_level);
+
+	r = write_tar_to_filter(args, cmd.buf);
+
+	strbuf_release(&cmd);
+	return r;
+}
diff --git a/archive.c b/archive.c
index 2ed9259..cf58faa 100644
--- a/archive.c
+++ b/archive.c
@@ -24,6 +24,9 @@ static const struct archiver {
 	{ "tar", write_tar_archive },
 	{ "zip", write_zip_archive, USES_ZLIB_COMPRESSION },
 };
+static const struct archiver tar_filter_archiver = {
+	"tar-filter", write_tar_filter_archive
+};
 
 static void format_subst(const struct commit *commit,
                          const char *src, size_t len,
@@ -364,12 +367,19 @@ static int parse_archive_args(int argc, const char **argv,
 	if (argc < 1)
 		usage_with_options(archive_usage, opts);
 	*ar = lookup_archiver(format);
-	if (!*ar)
-		die("Unknown archive format '%s'", format);
+
+	/* Fallback to user-configured tar filters */
+	if (!*ar) {
+		args->tar_filter = tar_filter_by_name(format);
+		if (!args->tar_filter)
+			die("Unknown archive format '%s'", format);
+		*ar = &tar_filter_archiver;
+	}
 
 	args->compression_level = Z_DEFAULT_COMPRESSION;
 	if (compression_level != -1) {
-		if ((*ar)->flags & USES_ZLIB_COMPRESSION)
+		if ((*ar)->flags & USES_ZLIB_COMPRESSION ||
+		    (args->tar_filter && args->tar_filter->use_compression))
 			args->compression_level = compression_level;
 		else {
 			die("Argument not supported for format '%s': -%d",
diff --git a/archive.h b/archive.h
index 8386c46..fb2bb9e 100644
--- a/archive.h
+++ b/archive.h
@@ -14,6 +14,7 @@ struct archiver_args {
 	unsigned int verbose : 1;
 	unsigned int worktree_attributes : 1;
 	int compression_level;
+	struct tar_filter *tar_filter;
 };
 
 typedef int (*write_archive_fn_t)(struct archiver_args *);
@@ -25,6 +26,7 @@ typedef int (*write_archive_entry_fn_t)(struct archiver_args *args, const unsign
  */
 extern int write_tar_archive(struct archiver_args *);
 extern int write_zip_archive(struct archiver_args *);
+extern int write_tar_filter_archive(struct archiver_args *);
 
 extern int write_archive_entries(struct archiver_args *args, write_archive_entry_fn_t write_entry);
 extern int write_archive(int argc, const char **argv, const char *prefix, int setup_prefix);
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index c3e1a4e..2b2b128 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -256,11 +256,11 @@ test_expect_success 'setup fake tar filter' '
 	git config tarfilter.fake.command "cat >/dev/null; echo args: "
 '
 
-test_expect_failure 'filter does not allow compression levels by default' '
+test_expect_success 'filter does not allow compression levels by default' '
 	test_must_fail git archive --format=fake -9 HEAD >output
 '
 
-test_expect_failure 'filters can allow compression levels' '
+test_expect_success 'filters can allow compression levels' '
 	git config tarfilter.fake.compressionlevels true &&
 	echo "args: -9" >expect &&
 	git archive --format=fake -9 HEAD >output &&
@@ -283,7 +283,7 @@ test_expect_success 'setup slightly more useful tar filter' '
 	git config --add tarfilter.foo.extension bar
 '
 
-test_expect_failure 'archive outputs in configurable format' '
+test_expect_success 'archive outputs in configurable format' '
 	git archive --format=foo HEAD >config.tar.foo &&
 	tr ab ba <config.tar.foo >config.tar &&
 	test_cmp b.tar config.tar
-- 
1.7.6.rc1.4.g49204

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 4/7] archive: advertise user tar-filters in --list
  2011-06-15 22:30         ` [RFC/PATCH 0/7] user-configurable git-archive output formats Jeff King
                             ` (2 preceding siblings ...)
  2011-06-15 22:33           ` [PATCH 3/7] archive: support user tar-filters via --format Jeff King
@ 2011-06-15 22:33           ` Jeff King
  2011-06-15 22:34           ` [PATCH 5/7] archive: refactor format-guessing from filename Jeff King
                             ` (4 subsequent siblings)
  8 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-15 22:33 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, J.H., git, git-dev

These can be selected by --format, so we need to let users
know about them. It is especially important for the remote
case, since this is the only method by which users can find
out which formats the remote has configured.

Signed-off-by: Jeff King <peff@peff.net>
---
 archive.c           |    3 +++
 t/t5000-tar-tree.sh |    4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/archive.c b/archive.c
index cf58faa..a987936 100644
--- a/archive.c
+++ b/archive.c
@@ -358,8 +358,11 @@ static int parse_archive_args(int argc, const char **argv,
 		base = "";
 
 	if (list) {
+		struct tar_filter *p;
 		for (i = 0; i < ARRAY_SIZE(archivers); i++)
 			printf("%s\n", archivers[i].name);
+		for (p = tar_filters; p; p = p->next)
+			printf("%s\n", p->name);
 		exit(0);
 	}
 
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 2b2b128..9f959b1 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -267,12 +267,12 @@ test_expect_success 'filters can allow compression levels' '
 	test_cmp expect output
 '
 
-test_expect_failure 'archive --list mentions user filter' '
+test_expect_success 'archive --list mentions user filter' '
 	git archive --list >output &&
 	grep "^fake\$" output
 '
 
-test_expect_failure 'archive --list shows remote user filters' '
+test_expect_success 'archive --list shows remote user filters' '
 	git archive --list --remote=. >output &&
 	grep "^fake\$" output
 '
-- 
1.7.6.rc1.4.g49204

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 5/7] archive: refactor format-guessing from filename
  2011-06-15 22:30         ` [RFC/PATCH 0/7] user-configurable git-archive output formats Jeff King
                             ` (3 preceding siblings ...)
  2011-06-15 22:33           ` [PATCH 4/7] archive: advertise user tar-filters in --list Jeff King
@ 2011-06-15 22:34           ` Jeff King
  2011-06-15 23:48             ` Junio C Hamano
  2011-06-15 22:34           ` [PATCH 6/7] archive: match extensions from user-configured formats Jeff King
                             ` (3 subsequent siblings)
  8 siblings, 1 reply; 56+ messages in thread
From: Jeff King @ 2011-06-15 22:34 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, J.H., git, git-dev

The process for guessing an archive output format based on
the filename is something like this:

  a. parse --output in cmd_archive; check the filename
     against a static set of mapping heuristics (right now
     it just matches ".zip" for zip files).

  b. if found, stick a fake "--format=zip" at the beginning
     of the arguments list (if the user did specify a
     --format manually, the later option will override our
     fake one)

  c. if it's a remote call, ship the arguments to the remote
     (including the fake), which will call write_archive on
     their end

  d. if it's local, ship the arguments to write_archive
     locally

There are two problems:

  1. The set of mappings is static and at too high a level.
     The write_archive level is going to check config for
     user-defined formats, some of which will specify
     extensions. We need to delay lookup until those are
     parsed, so we can match against them.

  2. For a remote archive call, our set of mappings (or
     formats) may not match the remote side's. This is OK in
     practice right now, because all versions of git
     understand "zip" and "tar". But as new formats are
     added, there is going to be a mismatch between what the
     client can do and what the remote server can do.

To fix (1), this patch refactors the location guessing to
happen at the write_archive level, instead of the
cmd_archive level. So instead of sticking a fake --format
field in the argv list, we actually pass a "name hint" down
the callchain; this hint is used at the appropriate time to
guess the format (if one hasn't been given already).

This patch leaves (2) unfixed. The name_hint is converted to
a "--format" option as before, and passed to the remote.  We
could in theory pass the name hint to the remote side and
let it decide which format to use. But that introduces a
compatibility problem, as we have no place to put that
information during the remote call without adding a new
"--name-hint=" argument. An older version of git would choke
on that, and the client has no way of knowing if the server
is new enough or not (i.e., there is no capabilities
advertisement, as there is with the git protocol itself).

On top of this, it's unclear whether the remote side should
be in charge of format selection, anyway. There is a minor
information leak; the server will learn about the filename
you are using to save. If we sent just the basename, though,
that would lessen the leak and still give the remote side
enough information to make a decision.

But more important is that the name hint is only a hint, and
we default to the tar format. Which means that
inconsistencies between the client's and server's set of
formats will have confusing results. For example, imagine
the client learns about "tar.gz" as an extension for gzip'd
tar ("tgz") files, but the server does not. Locally,
running:

  git archive -o file.tar.gz HEAD

will produce a gzip'd file. If we make the mapping decision
locally, then running:

  git archive --remote=origin -o file.tar.gz HEAD

will send "--format=tgz" to the remote side. The server will
complain, saying that it doesn't know about the tgz format.

If we instead send the name hint to the remote side and let
it make the decision, it will not know what ".tar.gz" is,
and will silently default to a plain tar, without the user
even realizing it.

The flip side of this is an old client talking to a new
server (i.e., only the servers knows about ".tar.gz"). If we
map the filename remotely, then the user is happy. If we map
it locally, though, we will send the server no --format and
it will silently default to tar.

So the question is: should the mapping of filenames to
formats be consistent for a single client (i.e., doing it
locally or against a remote will either produce the same
format, or report an error if the remote does not support
the format), or should it be consistent for multiple clients
hitting the same server (i.e., no matter what machine I am
on, if I use git-archive to hit kernel.org, I will always
see the same format for the same filename)?

I chose consistency on a single client (i.e., we do the
mapping locally), because:

  1. Using git on one machine against multiple remotes is
     more common than using git on many machines against the
     same remote. So it's less likely for the user to be
     surprised.

  2. Even if we wanted to do the reverse, it is not as
     simple as making the decision and writing some code.
     Because the server literally says nothing before we
     send it arguments, it's difficult to get
     interoperability between versions. We'd probably end up
     having to write the server side, wait sufficiently long
     for everybody to have deployed it, and then write the
     client side.

Signed-off-by: Jeff King <peff@peff.net>
---
 archive.c                |   25 +++++++++++++++++++---
 archive.h                |    4 ++-
 builtin/archive.c        |   51 ++++++++++++++-------------------------------
 builtin/upload-archive.c |    2 +-
 4 files changed, 41 insertions(+), 41 deletions(-)

diff --git a/archive.c b/archive.c
index a987936..e04f689 100644
--- a/archive.c
+++ b/archive.c
@@ -302,9 +302,10 @@ static void parse_treeish_arg(const char **argv,
 	  PARSE_OPT_NOARG | PARSE_OPT_NONEG | PARSE_OPT_HIDDEN, NULL, (p) }
 
 static int parse_archive_args(int argc, const char **argv,
-		const struct archiver **ar, struct archiver_args *args)
+		const struct archiver **ar, struct archiver_args *args,
+		const char *name_hint)
 {
-	const char *format = "tar";
+	const char *format = NULL;
 	const char *base = NULL;
 	const char *remote = NULL;
 	const char *exec = NULL;
@@ -366,6 +367,11 @@ static int parse_archive_args(int argc, const char **argv,
 		exit(0);
 	}
 
+	if (!format && name_hint)
+		format = archive_format_from_filename(name_hint);
+	if (!format)
+		format = "tar";
+
 	/* We need at least one parameter -- tree-ish */
 	if (argc < 1)
 		usage_with_options(archive_usage, opts);
@@ -398,7 +404,7 @@ static int parse_archive_args(int argc, const char **argv,
 }
 
 int write_archive(int argc, const char **argv, const char *prefix,
-		int setup_prefix)
+		int setup_prefix, const char *name_hint)
 {
 	int nongit = 0;
 	const struct archiver *ar = NULL;
@@ -410,7 +416,7 @@ int write_archive(int argc, const char **argv, const char *prefix,
 	git_config(git_default_config, NULL);
 	tar_filter_load_config();
 
-	argc = parse_archive_args(argc, argv, &ar, &args);
+	argc = parse_archive_args(argc, argv, &ar, &args, name_hint);
 	if (nongit) {
 		/*
 		 * We know this will die() with an error, so we could just
@@ -425,3 +431,14 @@ int write_archive(int argc, const char **argv, const char *prefix,
 
 	return ar->write_archive(&args);
 }
+
+const char *archive_format_from_filename(const char *filename)
+{
+	const char *ext = strrchr(filename, '.');
+	if (!ext)
+		return NULL;
+	ext++;
+	if (!strcasecmp(ext, "zip"))
+		return "zip";
+	return NULL;
+}
diff --git a/archive.h b/archive.h
index fb2bb9e..894d4c4 100644
--- a/archive.h
+++ b/archive.h
@@ -29,7 +29,7 @@ extern int write_zip_archive(struct archiver_args *);
 extern int write_tar_filter_archive(struct archiver_args *);
 
 extern int write_archive_entries(struct archiver_args *args, write_archive_entry_fn_t write_entry);
-extern int write_archive(int argc, const char **argv, const char *prefix, int setup_prefix);
+extern int write_archive(int argc, const char **argv, const char *prefix, int setup_prefix, const char *name_hint);
 
 struct tar_filter {
 	char *name;
@@ -44,4 +44,6 @@ extern struct tar_filter *tar_filter_by_name(const char *name);
 
 extern void tar_filter_load_config(void);
 
+const char *archive_format_from_filename(const char *filename);
+
 #endif	/* ARCHIVE_H */
diff --git a/builtin/archive.c b/builtin/archive.c
index b14eaba..2578cf5 100644
--- a/builtin/archive.c
+++ b/builtin/archive.c
@@ -24,7 +24,8 @@ static void create_output_file(const char *output_file)
 }
 
 static int run_remote_archiver(int argc, const char **argv,
-			       const char *remote, const char *exec)
+			       const char *remote, const char *exec,
+			       const char *name_hint)
 {
 	char buf[LARGE_PACKET_MAX];
 	int fd[2], i, len, rv;
@@ -37,6 +38,17 @@ static int run_remote_archiver(int argc, const char **argv,
 	transport = transport_get(_remote, _remote->url[0]);
 	transport_connect(transport, "git-upload-archive", exec, fd);
 
+	/*
+	 * Inject a fake --format field at the beginning of the
+	 * arguments, with the format inferred from our output
+	 * filename. This way explicit --format options can override
+	 * it.
+	 */
+	if (name_hint) {
+		const char *format = archive_format_from_filename(name_hint);
+		if (format)
+			packet_write(fd[1], "argument --format=%s\n", format);
+	}
 	for (i = 1; i < argc; i++)
 		packet_write(fd[1], "argument %s\n", argv[i]);
 	packet_flush(fd[1]);
@@ -63,17 +75,6 @@ static int run_remote_archiver(int argc, const char **argv,
 	return !!rv;
 }
 
-static const char *format_from_name(const char *filename)
-{
-	const char *ext = strrchr(filename, '.');
-	if (!ext)
-		return NULL;
-	ext++;
-	if (!strcasecmp(ext, "zip"))
-		return "--format=zip";
-	return NULL;
-}
-
 #define PARSE_OPT_KEEP_ALL ( PARSE_OPT_KEEP_DASHDASH | 	\
 			     PARSE_OPT_KEEP_ARGV0 | 	\
 			     PARSE_OPT_KEEP_UNKNOWN |	\
@@ -84,7 +85,6 @@ int cmd_archive(int argc, const char **argv, const char *prefix)
 	const char *exec = "git-upload-archive";
 	const char *output = NULL;
 	const char *remote = NULL;
-	const char *format_option = NULL;
 	struct option local_opts[] = {
 		OPT_STRING('o', "output", &output, "file",
 			"write the archive to this file"),
@@ -98,32 +98,13 @@ int cmd_archive(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, local_opts, NULL,
 			     PARSE_OPT_KEEP_ALL);
 
-	if (output) {
+	if (output)
 		create_output_file(output);
-		format_option = format_from_name(output);
-	}
-
-	/*
-	 * We have enough room in argv[] to muck it in place, because
-	 * --output must have been given on the original command line
-	 * if we get to this point, and parse_options() must have eaten
-	 * it, i.e. we can add back one element to the array.
-	 *
-	 * We add a fake --format option at the beginning, with the
-	 * format inferred from our output filename.  This way explicit
-	 * --format options can override it, and the fake option is
-	 * inserted before any "--" that might have been given.
-	 */
-	if (format_option) {
-		memmove(argv + 2, argv + 1, sizeof(*argv) * argc);
-		argv[1] = format_option;
-		argv[++argc] = NULL;
-	}
 
 	if (remote)
-		return run_remote_archiver(argc, argv, remote, exec);
+		return run_remote_archiver(argc, argv, remote, exec, output);
 
 	setvbuf(stderr, NULL, _IOLBF, BUFSIZ);
 
-	return write_archive(argc, argv, prefix, 1);
+	return write_archive(argc, argv, prefix, 1, output);
 }
diff --git a/builtin/upload-archive.c b/builtin/upload-archive.c
index 73f788e..e6bb97d 100644
--- a/builtin/upload-archive.c
+++ b/builtin/upload-archive.c
@@ -64,7 +64,7 @@ static int run_upload_archive(int argc, const char **argv, const char *prefix)
 	sent_argv[sent_argc] = NULL;
 
 	/* parse all options sent by the client */
-	return write_archive(sent_argc, sent_argv, prefix, 0);
+	return write_archive(sent_argc, sent_argv, prefix, 0, NULL);
 }
 
 __attribute__((format (printf, 1, 2)))
-- 
1.7.6.rc1.4.g49204

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 6/7] archive: match extensions from user-configured formats
  2011-06-15 22:30         ` [RFC/PATCH 0/7] user-configurable git-archive output formats Jeff King
                             ` (4 preceding siblings ...)
  2011-06-15 22:34           ` [PATCH 5/7] archive: refactor format-guessing from filename Jeff King
@ 2011-06-15 22:34           ` Jeff King
  2011-06-15 22:35           ` [PATCH 7/7] archive: provide builtin .tar.gz filter Jeff King
                             ` (2 subsequent siblings)
  8 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-15 22:34 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, J.H., git, git-dev

This lets you configure a format like:

  [tarfilter "tgz"]
    command = gzip
    extension = tgz
    extension = tar.gz

and have it automatically used for "foo.tgz" or "foo.tar.gz".

Signed-off-by: Jeff King <peff@peff.net>
---
 archive-tar-filter.c |   29 +++++++++++++++++++++++++++++
 archive.c            |   12 ++++++++++++
 archive.h            |    1 +
 t/t5000-tar-tree.sh  |    4 ++--
 4 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/archive-tar-filter.c b/archive-tar-filter.c
index ffe510e..e749133 100644
--- a/archive-tar-filter.c
+++ b/archive-tar-filter.c
@@ -39,6 +39,35 @@ struct tar_filter *tar_filter_by_name(const char *name)
 	return tar_filter_by_namelen(name, strlen(name));
 }
 
+static int match_extension(const char *filename, const char *ext)
+{
+	int prefixlen = strlen(filename) - strlen(ext);
+
+	/*
+	 * We need 1 character for the '.', and 1 character to ensure that the
+	 * prefix is non-empty (i.e., we don't match ".tar.gz" with no actual
+	 * filename).
+	 */
+	if (prefixlen < 2 || filename[prefixlen-1] != '.')
+		return 0;
+	return !strcmp(filename + prefixlen, ext);
+}
+
+struct tar_filter *tar_filter_by_extension(const char *filename)
+{
+	struct tar_filter *p;
+
+	for (p = tar_filters; p; p = p->next) {
+		int i;
+		for (i = 0; i < p->extensions.nr; i++) {
+			const char *ext = p->extensions.items[i].string;
+			if (match_extension(filename, ext))
+				return p;
+		}
+	}
+	return NULL;
+}
+
 static int tar_filter_config(const char *var, const char *value, void *data)
 {
 	struct tar_filter *tf;
diff --git a/archive.c b/archive.c
index e04f689..e509b6c 100644
--- a/archive.c
+++ b/archive.c
@@ -434,11 +434,23 @@ int write_archive(int argc, const char **argv, const char *prefix,
 
 const char *archive_format_from_filename(const char *filename)
 {
+	struct tar_filter *tf;
 	const char *ext = strrchr(filename, '.');
 	if (!ext)
 		return NULL;
 	ext++;
 	if (!strcasecmp(ext, "zip"))
 		return "zip";
+
+	/*
+	 * Fallback to user-configured tar filters; but note
+	 * that we might have to load config ourselves, first,
+	 * if we are not being called via write_archive.
+	 */
+	tar_filter_load_config();
+	tf = tar_filter_by_extension(filename);
+	if (tf)
+		return tf->name;
+
 	return NULL;
 }
diff --git a/archive.h b/archive.h
index 894d4c4..80c89dc 100644
--- a/archive.h
+++ b/archive.h
@@ -41,6 +41,7 @@ struct tar_filter {
 
 extern struct tar_filter *tar_filters;
 extern struct tar_filter *tar_filter_by_name(const char *name);
+extern struct tar_filter *tar_filter_by_extension(const char *filename);
 
 extern void tar_filter_load_config(void);
 
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 9f959b1..fe661f3 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -289,7 +289,7 @@ test_expect_success 'archive outputs in configurable format' '
 	test_cmp b.tar config.tar
 '
 
-test_expect_failure 'archive selects implicit format by configured extension' '
+test_expect_success 'archive selects implicit format by configured extension' '
 	git archive -o config-implicit.tar.foo HEAD &&
 	test_cmp config.tar.foo config-implicit.tar.foo &&
 	git archive -o config-implicit.bar HEAD &&
@@ -301,7 +301,7 @@ test_expect_success 'default output format remains tar' '
 	test_cmp b.tar config-implicit.baz
 '
 
-test_expect_failure 'extension matching requires dot' '
+test_expect_success 'extension matching requires dot' '
 	git archive -o config-implicittar.foo HEAD &&
 	test_cmp b.tar config-implicittar.foo
 '
-- 
1.7.6.rc1.4.g49204

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 7/7] archive: provide builtin .tar.gz filter
  2011-06-15 22:30         ` [RFC/PATCH 0/7] user-configurable git-archive output formats Jeff King
                             ` (5 preceding siblings ...)
  2011-06-15 22:34           ` [PATCH 6/7] archive: match extensions from user-configured formats Jeff King
@ 2011-06-15 22:35           ` Jeff King
  2011-06-15 23:55             ` Junio C Hamano
  2011-06-18 14:52           ` [RFC/PATCH 0/7] user-configurable git-archive output formats René Scharfe
  2011-06-18 15:40           ` René Scharfe
  8 siblings, 1 reply; 56+ messages in thread
From: Jeff King @ 2011-06-15 22:35 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, J.H., git, git-dev

This works exactly as if the user had configured it via:

  [tarfilter "tgz"]
	command = gzip
	extension = tgz
	extension = tar.gz
	compressionlevels = true

but since it is so common, it's convenient to have it
builtin without the user needing to do anything.

Signed-off-by: Jeff King <peff@peff.net>
---
 archive-tar-filter.c |   12 ++++++++++++
 t/t5000-tar-tree.sh  |   48 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+), 0 deletions(-)

diff --git a/archive-tar-filter.c b/archive-tar-filter.c
index e749133..de8719a 100644
--- a/archive-tar-filter.c
+++ b/archive-tar-filter.c
@@ -126,6 +126,17 @@ static void remove_filters_without_command(void)
 	}
 }
 
+static void load_builtin_filters(void)
+{
+	struct tar_filter *tf;
+
+	tf = tar_filter_new("tgz", strlen("tgz"));
+	tf->command = xstrdup("gzip");
+	string_list_append(&tf->extensions, "tgz");
+	string_list_append(&tf->extensions, "tar.gz");
+	tf->use_compression = 1;
+}
+
 /*
  * We don't want to load twice, since some of our
  * values actually append rather than overwrite.
@@ -137,6 +148,7 @@ extern void tar_filter_load_config(void)
 		return;
 	tar_filter_config_loaded = 1;
 
+	load_builtin_filters();
 	git_config(tar_filter_config, NULL);
 	remove_filters_without_command();
 }
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index fe661f3..ebad295 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -26,6 +26,7 @@ commit id embedding:
 
 . ./test-lib.sh
 UNZIP=${UNZIP:-unzip}
+GUNZIP=${GUNZIP:-gzip -d}
 
 SUBSTFORMAT=%H%n
 
@@ -306,4 +307,51 @@ test_expect_success 'extension matching requires dot' '
 	test_cmp b.tar config-implicittar.foo
 '
 
+test_expect_success 'git archive --format=tgz' '
+	git archive --format=tgz HEAD >j.tgz
+'
+
+test_expect_success 'infer tgz from .tgz filename' '
+	git archive --output=j1.tgz HEAD &&
+	test_cmp j.tgz j1.tgz
+'
+
+test_expect_success 'infer tgz from .tar.gz filename' '
+	git archive --output=j2.tar.gz HEAD &&
+	test_cmp j.tgz j2.tar.gz
+'
+
+if $GUNZIP --version >/dev/null 2>&1; then
+	test_set_prereq GUNZIP
+else
+	say "Skipping some tgz tests because gunzip was not found"
+fi
+
+test_expect_success GUNZIP 'extract tgz file' '
+	$GUNZIP -c <j.tgz >j.tar &&
+	test_cmp b.tar j.tar
+'
+
+test_expect_success GUNZIP 'tgz allows compression levels' '
+	git archive -1 --output=j3.tgz HEAD
+'
+
+test_expect_success 'disable builtin tgz via config' '
+	git config tarfilter.tgz.command ""
+'
+
+test_expect_success 'disabled filter does not appear in --list' '
+	git archive --list >output &&
+	! grep tgz output
+'
+
+test_expect_success 'disabled filter cannot be used' '
+	test_must_fail git archive --format=tgz HEAD >output
+'
+
+test_expect_success 'disabled filter does not match extensions' '
+	git archive -o disabled.tar.gz HEAD &&
+	test_cmp b.tar disabled.tar.gz
+'
+
 test_done
-- 
1.7.6.rc1.4.g49204

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH 2/2] archive: support gzipped tar files
  2011-06-14 23:40       ` Miles Bader
@ 2011-06-15 22:46         ` Jeff King
  0 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-15 22:46 UTC (permalink / raw)
  To: Miles Bader; +Cc: Junio C Hamano, git, René Scharfe, git-dev

On Wed, Jun 15, 2011 at 08:40:22AM +0900, Miles Bader wrote:

> Jeff King <peff@github.com> writes:
> >> I didn't know it was that easy (primarily because I didn't know zlib had a
> >> ready-to-eat interface to do this).
> >
> > Yes, though I think it may be worth doing the more flexible,
> > external-filters approach. See elsewhere in the thread.
> 
> Given the relatively trivial code, isn't it worth doing both...?
>
> One method for flexibility/multi-threaded-speed, the other for
> portability/robustness (doesn't depend on configuration / setup
> details)...

Maybe, although the code is a little less trivial than I hoped (see
René's response for some bugs in my original series).

My series allowing external filters via configuration also comes with
builtin config for gzip. So there's no extra config or setup details for
the user, assuming they can run "gzip" from their PATH.

So I think the only disadvantage now is for people who don't have gzip
at all. I suspect people on such platforms are going to want another
format anyway, but we could still help them out. However, I think
instead of building it specially into the archive code, it would be
cleaner to simply ship a bare-bones version of gzip that only stdio
(i.e., a "git-gzip"). It would be no more code than what the internal
solution would be, it could be easier to read (since the program is
self-contained), and it benefits from the SMP speedup.

Although at the point we are shipping "git-gzip", I really have to
wonder if people wouldn't prefer to just install gzip themselves. So I'm
inclined to wait until somebody complains that git+zlib are easy to get
on their system, but gzip isn't.

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 2/7] archive: add user-configurable tar-filter infrastructure
  2011-06-15 22:33           ` [PATCH 2/7] archive: add user-configurable tar-filter infrastructure Jeff King
@ 2011-06-15 23:33             ` Junio C Hamano
  2011-06-16  0:29               ` Jeff King
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2011-06-15 23:33 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, Junio C Hamano, J.H., git, git-dev

Jeff King <peff@peff.net> writes:

> Archive supports two output formats: tar and zip. The tar
> ...
> +static struct tar_filter *tar_filter_by_namelen(const char *name,
> +						int len)
> +{
> +	struct tar_filter *p;
> +	for (p = tar_filters; p; p = p->next)
> +		if (!strncmp(p->name, name, len) && !p->name[len])
> +			return p;
> +	return NULL;
> +}

Makes me wonder if we want to have a generic table that is keyed by name
whose contents can be looked up by counted string. string_list is the
closest thing we already have, but I do not think it has counted string
interface (shouldn't be a rocket surgery to add it, though).

> +static int tar_filter_config(const char *var, const char *value, void *data)
> +{
> ...
> +	if (!strcmp(type, "command")) {
> +		if (!value)
> +			return config_error_nonbool(var);
> +		tf->command = xstrdup(value);

Does this result in small leak if the same filter is multiply defined, say
in /etc/gitconfig and then in ~/.gitconfig?

> diff --git a/archive.h b/archive.h
> index 038ac35..8386c46 100644
> --- a/archive.h
> +++ b/archive.h
> @@ -1,6 +1,8 @@
>  #ifndef ARCHIVE_H
>  #define ARCHIVE_H
>  
> +#include "string-list.h"
> +
>  struct archiver_args {
>  	const char *base;
>  	size_t baselen;
> @@ -27,4 +29,17 @@ extern int write_zip_archive(struct archiver_args *);
>  extern int write_archive_entries(struct archiver_args *args, write_archive_entry_fn_t write_entry);
>  extern int write_archive(int argc, const char **argv, const char *prefix, int setup_prefix);
>  
> +struct tar_filter {
> +	char *name;
> +	char *command;
> +	struct string_list extensions;
> +	unsigned use_compression:1;

I suspect that you plan to pass sprintf("-%d", level) for the ones marked
with this bit, but I wonder if we want to give a bit more control on how a
compression level option is shaped for the particular command, and where
on the command line the option comes.  As long as we are targetting gzip
and nothing else it is fine, and I suspect newer compression commands
would try to mimic the -[0-9] command line interface gzip has (e.g. xz),
so this probably is not an issue in practice.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 5/7] archive: refactor format-guessing from filename
  2011-06-15 22:34           ` [PATCH 5/7] archive: refactor format-guessing from filename Jeff King
@ 2011-06-15 23:48             ` Junio C Hamano
  2011-06-16  0:34               ` Jeff King
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2011-06-15 23:48 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, Junio C Hamano, J.H., git, git-dev

Jeff King <peff@peff.net> writes:

> But more important is that the name hint is only a hint, and
> we default to the tar format. Which means that
> inconsistencies between the client's and server's set of
> formats will have confusing results. For example, imagine
> the client learns about "tar.gz" as an extension for gzip'd
> tar ("tgz") files, but the server does not. Locally,
> running:
>
>   git archive -o file.tar.gz HEAD
>
> will produce a gzip'd file. If we make the mapping decision
> locally, then running:
>
>   git archive --remote=origin -o file.tar.gz HEAD
>
> will send "--format=tgz" to the remote side. The server will
> complain, saying that it doesn't know about the tgz format.

As long as that complaint is clearly marked as coming from the remote
side, the user now knows that tgz is not supported, and can fall back to a
plain tar.

Am I being naïve thinking that barfing (and assuming that the user
understands why the remote end barfed) actually is a good thing?

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 7/7] archive: provide builtin .tar.gz filter
  2011-06-15 22:35           ` [PATCH 7/7] archive: provide builtin .tar.gz filter Jeff King
@ 2011-06-15 23:55             ` Junio C Hamano
  2011-06-15 23:57               ` Junio C Hamano
  2011-06-16  0:38               ` Jeff King
  0 siblings, 2 replies; 56+ messages in thread
From: Junio C Hamano @ 2011-06-15 23:55 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, J.H., git, git-dev

Jeff King <peff@peff.net> writes:

> +test_expect_success 'git archive --format=tgz' '
> +	git archive --format=tgz HEAD >j.tgz
> +'
> +
> +test_expect_success 'infer tgz from .tgz filename' '
> +	git archive --output=j1.tgz HEAD &&
> +	test_cmp j.tgz j1.tgz
> +'

I suspect this would get intermittent failures for the same reason as
0c8c385 (gitweb: supply '-n' to gzip for identical output, 2011-04-26)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 7/7] archive: provide builtin .tar.gz filter
  2011-06-15 23:55             ` Junio C Hamano
@ 2011-06-15 23:57               ` Junio C Hamano
  2011-06-16  0:38               ` Jeff King
  1 sibling, 0 replies; 56+ messages in thread
From: Junio C Hamano @ 2011-06-15 23:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, René Scharfe, J.H., git, git-dev

Junio C Hamano <gitster@pobox.com> writes:

> Jeff King <peff@peff.net> writes:
>
>> +test_expect_success 'git archive --format=tgz' '
>> +	git archive --format=tgz HEAD >j.tgz
>> +'
>> +
>> +test_expect_success 'infer tgz from .tgz filename' '
>> +	git archive --output=j1.tgz HEAD &&
>> +	test_cmp j.tgz j1.tgz
>> +'
>
> I suspect this would get intermittent failures for the same reason as
> 0c8c385 (gitweb: supply '-n' to gzip for identical output, 2011-04-26)

To be squashed into 7/7, I guess...

 archive-tar-filter.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/archive-tar-filter.c b/archive-tar-filter.c
index de8719a..d6e4e32 100644
--- a/archive-tar-filter.c
+++ b/archive-tar-filter.c
@@ -131,7 +131,7 @@ static void load_builtin_filters(void)
 	struct tar_filter *tf;
 
 	tf = tar_filter_new("tgz", strlen("tgz"));
-	tf->command = xstrdup("gzip");
+	tf->command = xstrdup("gzip -n");
 	string_list_append(&tf->extensions, "tgz");
 	string_list_append(&tf->extensions, "tar.gz");
 	tf->use_compression = 1;

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH 2/7] archive: add user-configurable tar-filter infrastructure
  2011-06-15 23:33             ` Junio C Hamano
@ 2011-06-16  0:29               ` Jeff King
  0 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-16  0:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, J.H., git, git-dev

On Wed, Jun 15, 2011 at 04:33:33PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Archive supports two output formats: tar and zip. The tar
> > ...
> > +static struct tar_filter *tar_filter_by_namelen(const char *name,
> > +						int len)
> > +{
> > +	struct tar_filter *p;
> > +	for (p = tar_filters; p; p = p->next)
> > +		if (!strncmp(p->name, name, len) && !p->name[len])
> > +			return p;
> > +	return NULL;
> > +}
> 
> Makes me wonder if we want to have a generic table that is keyed by name
> whose contents can be looked up by counted string. string_list is the
> closest thing we already have, but I do not think it has counted string
> interface (shouldn't be a rocket surgery to add it, though).

I don't know that it would actually make this code significantly clearer
or more efficient. If it were a sorted array, one could do a binary
search, but we are really talking about a handful of elements (if you
did want to refactor, this is almost identical to the matching code in
userdiff, too).

> > +static int tar_filter_config(const char *var, const char *value, void *data)
> > +{
> > ...
> > +	if (!strcmp(type, "command")) {
> > +		if (!value)
> > +			return config_error_nonbool(var);
> > +		tf->command = xstrdup(value);
> 
> Does this result in small leak if the same filter is multiply defined, say
> in /etc/gitconfig and then in ~/.gitconfig?

Yeah, it does. My original version had the builtin gzip statically
allocated, and it wasn't safe to free() anything. But I ended up having
to allocate it dynamically anyway because of the variable-sized list of
extensions, so it would be safe to free(tf->command) here. I'll do that
in my re-roll.

> > +struct tar_filter {
> > +	char *name;
> > +	char *command;
> > +	struct string_list extensions;
> > +	unsigned use_compression:1;
> 
> I suspect that you plan to pass sprintf("-%d", level) for the ones marked
> with this bit, but I wonder if we want to give a bit more control on how a
> compression level option is shaped for the particular command, and where
> on the command line the option comes.  As long as we are targetting gzip
> and nothing else it is fine, and I suspect newer compression commands
> would try to mimic the -[0-9] command line interface gzip has (e.g. xz),
> so this probably is not an issue in practice.

Yeah, I assumed everyone who would want this would support -[0-9]. After
all, all the flag is doing is passing -[0-9] that was supplied to
git-archive. We could allow something like:

  [tarfilter "gzip"]
    command = gzip %(compression)

but I don't see much point. Either you want it or you don't. If there is
a complex mapping of those numbers to some other options in your
command, then point git to a helper script which does the conversion
and then execs your command.

If somebody has a counterexample, I'd be curious to hear it.

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 5/7] archive: refactor format-guessing from filename
  2011-06-15 23:48             ` Junio C Hamano
@ 2011-06-16  0:34               ` Jeff King
  0 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-16  0:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, J.H., git, git-dev

On Wed, Jun 15, 2011 at 04:48:32PM -0700, Junio C Hamano wrote:

> > will produce a gzip'd file. If we make the mapping decision
> > locally, then running:
> >
> >   git archive --remote=origin -o file.tar.gz HEAD
> >
> > will send "--format=tgz" to the remote side. The server will
> > complain, saying that it doesn't know about the tgz format.
> 
> As long as that complaint is clearly marked as coming from the remote
> side, the user now knows that tgz is not supported, and can fall back to a
> plain tar.

It is. You can simulate with:

  $ git archive --format=foobar --remote=. HEAD
  remote: fatal: Unknown archive format 'foobar'
  remote: git upload-archive: archiver died with error
  fatal: sent error to the client: git upload-archive: archiver died with error

Or if you want to be really thorough and you have an old git lying
around, you can see the format auto-selection triggers the same code:

  $ git archive -o foo.tar.gz --remote=. --exec='git.v1.7.5 upload-archive' HEAD
  remote: fatal: Unknown archive format 'tgz'
  fatal: sent error to the client: git upload-archive: archiver died with error
  remote: git upload-archive: archiver died with error

> Am I being naïve thinking that barfing (and assuming that the user
> understands why the remote end barfed) actually is a good thing?

No, I think barfing is totally fine there. I worry more about the cases
where we silently produce a format the user was not expecting (the way
mine is coded, it is "the server knows about .tar.gz, but the client
does not; I expect gzip, but I get regular tar").

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 7/7] archive: provide builtin .tar.gz filter
  2011-06-15 23:55             ` Junio C Hamano
  2011-06-15 23:57               ` Junio C Hamano
@ 2011-06-16  0:38               ` Jeff King
  2011-06-16  6:27                 ` Junio C Hamano
  1 sibling, 1 reply; 56+ messages in thread
From: Jeff King @ 2011-06-16  0:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, J.H., git, git-dev

On Wed, Jun 15, 2011 at 04:55:57PM -0700, Junio C Hamano wrote:

> > +test_expect_success 'infer tgz from .tgz filename' '
> > +	git archive --output=j1.tgz HEAD &&
> > +	test_cmp j.tgz j1.tgz
> > +'
> 
> I suspect this would get intermittent failures for the same reason as
> 0c8c385 (gitweb: supply '-n' to gzip for identical output, 2011-04-26)

Ick, yeah. I pulled these tests from my original internal
implementation, which I suspect may have been more stable.

The filename will always be stdin, which is OK, but the timestamp will
probably get us.

> diff --git a/archive-tar-filter.c b/archive-tar-filter.c
> index de8719a..d6e4e32 100644
> --- a/archive-tar-filter.c
> +++ b/archive-tar-filter.c
> @@ -131,7 +131,7 @@ static void load_builtin_filters(void)
>  	struct tar_filter *tf;
>  
>  	tf = tar_filter_new("tgz", strlen("tgz"));
> -	tf->command = xstrdup("gzip");
> +	tf->command = xstrdup("gzip -n");
>  	string_list_append(&tf->extensions, "tgz");
>  	string_list_append(&tf->extensions, "tar.gz");
>  	tf->use_compression = 1;

This feels a little wrong, as we are changing what the tool outputs all
the time just to appease a poorly-written test. Maybe nobody cares about
the timestamp field (I certainly don't), but it seems like it might
surprise some users.

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 7/7] archive: provide builtin .tar.gz filter
  2011-06-16  0:38               ` Jeff King
@ 2011-06-16  6:27                 ` Junio C Hamano
  2011-06-16  6:51                   ` Jeff King
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2011-06-16  6:27 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, J.H., git, git-dev

Jeff King <peff@github.com> writes:

> This feels a little wrong, as we are changing what the tool outputs all
> the time just to appease a poorly-written test.

Now you confused me.  Isn't '-n' to tell the tool *not* to write timestamp
out, so that we can avoid changing what the tool outputs all the time?

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 7/7] archive: provide builtin .tar.gz filter
  2011-06-16  6:27                 ` Junio C Hamano
@ 2011-06-16  6:51                   ` Jeff King
  2011-06-16  7:56                     ` Chris Webb
  0 siblings, 1 reply; 56+ messages in thread
From: Jeff King @ 2011-06-16  6:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, J.H., git, git-dev

On Wed, Jun 15, 2011 at 11:27:26PM -0700, Junio C Hamano wrote:

> Jeff King <peff@github.com> writes:
> 
> > This feels a little wrong, as we are changing what the tool outputs all
> > the time just to appease a poorly-written test.
> 
> Now you confused me.  Isn't '-n' to tell the tool *not* to write timestamp
> out, so that we can avoid changing what the tool outputs all the time?

No, I mean that people may _want_ the timestamp in day to day use. Using
"-n" all the time suppresses it. And there is no reason to suppress it,
except that our test does not account for it properly. So your patch is
hurting people who don't want "-n" (i.e., want the timestamp) just to
make our test happy.

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 7/7] archive: provide builtin .tar.gz filter
  2011-06-16  6:51                   ` Jeff King
@ 2011-06-16  7:56                     ` Chris Webb
  2011-06-16 17:46                       ` Jeff King
  0 siblings, 1 reply; 56+ messages in thread
From: Chris Webb @ 2011-06-16  7:56 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Ren?? Scharfe, J.H., git, git-dev

Jeff King <peff@github.com> writes:

> No, I mean that people may _want_ the timestamp in day to day use. Using
> "-n" all the time suppresses it. And there is no reason to suppress it,
> except that our test does not account for it properly. So your patch is
> hurting people who don't want "-n" (i.e., want the timestamp) just to
> make our test happy.

It's useful to omit the timestamp outside of git too. Source-based package
management systems generally store a URL from which to fetch a source
tarball, and a hash of that source tarball to ensure it hasn't been tampered
with. It's nice to be able to use a gitweb URL like

  http://git.kernel.org/?p=git/git.git;a=snapshot;h=e5af0de202e885b793482d416b8ce9d50dd2b8bc;sf=tgz

as the tarball source, and still be able to verify its integrity against a
prestored hash.

Cheers,

Chris.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 7/7] archive: provide builtin .tar.gz filter
  2011-06-16  7:56                     ` Chris Webb
@ 2011-06-16 17:46                       ` Jeff King
  2011-06-16 18:02                         ` Junio C Hamano
  0 siblings, 1 reply; 56+ messages in thread
From: Jeff King @ 2011-06-16 17:46 UTC (permalink / raw)
  To: Chris Webb; +Cc: Junio C Hamano, Ren?? Scharfe, J.H., git, git-dev

On Thu, Jun 16, 2011 at 08:56:21AM +0100, Chris Webb wrote:

> Jeff King <peff@github.com> writes:
> 
> > No, I mean that people may _want_ the timestamp in day to day use. Using
> > "-n" all the time suppresses it. And there is no reason to suppress it,
> > except that our test does not account for it properly. So your patch is
> > hurting people who don't want "-n" (i.e., want the timestamp) just to
> > make our test happy.
> 
> It's useful to omit the timestamp outside of git too. Source-based package
> management systems generally store a URL from which to fetch a source
> tarball, and a hash of that source tarball to ensure it hasn't been tampered
> with. It's nice to be able to use a gitweb URL like
> 
>   http://git.kernel.org/?p=git/git.git;a=snapshot;h=e5af0de202e885b793482d416b8ce9d50dd2b8bc;sf=tgz
> 
> as the tarball source, and still be able to verify its integrity against a
> prestored hash.

OK. I'm totally willing to accept that people actually prefer the "-n"
behavior. I don't care either way myself. I just don't want the reason
to default to "-n" to be "because our test scripts need it" and not
"because this is what people actually want".

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 7/7] archive: provide builtin .tar.gz filter
  2011-06-16 17:46                       ` Jeff King
@ 2011-06-16 18:02                         ` Junio C Hamano
  2011-06-16 18:21                           ` Jeff King
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2011-06-16 18:02 UTC (permalink / raw)
  To: Jeff King; +Cc: Chris Webb, Junio C Hamano, Ren?? Scharfe, J.H., git, git-dev

Jeff King <peff@github.com> writes:

> OK. I'm totally willing to accept that people actually prefer the "-n"
> behavior. I don't care either way myself. I just don't want the reason
> to default to "-n" to be "because our test scripts need it" and not
> "because this is what people actually want".

Surely I share the exact feeling, and that is why I quoted the other "-n"
added to gitweb because that was what people actually wanted.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 7/7] archive: provide builtin .tar.gz filter
  2011-06-16 18:02                         ` Junio C Hamano
@ 2011-06-16 18:21                           ` Jeff King
  2011-06-16 18:27                             ` John Szakmeister
  2011-06-16 18:42                             ` Junio C Hamano
  0 siblings, 2 replies; 56+ messages in thread
From: Jeff King @ 2011-06-16 18:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Chris Webb, Ren?? Scharfe, J.H., git, git-dev

On Thu, Jun 16, 2011 at 11:02:09AM -0700, Junio C Hamano wrote:

> Jeff King <peff@github.com> writes:
> 
> > OK. I'm totally willing to accept that people actually prefer the "-n"
> > behavior. I don't care either way myself. I just don't want the reason
> > to default to "-n" to be "because our test scripts need it" and not
> > "because this is what people actually want".
> 
> Surely I share the exact feeling, and that is why I quoted the other "-n"
> added to gitweb because that was what people actually wanted.

Fair enough. I'll use "gzip -n" in my re-roll.

Any comment on the "tarfilter" versus "generic archive filter" issue, or
on the general interface? I think getting that right is my biggest issue
in moving forward.

Also, since it's easy via the external helper route, should there be any
other builtin formats? Bzip2? It's not that big a deal for a big hosting
site like kernel.org to stick it in their configuration, but I wonder if
normal users would find it useful.

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 7/7] archive: provide builtin .tar.gz filter
  2011-06-16 18:21                           ` Jeff King
@ 2011-06-16 18:27                             ` John Szakmeister
  2011-06-16 18:42                             ` Junio C Hamano
  1 sibling, 0 replies; 56+ messages in thread
From: John Szakmeister @ 2011-06-16 18:27 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Chris Webb, Ren?? Scharfe, J.H., git, git-dev

On Thu, Jun 16, 2011 at 2:21 PM, Jeff King <peff@github.com> wrote:
[snip]
> Also, since it's easy via the external helper route, should there be any
> other builtin formats? Bzip2? It's not that big a deal for a big hosting
> site like kernel.org to stick it in their configuration, but I wonder if
> normal users would find it useful.

I'd certainly find it useful.

-John

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 7/7] archive: provide builtin .tar.gz filter
  2011-06-16 18:21                           ` Jeff King
  2011-06-16 18:27                             ` John Szakmeister
@ 2011-06-16 18:42                             ` Junio C Hamano
  2011-06-16 18:57                               ` Jeff King
  1 sibling, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2011-06-16 18:42 UTC (permalink / raw)
  To: Jeff King; +Cc: Chris Webb, Ren?? Scharfe, J.H., git, git-dev

Jeff King <peff@github.com> writes:

> Also, since it's easy via the external helper route, should there be any
> other builtin formats? Bzip2? It's not that big a deal for a big hosting
> site like kernel.org to stick it in their configuration, but I wonder if
> normal users would find it useful.

I know k.org statically prepares *.bz2 for any *.gz in /pub/software/
hierarchy, but isn't bzip2 significantly more expensive (like taking ten
times as much memory or four times as much CPI time to squeeze the last
extra 15% out), making it unsuitable for one-shot online use?

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 7/7] archive: provide builtin .tar.gz filter
  2011-06-16 18:42                             ` Junio C Hamano
@ 2011-06-16 18:57                               ` Jeff King
  0 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-16 18:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Chris Webb, Ren?? Scharfe, J.H., git, git-dev

On Thu, Jun 16, 2011 at 11:42:17AM -0700, Junio C Hamano wrote:

> Jeff King <peff@github.com> writes:
> 
> > Also, since it's easy via the external helper route, should there be any
> > other builtin formats? Bzip2? It's not that big a deal for a big hosting
> > site like kernel.org to stick it in their configuration, but I wonder if
> > normal users would find it useful.
> 
> I know k.org statically prepares *.bz2 for any *.gz in /pub/software/
> hierarchy, but isn't bzip2 significantly more expensive (like taking ten
> times as much memory or four times as much CPI time to squeeze the last
> extra 15% out), making it unsuitable for one-shot online use?

I will let J.H. comment on how appropriate that is for k.org; he is the
one who mentioned bzip2 in the first place earlier in the thread.

Even if kernel.org wants it, it is a departure from what we now, though.
People who set up remote upload-archive access long ago and upgrade git
would now suddenly let anyone convince their machines to chew up a lot
of CPU. I don't mind too much doing it for gzip, which takes roughly the
same amount of time as zip, which already exists. Silently adding other
formats may be worse, though.

At the same time, it would be nice for people running git-archive
locally to avoid having to configure a bzip2 filter manually. Maybe
upload-archive should ignore these filters by default, and require a
special config variable to enable them?

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC/PATCH 0/7] user-configurable git-archive output formats
  2011-06-15 22:30         ` [RFC/PATCH 0/7] user-configurable git-archive output formats Jeff King
                             ` (6 preceding siblings ...)
  2011-06-15 22:35           ` [PATCH 7/7] archive: provide builtin .tar.gz filter Jeff King
@ 2011-06-18 14:52           ` René Scharfe
  2011-06-18 15:28             ` Jakub Narebski
                               ` (2 more replies)
  2011-06-18 15:40           ` René Scharfe
  8 siblings, 3 replies; 56+ messages in thread
From: René Scharfe @ 2011-06-18 14:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, J.H., git, git-dev

Am 16.06.2011 00:30, schrieb Jeff King:
> On Tue, Jun 14, 2011 at 04:45:21PM -0400, Jeff King wrote:
> 
>> The gzip path is not configurable at all. Probably it should read the
>> path and arguments from the config file. In fact, we could even allow
>> arbitrary config like:
>>
>>   [tarfilter "tgz"]
>>     command = gzip -c
>>     extension = tgz
>>     extension = tar.gz

Configuration options whose values are appended instead of overwritten
by duplicate definitions are a new concept for git, I think.  Perhaps
it's not a big thing, but I think it's better avoided.

The only (stupid) practical shortcoming I can think if is this, though:
You can't remove anything from the list of supported extensions in a
user config if the system config already contains e.g. tgz and tar.gz.

> Here's a series implementing that. You can configure whatever you want,
> and it includes builtin gzip configuration by default. You can override
> to turn it off, or even switch it to run something like pigz instead.
> 
> My biggest reservation with the patches as-is is that they are very
> tar-centric and not orthogonal. Specifically, they won't handle:
> 
>   1. Other streamable archive formats you would want to pipe through
>      compressors. Do any of these actually exist? I guess we could offer
>      "pax" as a format eventually, and it might be like tar with
>      different defaults? I dunno.
> 
>      Fixing this would not be too hard. Instead of these being
>      "tarfilters", they would be "archive filters", and they would chain
>      to some format, defaulting to "tar".  Since there is no other
>      format right now, we could even punt on writing most of the code
>      until somebody adds one. But we would want to get the naming of the
>      config options right, since those are user-facing. Maybe
>      "archivefilter" (unfortunately the more readable archive.filter is
>      a little awkward with the way we parse config files)?

The pax format is identical to the ustar format, which --format=tar
produces.  The other major format that comes to mind is cpio.  The
(never merged) predecessor of tar-tree actually used that format.

Since then I have been waiting for users to request being able to export
using cpio format (which is simpler and slightly smaller than tar), but
that never happened.  It seems the existence of the pax format really
has pacified the tar vs. cpio war of old.

I'm not sure "filter" is a good name, though.  We have core.pager, which
is technically a filter as well, but for a specific purpose.  And we
have the tar.umask setting as a precedence for format specfic config
options.  So how about tar.<extension>.compressor?

	[tar "tgz"]
		compressor = gzip -cn
	[tar "tar.gz"]
		compressor = gzip -cn
	[tar "tar.bz2"]
		compressor = bzip2 -c

We don't need a compressionlevels option here because we can simply
assume that the compressor commands do support them.  (Side note: this
is not fully true for bzip2, as it doesn't support -0, but I don't think
this is worth special consideration in our code, as long as errors of
the filter are displayed properly.)

And we can also add a config option to restrict the formats creatable by
upload-archive, to address concerns over DoS attacks with expensive
compressors:

	[archive]
		remoteFormats = tar zip tgz tar.gz

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC/PATCH 0/7] user-configurable git-archive output formats
  2011-06-18 14:52           ` [RFC/PATCH 0/7] user-configurable git-archive output formats René Scharfe
@ 2011-06-18 15:28             ` Jakub Narebski
  2011-06-20 15:58             ` Junio C Hamano
  2011-06-21 16:01             ` [RFC/PATCH 0/7] user-configurable git-archive output formats Jeff King
  2 siblings, 0 replies; 56+ messages in thread
From: Jakub Narebski @ 2011-06-18 15:28 UTC (permalink / raw)
  To: René Scharfe; +Cc: Jeff King, Junio C Hamano, J.H., git, git-dev

René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
> Am 16.06.2011 00:30, schrieb Jeff King:
>> On Tue, Jun 14, 2011 at 04:45:21PM -0400, Jeff King wrote:
>> 
>>> The gzip path is not configurable at all. Probably it should read the
>>> path and arguments from the config file. In fact, we could even allow
>>> arbitrary config like:
>>>
>>>   [tarfilter "tgz"]
>>>     command = gzip -c
>>>     extension = tgz
>>>     extension = tar.gz
> 
> Configuration options whose values are appended instead of overwritten
> by duplicate definitions are a new concept for git, I think.  Perhaps
> it's not a big thing, but I think it's better avoided.

Actually they are not a new concept, but they are quite rare.  `git config`
has even special options to deal with them ('--replace-all', '--add', 
'--get-all', '--unset-all'.

For example "core.gitProxy" can be set multiple times, and of course
"remote.<remotename>.fetch", 'pull' and 'url'.

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC/PATCH 0/7] user-configurable git-archive output formats
  2011-06-15 22:30         ` [RFC/PATCH 0/7] user-configurable git-archive output formats Jeff King
                             ` (7 preceding siblings ...)
  2011-06-18 14:52           ` [RFC/PATCH 0/7] user-configurable git-archive output formats René Scharfe
@ 2011-06-18 15:40           ` René Scharfe
  8 siblings, 0 replies; 56+ messages in thread
From: René Scharfe @ 2011-06-18 15:40 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, J.H., git, git-dev

Am 16.06.2011 00:30, schrieb Jeff King:
>   2. In theory you might want to plug in external helpers that are not
>      just stream filters, but actually their own container formats (like
>      zip). I think people who want 7zip would want this.
> 
>      But how does git-archive interact with the helper? By definition
>      the data it wants is the set of files, not a single stream. So
>      either:
> 
>        a. We give the helper a temporary exported checkout, and it
>           generates the stream from that.
> 
>        b. We use tar as the lingua franca of streaming file containers,
>           and let the helper deal with converting to its preferred
>           output format.
> 
>       Option (a) seems horribly inefficient on disk I/O. And if we did
>       want to do that, I think it's largely unrelated to this patch
>       series.
> 
>       You can actually do option (b) with this series. In its worst
>       case, you can do the same as (a): just untar into a temporary
>       directory and compress from there. But a well-written helper could
>       convert tar into the output format on the fly.

Both can be done today, locally.  One just needs to add a tar-to-7z/any
program for 2.b). :)

Currently the easiest way to apply LZMA compression would be to pipe
--format=tar through the xz utils.  This method is supported by your
patches and the result can be read by 7-Zip.  It should be good enough
for most uses; I think we can disregard the whole point 2 until users
start to ask for these other formats or use cases.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC/PATCH 0/7] user-configurable git-archive output formats
  2011-06-18 14:52           ` [RFC/PATCH 0/7] user-configurable git-archive output formats René Scharfe
  2011-06-18 15:28             ` Jakub Narebski
@ 2011-06-20 15:58             ` Junio C Hamano
  2011-06-22  1:19               ` [PATCHv2 0/9] configurable tar compressors Jeff King
  2011-06-21 16:01             ` [RFC/PATCH 0/7] user-configurable git-archive output formats Jeff King
  2 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2011-06-20 15:58 UTC (permalink / raw)
  To: René Scharfe; +Cc: Jeff King, J.H., git, git-dev

René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:

> I'm not sure "filter" is a good name, though.  We have core.pager, which
> is technically a filter as well, but for a specific purpose.  And we
> have the tar.umask setting as a precedence for format specfic config
> options.  So how about tar.<extension>.compressor?
>
> 	[tar "tgz"]
> 		compressor = gzip -cn
> 	[tar "tar.gz"]
> 		compressor = gzip -cn
> 	[tar "tar.bz2"]
> 		compressor = bzip2 -c
>
> We don't need a compressionlevels option here because we can simply
> assume that the compressor commands do support them.  (Side note: this
> is not fully true for bzip2, as it doesn't support -0, but I don't think
> this is worth special consideration in our code, as long as errors of
> the filter are displayed properly.)
>
> And we can also add a config option to restrict the formats creatable by
> upload-archive, to address concerns over DoS attacks with expensive
> compressors:
>
> 	[archive]
> 		remoteFormats = tar zip tgz tar.gz

Both sounds sensible, I think.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC/PATCH 0/7] user-configurable git-archive output formats
  2011-06-18 14:52           ` [RFC/PATCH 0/7] user-configurable git-archive output formats René Scharfe
  2011-06-18 15:28             ` Jakub Narebski
  2011-06-20 15:58             ` Junio C Hamano
@ 2011-06-21 16:01             ` Jeff King
  2 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-21 16:01 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, J.H., git, git-dev

On Sat, Jun 18, 2011 at 04:52:02PM +0200, René Scharfe wrote:

> >> The gzip path is not configurable at all. Probably it should read the
> >> path and arguments from the config file. In fact, we could even allow
> >> arbitrary config like:
> >>
> >>   [tarfilter "tgz"]
> >>     command = gzip -c
> >>     extension = tgz
> >>     extension = tar.gz
> 
> Configuration options whose values are appended instead of overwritten
> by duplicate definitions are a new concept for git, I think.  Perhaps
> it's not a big thing, but I think it's better avoided.
> 
> The only (stupid) practical shortcoming I can think if is this, though:
> You can't remove anything from the list of supported extensions in a
> user config if the system config already contains e.g. tgz and tar.gz.

Yeah, I have mixed feelings on that.

As Jakub pointed out, we already have them in several places. I don't
know that removal is that big a deal in this instance. If we did want to
support it, I think it would make more sense to have a generic solution
at the config level, like:

  [some-section]
    multivalue = foo
    multivalue = bar
    !multivalue
    multivalue = baz
    multivalue = whee

at which point the value is ("baz", "whee"). That matches what we do on
the command line, where:

  git foo --multivalue=foo --multivalue=bar --no-multivalue \
          --multivalue=baz --multivalue=whee

handles the same issue in a similar way.

The other option, of course, is having a single value with list
semantics. But then you have to invent separator syntax. In this
instance whitespace would probably be fine, but I'd rather that each new
multi-valued option did not invent its own syntax, and in the general
case you may need to handle quoting. Plus you may need some kind of
append syntax. For example, if we support "tgz" and "tar.gz" internally,
how do you say 'add "pax.gz"' to that list without reiterating the whole
list?

> The pax format is identical to the ustar format, which --format=tar
> produces.  The other major format that comes to mind is cpio.  The
> (never merged) predecessor of tar-tree actually used that format.

Thanks, cpio is probably the most likely example.

> Since then I have been waiting for users to request being able to export
> using cpio format (which is simpler and slightly smaller than tar), but
> that never happened.  It seems the existence of the pax format really
> has pacified the tar vs. cpio war of old.

Fair enough. I haven't heard anybody clamoring for it either. I just
didn't want to paint us into a corner. Since it seems like the most
likely format and nobody really wants it, it's perhaps not worth
worrying about.

> I'm not sure "filter" is a good name, though.  We have core.pager, which
> is technically a filter as well, but for a specific purpose.

Yeah, any name would have to be "archive filter" or similar. But I would
think being under the "tar" section would be enough to disambiguate it.

> And we have the tar.umask setting as a precedence for format specfic
> config options.  So how about tar.<extension>.compressor?
> 
> 	[tar "tgz"]
> 		compressor = gzip -cn
> 	[tar "tar.gz"]
> 		compressor = gzip -cn
> 	[tar "tar.bz2"]
> 		compressor = bzip2 -c

My two complaints are:

  1. The user has to repeat themselves in describing the command for
     multiple extensions. In practice, that's probably not a big deal,
     though.

  2. The namespace for user-defined extensions is the same as the
     namespace for tar options. I guess we can disambiguate based on the
     number of dots (so, e.g., I know that "tar.umask" is not the umask
     extension, because it doesn't have a third component). It does
     limit us a little bit for adding future options.

     I don't know if it's worth caring about. We have the same problem
     with the diff.* namespace (e.g., diff.color.* exists, but is not a
     userdiff driver). In that case, besides the code being a little
     careful to be tolerant of the clash, I don't think it has been a
     problem.

> We don't need a compressionlevels option here because we can simply
> assume that the compressor commands do support them.

But we discussed elsewhere the concept of a tar-to-7z filter. I'm not
sure I'd call that a "compressor" as much as a filter. And it wouldn't
want the compression-level options (or maybe you would; I don't use it,
but skimming the manpage, it looks like you would want to convert -5
into "-mx=5"; so maybe you would want a wrapper script anyway).

> (Side note: this is not fully true for bzip2, as it doesn't support
> -0, but I don't think this is worth special consideration in our code,
> as long as errors of the filter are displayed properly.)

Yeah, I think that can be ignored. bzip can take care of complaining
itself.

> And we can also add a config option to restrict the formats creatable by
> upload-archive, to address concerns over DoS attacks with expensive
> compressors:
> 
> 	[archive]
> 		remoteFormats = tar zip tgz tar.gz

Right. It does have the ad-hoc list syntax I complained about above,
though.

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCHv2 0/9] configurable tar compressors
  2011-06-20 15:58             ` Junio C Hamano
@ 2011-06-22  1:19               ` Jeff King
  2011-06-22  1:20                 ` [PATCHv2 1/9] archive: reorder option parsing and config reading Jeff King
                                   ` (8 more replies)
  0 siblings, 9 replies; 56+ messages in thread
From: Jeff King @ 2011-06-22  1:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, J.H., git, git-dev

Here's my re-roll. In addition to integrating comments on the list, I
made the integration with the archive code a little smoother. Filters
now appear as their own "struct archiver" in the list, and don't have to
be special-cased everywhere.

  [1/9]: archive: reorder option parsing and config reading
  [2/9]: archive-tar: don't reload default config options
  [3/9]: archive: refactor list of archive formats
  [4/9]: archive: pass archiver struct to write_archive callback
  [5/9]: archive: move file extension format-guessing lower
  [6/9]: archive: refactor file extension format-guessing
  [7/9]: archive: implement configurable tar filters
  [8/9]: archive: provide builtin .tar.gz filter
  [9/9]: upload-archive: allow user to turn off filters

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCHv2 1/9] archive: reorder option parsing and config reading
  2011-06-22  1:19               ` [PATCHv2 0/9] configurable tar compressors Jeff King
@ 2011-06-22  1:20                 ` Jeff King
  2011-06-22  1:22                 ` [PATCHv2 2/9] archive-tar: don't reload default config options Jeff King
                                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-22  1:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, J.H., git, git-dev

The archive command does three things during its
initialization phase:

  1. parse command-line options

  2. setup the git directory

  3. read config

During phase (1), if we see any options that do not require
a git directory (like "--list"), we handle them immediately
and exit, making it safe to abort step (2) if we are not in
a git directory.

Step (3) must come after step (2), since the git directory
may influence configuration.  However, this leaves no
possibility of configuration from step (3) impacting the
command-line options in step (1) (which is useful, for
example, for supporting user-configurable output formats).

Instead, let's reorder this to:

  1. setup the git directory, if it exists

  2. read config

  3. parse command-line options

  4. if we are not in a git repository, die

This should have the same external behavior, but puts
configuration before command-line parsing.

Signed-off-by: Jeff King <peff@peff.net>
---
Same as v1.

 archive.c |   18 ++++++++++++++----
 1 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/archive.c b/archive.c
index 42f2d2f..2616676 100644
--- a/archive.c
+++ b/archive.c
@@ -387,17 +387,27 @@ static int parse_archive_args(int argc, const char **argv,
 int write_archive(int argc, const char **argv, const char *prefix,
 		int setup_prefix)
 {
+	int nongit = 0;
 	const struct archiver *ar = NULL;
 	struct archiver_args args;
 
-	argc = parse_archive_args(argc, argv, &ar, &args);
 	if (setup_prefix && prefix == NULL)
-		prefix = setup_git_directory();
+		prefix = setup_git_directory_gently(&nongit);
+
+	git_config(git_default_config, NULL);
+
+	argc = parse_archive_args(argc, argv, &ar, &args);
+	if (nongit) {
+		/*
+		 * We know this will die() with an error, so we could just
+		 * die ourselves; but its error message will be more specific
+		 * than what we could write here.
+		 */
+		setup_git_directory();
+	}
 
 	parse_treeish_arg(argv, &args, prefix);
 	parse_pathspec_arg(argv + 1, &args);
 
-	git_config(git_default_config, NULL);
-
 	return ar->write_archive(&args);
 }
-- 
1.7.5.4.44.g4b107

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCHv2 2/9] archive-tar: don't reload default config options
  2011-06-22  1:19               ` [PATCHv2 0/9] configurable tar compressors Jeff King
  2011-06-22  1:20                 ` [PATCHv2 1/9] archive: reorder option parsing and config reading Jeff King
@ 2011-06-22  1:22                 ` Jeff King
  2011-06-22  1:23                 ` [PATCHv2 3/9] archive: refactor list of archive formats Jeff King
                                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-22  1:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, J.H., git, git-dev

We load our own tar-specific config, and then chain to
git_default_config. This is pointless, as our caller should
already have loaded the default config. It also introduces a
needless inconsistency with the zip archiver, which does not
look at the config files at all (and therefore relies on the
caller to have loaded config).

Signed-off-by: Jeff King <peff@peff.net>
---
Last time the tar-filter code was not integrated with the tar code.
Since we are now under tar.*, and since the tar code reads the config
anyway, it makes sense for us to use that config callback. This was just
a little nit I noticed while changing it.

 archive-tar.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index cee06ce..1ab1a2c 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -231,7 +231,7 @@ static int git_tar_config(const char *var, const char *value, void *cb)
 		}
 		return 0;
 	}
-	return git_default_config(var, value, cb);
+	return 0;
 }
 
 int write_tar_archive(struct archiver_args *args)
-- 
1.7.5.4.44.g4b107

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCHv2 3/9] archive: refactor list of archive formats
  2011-06-22  1:19               ` [PATCHv2 0/9] configurable tar compressors Jeff King
  2011-06-22  1:20                 ` [PATCHv2 1/9] archive: reorder option parsing and config reading Jeff King
  2011-06-22  1:22                 ` [PATCHv2 2/9] archive-tar: don't reload default config options Jeff King
@ 2011-06-22  1:23                 ` Jeff King
  2011-06-23 17:05                   ` Thiago Farina
  2011-06-22  1:24                 ` [PATCHv2 4/9] archive: pass archiver struct to write_archive callback Jeff King
                                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 56+ messages in thread
From: Jeff King @ 2011-06-22  1:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, J.H., git, git-dev

Most of the tar and zip code was nicely split out into two
abstracted files which knew only about their specific
formats. The entry point to this code was a single "write
archive" function.

However, as these basic formats grow more complex (e.g., by
handling multiple file extensions and format names), a
static list of the entry point functions won't be enough.
Instead, let's provide a way for the tar and zip code to
tell the main archive code what they support by registering
archiver names and functions.

Signed-off-by: Jeff King <peff@peff.net>
---
New in v2. This turns archivers more into proper objects, rather than a
hard-coded list of functions, and makes the rest of the series much
cleaner.

 archive-tar.c |   16 +++++++++++++---
 archive-zip.c |   13 ++++++++++++-
 archive.c     |   33 +++++++++++++++++----------------
 archive.h     |   17 ++++++++++-------
 4 files changed, 52 insertions(+), 27 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index 1ab1a2c..930375b 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -234,12 +234,10 @@ static int git_tar_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
-int write_tar_archive(struct archiver_args *args)
+static int write_tar_archive(struct archiver_args *args)
 {
 	int err = 0;
 
-	git_config(git_tar_config, NULL);
-
 	if (args->commit_sha1)
 		err = write_global_extended_header(args);
 	if (!err)
@@ -248,3 +246,15 @@ int write_tar_archive(struct archiver_args *args)
 		write_trailer();
 	return err;
 }
+
+static struct archiver tar_archiver = {
+	"tar",
+	write_tar_archive,
+	0
+};
+
+void init_tar_archiver(void)
+{
+	register_archiver(&tar_archiver);
+	git_config(git_tar_config, NULL);
+}
diff --git a/archive-zip.c b/archive-zip.c
index cf28504..a776d83 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -261,7 +261,7 @@ static void dos_time(time_t *time, int *dos_date, int *dos_time)
 	*dos_time = t->tm_sec / 2 + t->tm_min * 32 + t->tm_hour * 2048;
 }
 
-int write_zip_archive(struct archiver_args *args)
+static int write_zip_archive(struct archiver_args *args)
 {
 	int err;
 
@@ -278,3 +278,14 @@ int write_zip_archive(struct archiver_args *args)
 
 	return err;
 }
+
+static struct archiver zip_archiver = {
+	"zip",
+	write_zip_archive,
+	ARCHIVER_WANT_COMPRESSION_LEVELS
+};
+
+void init_zip_archiver(void)
+{
+	register_archiver(&zip_archiver);
+}
diff --git a/archive.c b/archive.c
index 2616676..f0b4e85 100644
--- a/archive.c
+++ b/archive.c
@@ -14,16 +14,15 @@ static char const * const archive_usage[] = {
 	NULL
 };
 
-#define USES_ZLIB_COMPRESSION 1
-
-static const struct archiver {
-	const char *name;
-	write_archive_fn_t write_archive;
-	unsigned int flags;
-} archivers[] = {
-	{ "tar", write_tar_archive },
-	{ "zip", write_zip_archive, USES_ZLIB_COMPRESSION },
-};
+static const struct archiver **archivers;
+static int nr_archivers;
+static int alloc_archivers;
+
+void register_archiver(struct archiver *ar)
+{
+	ALLOC_GROW(archivers, nr_archivers + 1, alloc_archivers);
+	archivers[nr_archivers++] = ar;
+}
 
 static void format_subst(const struct commit *commit,
                          const char *src, size_t len,
@@ -208,9 +207,9 @@ static const struct archiver *lookup_archiver(const char *name)
 	if (!name)
 		return NULL;
 
-	for (i = 0; i < ARRAY_SIZE(archivers); i++) {
-		if (!strcmp(name, archivers[i].name))
-			return &archivers[i];
+	for (i = 0; i < nr_archivers; i++) {
+		if (!strcmp(name, archivers[i]->name))
+			return archivers[i];
 	}
 	return NULL;
 }
@@ -355,8 +354,8 @@ static int parse_archive_args(int argc, const char **argv,
 		base = "";
 
 	if (list) {
-		for (i = 0; i < ARRAY_SIZE(archivers); i++)
-			printf("%s\n", archivers[i].name);
+		for (i = 0; i < nr_archivers; i++)
+			printf("%s\n", archivers[i]->name);
 		exit(0);
 	}
 
@@ -369,7 +368,7 @@ static int parse_archive_args(int argc, const char **argv,
 
 	args->compression_level = Z_DEFAULT_COMPRESSION;
 	if (compression_level != -1) {
-		if ((*ar)->flags & USES_ZLIB_COMPRESSION)
+		if ((*ar)->flags & ARCHIVER_WANT_COMPRESSION_LEVELS)
 			args->compression_level = compression_level;
 		else {
 			die("Argument not supported for format '%s': -%d",
@@ -395,6 +394,8 @@ int write_archive(int argc, const char **argv, const char *prefix,
 		prefix = setup_git_directory_gently(&nongit);
 
 	git_config(git_default_config, NULL);
+	init_tar_archiver();
+	init_zip_archiver();
 
 	argc = parse_archive_args(argc, argv, &ar, &args);
 	if (nongit) {
diff --git a/archive.h b/archive.h
index 038ac35..f39cede 100644
--- a/archive.h
+++ b/archive.h
@@ -14,15 +14,18 @@ struct archiver_args {
 	int compression_level;
 };
 
-typedef int (*write_archive_fn_t)(struct archiver_args *);
+#define ARCHIVER_WANT_COMPRESSION_LEVELS 1
+struct archiver {
+	const char *name;
+	int (*write_archive)(struct archiver_args *);
+	unsigned flags;
+};
+extern void register_archiver(struct archiver *);
 
-typedef int (*write_archive_entry_fn_t)(struct archiver_args *args, const unsigned char *sha1, const char *path, size_t pathlen, unsigned int mode, void *buffer, unsigned long size);
+extern void init_tar_archiver(void);
+extern void init_zip_archiver(void);
 
-/*
- * Archive-format specific backends.
- */
-extern int write_tar_archive(struct archiver_args *);
-extern int write_zip_archive(struct archiver_args *);
+typedef int (*write_archive_entry_fn_t)(struct archiver_args *args, const unsigned char *sha1, const char *path, size_t pathlen, unsigned int mode, void *buffer, unsigned long size);
 
 extern int write_archive_entries(struct archiver_args *args, write_archive_entry_fn_t write_entry);
 extern int write_archive(int argc, const char **argv, const char *prefix, int setup_prefix);
-- 
1.7.5.4.44.g4b107

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCHv2 4/9] archive: pass archiver struct to write_archive callback
  2011-06-22  1:19               ` [PATCHv2 0/9] configurable tar compressors Jeff King
                                   ` (2 preceding siblings ...)
  2011-06-22  1:23                 ` [PATCHv2 3/9] archive: refactor list of archive formats Jeff King
@ 2011-06-22  1:24                 ` Jeff King
  2011-06-22  1:24                 ` [PATCHv2 5/9] archive: move file extension format-guessing lower Jeff King
                                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-22  1:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, J.H., git, git-dev

The current archivers are very static; when you are in the
write_tar_archive function, you know you are writing a tar.
However, to facilitate runtime-configurable archivers
that will share a common write function we need to tell the
function which archiver was used.

As a convenience, we also provide an opaque data pointer in
the archiver struct so that individual archivers can put
something useful there when they register themselves.
Technically they could just use the "name" field to look in
an internal map of names to data, but this is much simpler.

Signed-off-by: Jeff King <peff@peff.net>
---
New in v2; before there was magic special-casing of the tar_filter code.

 archive-tar.c |    3 ++-
 archive-zip.c |    3 ++-
 archive.c     |    2 +-
 archive.h     |    3 ++-
 4 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index 930375b..bed9a9b 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -234,7 +234,8 @@ static int git_tar_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
-static int write_tar_archive(struct archiver_args *args)
+static int write_tar_archive(const struct archiver *ar,
+			     struct archiver_args *args)
 {
 	int err = 0;
 
diff --git a/archive-zip.c b/archive-zip.c
index a776d83..42df660 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -261,7 +261,8 @@ static void dos_time(time_t *time, int *dos_date, int *dos_time)
 	*dos_time = t->tm_sec / 2 + t->tm_min * 32 + t->tm_hour * 2048;
 }
 
-static int write_zip_archive(struct archiver_args *args)
+static int write_zip_archive(const struct archiver *ar,
+			     struct archiver_args *args)
 {
 	int err;
 
diff --git a/archive.c b/archive.c
index f0b4e85..a0a5beb 100644
--- a/archive.c
+++ b/archive.c
@@ -410,5 +410,5 @@ int write_archive(int argc, const char **argv, const char *prefix,
 	parse_treeish_arg(argv, &args, prefix);
 	parse_pathspec_arg(argv + 1, &args);
 
-	return ar->write_archive(&args);
+	return ar->write_archive(ar, &args);
 }
diff --git a/archive.h b/archive.h
index f39cede..b3cf219 100644
--- a/archive.h
+++ b/archive.h
@@ -17,8 +17,9 @@ struct archiver_args {
 #define ARCHIVER_WANT_COMPRESSION_LEVELS 1
 struct archiver {
 	const char *name;
-	int (*write_archive)(struct archiver_args *);
+	int (*write_archive)(const struct archiver *, struct archiver_args *);
 	unsigned flags;
+	void *data;
 };
 extern void register_archiver(struct archiver *);
 
-- 
1.7.5.4.44.g4b107

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCHv2 5/9] archive: move file extension format-guessing lower
  2011-06-22  1:19               ` [PATCHv2 0/9] configurable tar compressors Jeff King
                                   ` (3 preceding siblings ...)
  2011-06-22  1:24                 ` [PATCHv2 4/9] archive: pass archiver struct to write_archive callback Jeff King
@ 2011-06-22  1:24                 ` Jeff King
  2011-06-22  1:25                 ` [PATCHv2 6/9] archive: refactor file extension format-guessing Jeff King
                                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-22  1:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, J.H., git, git-dev

The process for guessing an archive output format based on
the filename is something like this:

  a. parse --output in cmd_archive; check the filename
     against a static set of mapping heuristics (right now
     it just matches ".zip" for zip files).

  b. if found, stick a fake "--format=zip" at the beginning
     of the arguments list (if the user did specify a
     --format manually, the later option will override our
     fake one)

  c. if it's a remote call, ship the arguments to the remote
     (including the fake), which will call write_archive on
     their end

  d. if it's local, ship the arguments to write_archive
     locally

There are two problems:

  1. The set of mappings is static and at too high a level.
     The write_archive level is going to check config for
     user-defined formats, some of which will specify
     extensions. We need to delay lookup until those are
     parsed, so we can match against them.

  2. For a remote archive call, our set of mappings (or
     formats) may not match the remote side's. This is OK in
     practice right now, because all versions of git
     understand "zip" and "tar". But as new formats are
     added, there is going to be a mismatch between what the
     client can do and what the remote server can do.

To fix (1), this patch refactors the location guessing to
happen at the write_archive level, instead of the
cmd_archive level. So instead of sticking a fake --format
field in the argv list, we actually pass a "name hint" down
the callchain; this hint is used at the appropriate time to
guess the format (if one hasn't been given already).

This patch leaves (2) unfixed. The name_hint is converted to
a "--format" option as before, and passed to the remote.
This means the local side's idea of how extensions map to
formats will take precedence.

Another option would be to pass the name hint to the remote
side and let the remote choose. This isn't a good idea for
two reasons:

  1. There's no room in the protocol for passing that
     information. We can pass a new argument, but older
     versions of git on the server will choke on it.

  2. Letting the remote side decide creates a silent
     inconsistency in user experience. Consider the case
     that the locally installed git knows about the "tar.gz"
     format, but a remote server doesn't.

     Running "git archive -o foo.tar.gz" will use the tar.gz
     format. If we use --remote, and the local side chooses
     the format, then we send "--format=tar.gz" to the
     remote, which will complain about the unknown format.
     But if we let the remote side choose the format, then
     it will realize that it doesn't know about "tar.gz" and
     output uncompressed tar without even issuing a warning.

Signed-off-by: Jeff King <peff@peff.net>
---
Code same as v1, but I cleaned up the commit message to be a little less
rambling.

 archive.c                |   25 +++++++++++++++++++---
 archive.h                |    4 ++-
 builtin/archive.c        |   51 ++++++++++++++-------------------------------
 builtin/upload-archive.c |    2 +-
 4 files changed, 41 insertions(+), 41 deletions(-)

diff --git a/archive.c b/archive.c
index a0a5beb..7d0ca32 100644
--- a/archive.c
+++ b/archive.c
@@ -298,9 +298,10 @@ static void parse_treeish_arg(const char **argv,
 	  PARSE_OPT_NOARG | PARSE_OPT_NONEG | PARSE_OPT_HIDDEN, NULL, (p) }
 
 static int parse_archive_args(int argc, const char **argv,
-		const struct archiver **ar, struct archiver_args *args)
+		const struct archiver **ar, struct archiver_args *args,
+		const char *name_hint)
 {
-	const char *format = "tar";
+	const char *format = NULL;
 	const char *base = NULL;
 	const char *remote = NULL;
 	const char *exec = NULL;
@@ -359,6 +360,11 @@ static int parse_archive_args(int argc, const char **argv,
 		exit(0);
 	}
 
+	if (!format && name_hint)
+		format = archive_format_from_filename(name_hint);
+	if (!format)
+		format = "tar";
+
 	/* We need at least one parameter -- tree-ish */
 	if (argc < 1)
 		usage_with_options(archive_usage, opts);
@@ -384,7 +390,7 @@ static int parse_archive_args(int argc, const char **argv,
 }
 
 int write_archive(int argc, const char **argv, const char *prefix,
-		int setup_prefix)
+		  int setup_prefix, const char *name_hint)
 {
 	int nongit = 0;
 	const struct archiver *ar = NULL;
@@ -397,7 +403,7 @@ int write_archive(int argc, const char **argv, const char *prefix,
 	init_tar_archiver();
 	init_zip_archiver();
 
-	argc = parse_archive_args(argc, argv, &ar, &args);
+	argc = parse_archive_args(argc, argv, &ar, &args, name_hint);
 	if (nongit) {
 		/*
 		 * We know this will die() with an error, so we could just
@@ -412,3 +418,14 @@ int write_archive(int argc, const char **argv, const char *prefix,
 
 	return ar->write_archive(ar, &args);
 }
+
+const char *archive_format_from_filename(const char *filename)
+{
+	const char *ext = strrchr(filename, '.');
+	if (!ext)
+		return NULL;
+	ext++;
+	if (!strcasecmp(ext, "zip"))
+		return "zip";
+	return NULL;
+}
diff --git a/archive.h b/archive.h
index b3cf219..202d528 100644
--- a/archive.h
+++ b/archive.h
@@ -29,6 +29,8 @@ extern void init_zip_archiver(void);
 typedef int (*write_archive_entry_fn_t)(struct archiver_args *args, const unsigned char *sha1, const char *path, size_t pathlen, unsigned int mode, void *buffer, unsigned long size);
 
 extern int write_archive_entries(struct archiver_args *args, write_archive_entry_fn_t write_entry);
-extern int write_archive(int argc, const char **argv, const char *prefix, int setup_prefix);
+extern int write_archive(int argc, const char **argv, const char *prefix, int setup_prefix, const char *name_hint);
+
+const char *archive_format_from_filename(const char *filename);
 
 #endif	/* ARCHIVE_H */
diff --git a/builtin/archive.c b/builtin/archive.c
index b14eaba..2578cf5 100644
--- a/builtin/archive.c
+++ b/builtin/archive.c
@@ -24,7 +24,8 @@ static void create_output_file(const char *output_file)
 }
 
 static int run_remote_archiver(int argc, const char **argv,
-			       const char *remote, const char *exec)
+			       const char *remote, const char *exec,
+			       const char *name_hint)
 {
 	char buf[LARGE_PACKET_MAX];
 	int fd[2], i, len, rv;
@@ -37,6 +38,17 @@ static int run_remote_archiver(int argc, const char **argv,
 	transport = transport_get(_remote, _remote->url[0]);
 	transport_connect(transport, "git-upload-archive", exec, fd);
 
+	/*
+	 * Inject a fake --format field at the beginning of the
+	 * arguments, with the format inferred from our output
+	 * filename. This way explicit --format options can override
+	 * it.
+	 */
+	if (name_hint) {
+		const char *format = archive_format_from_filename(name_hint);
+		if (format)
+			packet_write(fd[1], "argument --format=%s\n", format);
+	}
 	for (i = 1; i < argc; i++)
 		packet_write(fd[1], "argument %s\n", argv[i]);
 	packet_flush(fd[1]);
@@ -63,17 +75,6 @@ static int run_remote_archiver(int argc, const char **argv,
 	return !!rv;
 }
 
-static const char *format_from_name(const char *filename)
-{
-	const char *ext = strrchr(filename, '.');
-	if (!ext)
-		return NULL;
-	ext++;
-	if (!strcasecmp(ext, "zip"))
-		return "--format=zip";
-	return NULL;
-}
-
 #define PARSE_OPT_KEEP_ALL ( PARSE_OPT_KEEP_DASHDASH | 	\
 			     PARSE_OPT_KEEP_ARGV0 | 	\
 			     PARSE_OPT_KEEP_UNKNOWN |	\
@@ -84,7 +85,6 @@ int cmd_archive(int argc, const char **argv, const char *prefix)
 	const char *exec = "git-upload-archive";
 	const char *output = NULL;
 	const char *remote = NULL;
-	const char *format_option = NULL;
 	struct option local_opts[] = {
 		OPT_STRING('o', "output", &output, "file",
 			"write the archive to this file"),
@@ -98,32 +98,13 @@ int cmd_archive(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, local_opts, NULL,
 			     PARSE_OPT_KEEP_ALL);
 
-	if (output) {
+	if (output)
 		create_output_file(output);
-		format_option = format_from_name(output);
-	}
-
-	/*
-	 * We have enough room in argv[] to muck it in place, because
-	 * --output must have been given on the original command line
-	 * if we get to this point, and parse_options() must have eaten
-	 * it, i.e. we can add back one element to the array.
-	 *
-	 * We add a fake --format option at the beginning, with the
-	 * format inferred from our output filename.  This way explicit
-	 * --format options can override it, and the fake option is
-	 * inserted before any "--" that might have been given.
-	 */
-	if (format_option) {
-		memmove(argv + 2, argv + 1, sizeof(*argv) * argc);
-		argv[1] = format_option;
-		argv[++argc] = NULL;
-	}
 
 	if (remote)
-		return run_remote_archiver(argc, argv, remote, exec);
+		return run_remote_archiver(argc, argv, remote, exec, output);
 
 	setvbuf(stderr, NULL, _IOLBF, BUFSIZ);
 
-	return write_archive(argc, argv, prefix, 1);
+	return write_archive(argc, argv, prefix, 1, output);
 }
diff --git a/builtin/upload-archive.c b/builtin/upload-archive.c
index 73f788e..e6bb97d 100644
--- a/builtin/upload-archive.c
+++ b/builtin/upload-archive.c
@@ -64,7 +64,7 @@ static int run_upload_archive(int argc, const char **argv, const char *prefix)
 	sent_argv[sent_argc] = NULL;
 
 	/* parse all options sent by the client */
-	return write_archive(sent_argc, sent_argv, prefix, 0);
+	return write_archive(sent_argc, sent_argv, prefix, 0, NULL);
 }
 
 __attribute__((format (printf, 1, 2)))
-- 
1.7.5.4.44.g4b107

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCHv2 6/9] archive: refactor file extension format-guessing
  2011-06-22  1:19               ` [PATCHv2 0/9] configurable tar compressors Jeff King
                                   ` (4 preceding siblings ...)
  2011-06-22  1:24                 ` [PATCHv2 5/9] archive: move file extension format-guessing lower Jeff King
@ 2011-06-22  1:25                 ` Jeff King
  2011-06-22  1:26                 ` [PATCHv2 7/9] archive: implement configurable tar filters Jeff King
                                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-22  1:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, J.H., git, git-dev

Git-archive will guess a format from the output filename if
no format is explicitly given.  The current function just
hardcodes "zip" to the zip format, and leaves everything
else NULL (which will default to tar). Since we are about
to add user-specified formats, we need to be more flexible.
The new rule is "if a filename ends with a dot and the name
of a format, it matches that format". For the existing "tar"
and "zip" formats, this is identical to the current
behavior. For new user-specified formats, this will do what
the user expects if they name their formats appropriately.

Because we will eventually start matching arbitrary
user-specified extensions that may include dots, the strrchr
search for the final dot is not sufficient. We need to do an
actual suffix match with each extension.

Signed-off-by: Jeff King <peff@peff.net>
---
Similar to v1, except we don't need to special case tar-filter code
anymore.

 archive.c |   25 +++++++++++++++++++------
 1 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/archive.c b/archive.c
index 7d0ca32..41065a8 100644
--- a/archive.c
+++ b/archive.c
@@ -419,13 +419,26 @@ int write_archive(int argc, const char **argv, const char *prefix,
 	return ar->write_archive(ar, &args);
 }
 
+static int match_extension(const char *filename, const char *ext)
+{
+	int prefixlen = strlen(filename) - strlen(ext);
+
+	/*
+	 * We need 1 character for the '.', and 1 character to ensure that the
+	 * prefix is non-empty (k.e., we don't match .tar.gz with no actual
+	 * filename).
+	 */
+	if (prefixlen < 2 || filename[prefixlen-1] != '.')
+		return 0;
+	return !strcmp(filename + prefixlen, ext);
+}
+
 const char *archive_format_from_filename(const char *filename)
 {
-	const char *ext = strrchr(filename, '.');
-	if (!ext)
-		return NULL;
-	ext++;
-	if (!strcasecmp(ext, "zip"))
-		return "zip";
+	int i;
+
+	for (i = 0; i < nr_archivers; i++)
+		if (match_extension(filename, archivers[i]->name))
+			return archivers[i]->name;
 	return NULL;
 }
-- 
1.7.5.4.44.g4b107

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCHv2 7/9] archive: implement configurable tar filters
  2011-06-22  1:19               ` [PATCHv2 0/9] configurable tar compressors Jeff King
                                   ` (5 preceding siblings ...)
  2011-06-22  1:25                 ` [PATCHv2 6/9] archive: refactor file extension format-guessing Jeff King
@ 2011-06-22  1:26                 ` Jeff King
  2011-06-22  1:45                   ` Jeff King
  2011-06-22  6:09                   ` René Scharfe
  2011-06-22  1:27                 ` [PATCHv2 8/9] archive: provide builtin .tar.gz filter Jeff King
  2011-06-22  1:35                 ` [PATCHv2 9/9] upload-archive: allow user to turn off filters Jeff King
  8 siblings, 2 replies; 56+ messages in thread
From: Jeff King @ 2011-06-22  1:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, J.H., git, git-dev

It's common to pipe the tar output produce by "git archive"
through gzip or some other compressor. Locally, this can
easily be done by using a shell pipe. When requesting a
remote archive, though, it cannot be done through the
upload-archive interface.

This patch allows configurable tar filters, so that one
could define a "tar.gz" format that automatically pipes tar
output through gzip.

Signed-off-by: Jeff King <peff@peff.net>
---
This was split across several commits in the previous version of the
series, but due to the cleanups it fits nicely into a single commit.

 Documentation/git-archive.txt |   16 ++++++
 archive-tar.c                 |  107 ++++++++++++++++++++++++++++++++++++++++-
 t/t5000-tar-tree.sh           |   43 ++++++++++++++++
 3 files changed, 165 insertions(+), 1 deletions(-)

diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index 9c750e2..726bf63 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -101,6 +101,16 @@ tar.umask::
 	details.  If `--remote` is used then only the configuration of
 	the remote repository takes effect.
 
+tar.<format>.command::
+	This variable specifies a shell command through which the tar
+	output generated by `git archive` should be piped. The command
+	is executed using the shell with the generated tar file on its
+	standard input, and should produce the final output on its
+	standard output. Any compression-level options will be passed
+	to the command (e.g., "-9"). An output file with the same
+	extension as `<format>` will be use this format if no other
+	format is given.
+
 ATTRIBUTES
 ----------
 
@@ -149,6 +159,12 @@ git archive -o latest.zip HEAD::
 	commit on the current branch. Note that the output format is
 	inferred by the extension of the output file.
 
+git config tar.tar.xz.command "xz -c"::
+
+	Configure a "tar.xz" format for making LZMA-compressed tarfiles.
+	You can use it specifying `--format=tar.xz`, or by creating an
+	output file like `-o foo.tar.xz`.
+
 
 SEE ALSO
 --------
diff --git a/archive-tar.c b/archive-tar.c
index bed9a9b..5c30747 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -4,6 +4,7 @@
 #include "cache.h"
 #include "tar.h"
 #include "archive.h"
+#include "run-command.h"
 
 #define RECORDSIZE	(512)
 #define BLOCKSIZE	(RECORDSIZE * 20)
@@ -13,6 +14,9 @@ static unsigned long offset;
 
 static int tar_umask = 002;
 
+static int write_tar_filter_archive(const struct archiver *ar,
+				    struct archiver_args *args);
+
 /* writes out the whole block, but only if it is full */
 static void write_if_needed(void)
 {
@@ -220,6 +224,60 @@ static int write_global_extended_header(struct archiver_args *args)
 	return err;
 }
 
+static struct archiver **tar_filters;
+static int nr_tar_filters;
+static int alloc_tar_filters;
+
+static struct archiver *find_tar_filter(const char *name, int len)
+{
+	int i;
+	for (i = 0; i < nr_tar_filters; i++) {
+		struct archiver *ar = tar_filters[i];
+		if (!strncmp(ar->name, name, len) && !ar->name[len])
+			return ar;
+	}
+	return NULL;
+}
+
+static int tar_filter_config(const char *var, const char *value, void *data)
+{
+	struct archiver *ar;
+	const char *dot;
+	const char *name;
+	const char *type;
+	int namelen;
+
+	if (prefixcmp(var, "tar."))
+		return 0;
+	dot = strrchr(var, '.');
+	if (dot == var + 9)
+		return 0;
+
+	name = var + 4;
+	namelen = dot - name;
+	type = dot + 1;
+
+	ar = find_tar_filter(name, namelen);
+	if (!ar) {
+		ar = xcalloc(1, sizeof(*ar));
+		ar->name = xmemdupz(name, namelen);
+		ar->write_archive = write_tar_filter_archive;
+		ar->flags = ARCHIVER_WANT_COMPRESSION_LEVELS;
+		ALLOC_GROW(tar_filters, nr_tar_filters + 1, alloc_tar_filters);
+		tar_filters[nr_tar_filters++] = ar;
+	}
+
+	if (!strcmp(type, "command")) {
+		if (!value)
+			return config_error_nonbool(var);
+		free(ar->data);
+		ar->data = xstrdup(value);
+		return 0;
+	}
+
+	return 0;
+}
+
 static int git_tar_config(const char *var, const char *value, void *cb)
 {
 	if (!strcmp(var, "tar.umask")) {
@@ -231,7 +289,8 @@ static int git_tar_config(const char *var, const char *value, void *cb)
 		}
 		return 0;
 	}
-	return 0;
+
+	return tar_filter_config(var, value, cb);
 }
 
 static int write_tar_archive(const struct archiver *ar,
@@ -248,6 +307,45 @@ static int write_tar_archive(const struct archiver *ar,
 	return err;
 }
 
+static int write_tar_filter_archive(const struct archiver *ar,
+				    struct archiver_args *args)
+{
+	struct strbuf cmd = STRBUF_INIT;
+	struct child_process filter;
+	const char *argv[2];
+	int r;
+
+	if (!ar->data)
+		die("BUG: tar-filter archiver called with no filter defined");
+
+	strbuf_addstr(&cmd, ar->data);
+	if (args->compression_level >= 0)
+		strbuf_addf(&cmd, " -%d", args->compression_level);
+
+	memset(&filter, 0, sizeof(filter));
+	argv[0] = cmd.buf;
+	argv[1] = NULL;
+	filter.argv = argv;
+	filter.use_shell = 1;
+	filter.in = -1;
+
+	if (start_command(&filter) < 0)
+		die_errno("unable to start '%s' filter", argv[0]);
+	close(1);
+	if (dup2(filter.in, 1) < 0)
+		die_errno("unable to redirect descriptor");
+	close(filter.in);
+
+	r = write_tar_archive(ar, args);
+
+	close(1);
+	if (finish_command(&filter) != 0)
+		die("'%s' filter reported error", argv[0]);
+
+	strbuf_release(&cmd);
+	return r;
+}
+
 static struct archiver tar_archiver = {
 	"tar",
 	write_tar_archive,
@@ -256,6 +354,13 @@ static struct archiver tar_archiver = {
 
 void init_tar_archiver(void)
 {
+	int i;
 	register_archiver(&tar_archiver);
+
 	git_config(git_tar_config, NULL);
+	for (i = 0; i < nr_tar_filters; i++) {
+		/* omit any filters that never had a command configured */
+		if (tar_filters[i]->data)
+			register_archiver(tar_filters[i]);
+	}
 }
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index cff1b3e..1f90692 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -252,4 +252,47 @@ test_expect_success 'git-archive --prefix=olde-' '
 	test -f h/olde-a/bin/sh
 '
 
+test_expect_success 'setup tar filters' '
+	git config tar.tar.foo.command "tr ab ba" &&
+	git config tar.bar.command "tr ab ba"
+'
+
+test_expect_success 'archive --list mentions user filter' '
+	git archive --list >output &&
+	grep "^tar\.foo\$" output &&
+	grep "^bar\$" output
+'
+
+test_expect_success 'archive --list shows remote user filters' '
+	git archive --list --remote=. >output &&
+	grep "^tar\.foo\$" output &&
+	grep "^bar\$" output
+'
+
+test_expect_success 'invoke tar filter by format' '
+	git archive --format=tar.foo HEAD >config.tar.foo &&
+	tr ab ba <config.tar.foo >config.tar &&
+	test_cmp b.tar config.tar &&
+	git archive --format=bar HEAD >config.bar &&
+	tr ab ba <config.bar >config.tar &&
+	test_cmp b.tar config.tar
+'
+
+test_expect_success 'invoke tar filter by extension' '
+	git archive -o config-implicit.tar.foo HEAD &&
+	test_cmp config.tar.foo config-implicit.tar.foo &&
+	git archive -o config-implicit.bar HEAD &&
+	test_cmp config.tar.foo config-implicit.bar
+'
+
+test_expect_success 'default output format remains tar' '
+	git archive -o config-implicit.baz HEAD &&
+	test_cmp b.tar config-implicit.baz
+'
+
+test_expect_success 'extension matching requires dot' '
+	git archive -o config-implicittar.foo HEAD &&
+	test_cmp b.tar config-implicittar.foo
+'
+
 test_done
-- 
1.7.5.4.44.g4b107

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCHv2 8/9] archive: provide builtin .tar.gz filter
  2011-06-22  1:19               ` [PATCHv2 0/9] configurable tar compressors Jeff King
                                   ` (6 preceding siblings ...)
  2011-06-22  1:26                 ` [PATCHv2 7/9] archive: implement configurable tar filters Jeff King
@ 2011-06-22  1:27                 ` Jeff King
  2011-06-22  1:35                 ` [PATCHv2 9/9] upload-archive: allow user to turn off filters Jeff King
  8 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-22  1:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, J.H., git, git-dev

This works exactly as if the user had configured it via:

  [tar "tgz"]
	command = gzip -cn
  [tar "tar.gz"]
	command = gzip -cn

but since it is so common, it's convenient to have it
builtin without the user needing to do anything.

Signed-off-by: Jeff King <peff@peff.net>
---
Similar to v1, but rebased. And with docs.

 Documentation/git-archive.txt |   11 +++++++++++
 archive-tar.c                 |    2 ++
 t/t5000-tar-tree.sh           |   38 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index 726bf63..8b0080a 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -110,6 +110,9 @@ tar.<format>.command::
 	to the command (e.g., "-9"). An output file with the same
 	extension as `<format>` will be use this format if no other
 	format is given.
++
+The "tar.gz" and "tgz" formats are defined automatically and default to
+`gzip -cn`. You may override them with custom commands.
 
 ATTRIBUTES
 ----------
@@ -143,6 +146,14 @@ git archive --format=tar --prefix=git-1.4.0/ v1.4.0 | gzip >git-1.4.0.tar.gz::
 
 	Create a compressed tarball for v1.4.0 release.
 
+git archive --format=tar.gz --prefix=git-1.4.0/ v1.4.0 >git-1.4.0.tar.gz::
+
+	Same as above, but using the builtin tar.gz handling.
+
+git archive --prefix=git-1.4.0/ -o git-1.4.0.tar.gz v1.4.0::
+
+	Same as above, but the format is inferred from the output file.
+
 git archive --format=tar --prefix=git-1.4.0/ v1.4.0{caret}\{tree\} | gzip >git-1.4.0.tar.gz::
 
 	Create a compressed tarball for v1.4.0 release, but without a
diff --git a/archive-tar.c b/archive-tar.c
index 5c30747..f470ebe 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -357,6 +357,8 @@ void init_tar_archiver(void)
 	int i;
 	register_archiver(&tar_archiver);
 
+	tar_filter_config("tar.tgz.command", "gzip -cn", NULL);
+	tar_filter_config("tar.tar.gz.command", "gzip -cn", NULL);
 	git_config(git_tar_config, NULL);
 	for (i = 0; i < nr_tar_filters; i++) {
 		/* omit any filters that never had a command configured */
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 1f90692..070250e 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -26,6 +26,8 @@ commit id embedding:
 
 . ./test-lib.sh
 UNZIP=${UNZIP:-unzip}
+GZIP=${GZIP:-gzip}
+GUNZIP=${GUNZIP:-gzip -d}
 
 SUBSTFORMAT=%H%n
 
@@ -295,4 +297,40 @@ test_expect_success 'extension matching requires dot' '
 	test_cmp b.tar config-implicittar.foo
 '
 
+if $GZIP --version >/dev/null 2>&1; then
+	test_set_prereq GZIP
+else
+	say "Skipping some tar.gz tests because gzip not found"
+fi
+
+test_expect_success GZIP 'git archive --format=tgz' '
+	git archive --format=tgz HEAD >j.tgz
+'
+
+test_expect_success GZIP 'git archive --format=tar.gz' '
+	git archive --format=tar.gz HEAD >j1.tar.gz &&
+	test_cmp j.tgz j1.tar.gz
+'
+
+test_expect_success GZIP 'infer tgz from .tgz filename' '
+	git archive --output=j2.tgz HEAD &&
+	test_cmp j.tgz j2.tgz
+'
+
+test_expect_success GZIP 'infer tgz from .tar.gz filename' '
+	git archive --output=j3.tar.gz HEAD &&
+	test_cmp j.tgz j3.tar.gz
+'
+
+if $GUNZIP --version >/dev/null 2>&1; then
+	test_set_prereq GUNZIP
+else
+	say "Skipping some tar.gz tests because gunzip was not found"
+fi
+
+test_expect_success GZIP,GUNZIP 'extract tgz file' '
+	$GUNZIP -c <j.tgz >j.tar &&
+	test_cmp b.tar j.tar
+'
+
 test_done
-- 
1.7.5.4.44.g4b107

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCHv2 9/9] upload-archive: allow user to turn off filters
  2011-06-22  1:19               ` [PATCHv2 0/9] configurable tar compressors Jeff King
                                   ` (7 preceding siblings ...)
  2011-06-22  1:27                 ` [PATCHv2 8/9] archive: provide builtin .tar.gz filter Jeff King
@ 2011-06-22  1:35                 ` Jeff King
  2011-06-22  3:17                   ` Jeff King
  8 siblings, 1 reply; 56+ messages in thread
From: Jeff King @ 2011-06-22  1:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, J.H., git, git-dev

Some tar filters may be very expensive to run, so sites do
not want to expose them via upload-archive. This patch lets
users configure tar.<filter>.remote to turn them off.

By default, gzip filters are left on, as they are about as
expensive as creating zip archives.

Signed-off-by: Jeff King <peff@peff.net>
---
This is my response to René's:

  [archive]
    remoteFormats = tar zip tgz tar.gz

It's a little more verbose, but it lets you turn individual formats off
and on without having to enumerate the whole list.

By itself, this may not be that useful. It seems unlikely that somebody
would both to configure a filter globally on a machine serving
upload-pack, and then not want it remotely available. You can also
disable the builtin gzip filter, but it's not really much more expensive
than zip.

But two possible follow-on patches would be:

  1. tar.remote and zip.remote, to turn those formats off. I can see
     wanting to turn off _all_ compression, including zip and gzip, if
     your CPU is terrible (but git servers are almost always I/O bound,
     so I don't know how common that will be). I can also see wanting to
     turn off uncompressed tar, because bandwidth is expensive, CPU is
     cheap, and source code repositories compress well.

  2. adding default bzip2 config; this should be off by default for
     upload-archive

 Documentation/git-archive.txt |    6 ++++++
 archive-tar.c                 |   11 ++++++++++-
 archive-zip.c                 |    2 +-
 archive.c                     |   11 ++++++-----
 archive.h                     |    3 ++-
 builtin/archive.c             |    2 +-
 builtin/upload-archive.c      |    2 +-
 t/t5000-tar-tree.sh           |   25 ++++++++++++++++++++++---
 8 files changed, 49 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index 8b0080a..1320c87 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -114,6 +114,12 @@ tar.<format>.command::
 The "tar.gz" and "tgz" formats are defined automatically and default to
 `gzip -cn`. You may override them with custom commands.
 
+tar.<format>.remote::
+	If true, enable `<format>` for use by remote clients via
+	linkgit:git-upload-archive[1]. Defaults to false for
+	user-defined formats, but true for the "tar.gz" and "tgz"
+	formats.
+
 ATTRIBUTES
 ----------
 
diff --git a/archive-tar.c b/archive-tar.c
index f470ebe..01ab43f 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -274,6 +274,13 @@ static int tar_filter_config(const char *var, const char *value, void *data)
 		ar->data = xstrdup(value);
 		return 0;
 	}
+	if (!strcmp(type, "remote")) {
+		if (git_config_bool(var, value))
+			tf->flags |= ARCHIVER_REMOTE;
+		else
+			tf->flags &= ~ARCHIVER_REMOTE;
+		return 0;
+	}
 
 	return 0;
 }
@@ -349,7 +356,7 @@ static int write_tar_filter_archive(const struct archiver *ar,
 static struct archiver tar_archiver = {
 	"tar",
 	write_tar_archive,
-	0
+	ARCHIVER_REMOTE
 };
 
 void init_tar_archiver(void)
@@ -358,7 +365,9 @@ void init_tar_archiver(void)
 	register_archiver(&tar_archiver);
 
 	tar_filter_config("tar.tgz.command", "gzip -cn", NULL);
+	tar_filter_config("tar.tgz.remote", "true", NULL);
 	tar_filter_config("tar.tar.gz.command", "gzip -cn", NULL);
+	tar_filter_config("tar.tar.gz.remote", "true", NULL);
 	git_config(git_tar_config, NULL);
 	for (i = 0; i < nr_tar_filters; i++) {
 		/* omit any filters that never had a command configured */
diff --git a/archive-zip.c b/archive-zip.c
index 42df660..3c102a1 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -283,7 +283,7 @@ static int write_zip_archive(const struct archiver *ar,
 static struct archiver zip_archiver = {
 	"zip",
 	write_zip_archive,
-	ARCHIVER_WANT_COMPRESSION_LEVELS
+	ARCHIVER_WANT_COMPRESSION_LEVELS|ARCHIVER_REMOTE
 };
 
 void init_zip_archiver(void)
diff --git a/archive.c b/archive.c
index 41065a8..2a7a28e 100644
--- a/archive.c
+++ b/archive.c
@@ -299,7 +299,7 @@ static void parse_treeish_arg(const char **argv,
 
 static int parse_archive_args(int argc, const char **argv,
 		const struct archiver **ar, struct archiver_args *args,
-		const char *name_hint)
+		const char *name_hint, int is_remote)
 {
 	const char *format = NULL;
 	const char *base = NULL;
@@ -356,7 +356,8 @@ static int parse_archive_args(int argc, const char **argv,
 
 	if (list) {
 		for (i = 0; i < nr_archivers; i++)
-			printf("%s\n", archivers[i]->name);
+			if (!is_remote || archivers[i]->flags & ARCHIVER_REMOTE)
+				printf("%s\n", archivers[i]->name);
 		exit(0);
 	}
 
@@ -369,7 +370,7 @@ static int parse_archive_args(int argc, const char **argv,
 	if (argc < 1)
 		usage_with_options(archive_usage, opts);
 	*ar = lookup_archiver(format);
-	if (!*ar)
+	if (!*ar || (is_remote && !((*ar)->flags & ARCHIVER_REMOTE)))
 		die("Unknown archive format '%s'", format);
 
 	args->compression_level = Z_DEFAULT_COMPRESSION;
@@ -390,7 +391,7 @@ static int parse_archive_args(int argc, const char **argv,
 }
 
 int write_archive(int argc, const char **argv, const char *prefix,
-		  int setup_prefix, const char *name_hint)
+		  int setup_prefix, const char *name_hint, int remote)
 {
 	int nongit = 0;
 	const struct archiver *ar = NULL;
@@ -403,7 +404,7 @@ int write_archive(int argc, const char **argv, const char *prefix,
 	init_tar_archiver();
 	init_zip_archiver();
 
-	argc = parse_archive_args(argc, argv, &ar, &args, name_hint);
+	argc = parse_archive_args(argc, argv, &ar, &args, name_hint, remote);
 	if (nongit) {
 		/*
 		 * We know this will die() with an error, so we could just
diff --git a/archive.h b/archive.h
index 202d528..2b0884f 100644
--- a/archive.h
+++ b/archive.h
@@ -15,6 +15,7 @@ struct archiver_args {
 };
 
 #define ARCHIVER_WANT_COMPRESSION_LEVELS 1
+#define ARCHIVER_REMOTE 2
 struct archiver {
 	const char *name;
 	int (*write_archive)(const struct archiver *, struct archiver_args *);
@@ -29,7 +30,7 @@ extern void init_zip_archiver(void);
 typedef int (*write_archive_entry_fn_t)(struct archiver_args *args, const unsigned char *sha1, const char *path, size_t pathlen, unsigned int mode, void *buffer, unsigned long size);
 
 extern int write_archive_entries(struct archiver_args *args, write_archive_entry_fn_t write_entry);
-extern int write_archive(int argc, const char **argv, const char *prefix, int setup_prefix, const char *name_hint);
+extern int write_archive(int argc, const char **argv, const char *prefix, int setup_prefix, const char *name_hint, int remote);
 
 const char *archive_format_from_filename(const char *filename);
 
diff --git a/builtin/archive.c b/builtin/archive.c
index 2578cf5..883c009 100644
--- a/builtin/archive.c
+++ b/builtin/archive.c
@@ -106,5 +106,5 @@ int cmd_archive(int argc, const char **argv, const char *prefix)
 
 	setvbuf(stderr, NULL, _IOLBF, BUFSIZ);
 
-	return write_archive(argc, argv, prefix, 1, output);
+	return write_archive(argc, argv, prefix, 1, output, 0);
 }
diff --git a/builtin/upload-archive.c b/builtin/upload-archive.c
index e6bb97d..2d0b383 100644
--- a/builtin/upload-archive.c
+++ b/builtin/upload-archive.c
@@ -64,7 +64,7 @@ static int run_upload_archive(int argc, const char **argv, const char *prefix)
 	sent_argv[sent_argc] = NULL;
 
 	/* parse all options sent by the client */
-	return write_archive(sent_argc, sent_argv, prefix, 0, NULL);
+	return write_archive(sent_argc, sent_argv, prefix, 0, NULL, 1);
 }
 
 __attribute__((format (printf, 1, 2)))
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 070250e..9e3ba98 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -256,7 +256,8 @@ test_expect_success 'git-archive --prefix=olde-' '
 
 test_expect_success 'setup tar filters' '
 	git config tar.tar.foo.command "tr ab ba" &&
-	git config tar.bar.command "tr ab ba"
+	git config tar.bar.command "tr ab ba" &&
+	git config tar.bar.remote true
 '
 
 test_expect_success 'archive --list mentions user filter' '
@@ -265,9 +266,9 @@ test_expect_success 'archive --list mentions user filter' '
 	grep "^bar\$" output
 '
 
-test_expect_success 'archive --list shows remote user filters' '
+test_expect_success 'archive --list shows only enabled remote filters' '
 	git archive --list --remote=. >output &&
-	grep "^tar\.foo\$" output &&
+	! grep "^tar\.foo\$" output &&
 	grep "^bar\$" output
 '
 
@@ -297,6 +298,13 @@ test_expect_success 'extension matching requires dot' '
 	test_cmp b.tar config-implicittar.foo
 '
 
+test_expect_success 'only enabled filters are available remotely' '
+	test_must_fail git archive --remote=. --format=tar.foo HEAD \
+		>remote.tar.foo &&
+	git archive --remote=. --format=bar >remote.bar HEAD &&
+	test_cmp remote.bar config.bar
+'
+
 if $GZIP --version >/dev/null 2>&1; then
 	test_set_prereq GZIP
 else
@@ -333,4 +341,15 @@ test_expect_success GZIP,GUNZIP 'extract tgz file' '
 	test_cmp b.tar j.tar
 '
 
+test_expect_success GZIP 'remote tar.gz is allowed by default' '
+	git archive --remote=. --format=tar.gz HEAD >remote.tar.gz &&
+	test_cmp j.tgz remote.tar.gz
+'
+
+test_expect_success GZIP 'remote tar.gz can be disabled' '
+	git config tar.tar.gz.remote false &&
+	test_must_fail git archive --remote=. --format=tar.gz HEAD \
+		>remote.tar.gz
+'
+
 test_done
-- 
1.7.5.4.44.g4b107

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCHv2 7/9] archive: implement configurable tar filters
  2011-06-22  1:26                 ` [PATCHv2 7/9] archive: implement configurable tar filters Jeff King
@ 2011-06-22  1:45                   ` Jeff King
  2011-06-22  6:09                   ` René Scharfe
  1 sibling, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-22  1:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, J.H., git, git-dev

On Tue, Jun 21, 2011 at 09:26:31PM -0400, Jeff King wrote:

> It's common to pipe the tar output produce by "git archive"
> through gzip or some other compressor. Locally, this can
> easily be done by using a shell pipe. When requesting a
> remote archive, though, it cannot be done through the
> upload-archive interface.
> 
> This patch allows configurable tar filters, so that one
> could define a "tar.gz" format that automatically pipes tar
> output through gzip.
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> This was split across several commits in the previous version of the
> series, but due to the cleanups it fits nicely into a single commit.

A few comments on what I took and what I didn't:

  1. config is now in tar.<filter>.*; this avoids having yet another
     config section. However, the flat names we give to "git config"
     look a little silly. E.g., "tar.tar.gz.command".

  2. René suggested compressor as the config name. I wanted to stay away
     from that name, as this really is about generic filtering. I expect
     most uses will be compressors, but this is also our method for
     supporting other container formats via external helpers (e.g., you
     could convert to cpio on the fly). The word "filter" is generic,
     but it's also a bit redundant. The whole tar.<name> subsection is
     about the filter. I chose "command", as that is what is used for
     external diff in userdiff drivers, and it makes it clear that we
     are running an external helper. I'm lukewarm on it if somebody
     wants to argue for something else.

  3. There's no config to say you want or don't want -<n> compression
     levels. We always allow them, and it is up to the tool to complain
     if it doesn't want them. My reasoning is that most everything
     either takes them already (e.g., gzip, bzip2, xz), or would require
     a helper script that can either map them (7z) or reject them
     (whatever helper somebody might write to convert tar2cpio on the
     fly).

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCHv2 9/9] upload-archive: allow user to turn off filters
  2011-06-22  1:35                 ` [PATCHv2 9/9] upload-archive: allow user to turn off filters Jeff King
@ 2011-06-22  3:17                   ` Jeff King
  0 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-22  3:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, J.H., git, git-dev

On Tue, Jun 21, 2011 at 09:35:45PM -0400, Jeff King wrote:

> Some tar filters may be very expensive to run, so sites do
> not want to expose them via upload-archive. This patch lets
> users configure tar.<filter>.remote to turn them off.
> 
> By default, gzip filters are left on, as they are about as
> expensive as creating zip archives.

Argh, sorry, wrong version of the patch. This one has a slight
refactoring typo that makes it not compile. Correct patch is below.

-- >8 --
Subject: upload-archive: allow user to turn off filters

Some tar filters may be very expensive to run, so sites do
not want to expose them via upload-archive. This patch lets
users configure tar.<filter>.remote to turn them off.

By default, gzip filters are left on, as they are about as
expensive as creating zip archives.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/git-archive.txt |    6 ++++++
 archive-tar.c                 |   11 ++++++++++-
 archive-zip.c                 |    2 +-
 archive.c                     |   11 ++++++-----
 archive.h                     |    3 ++-
 builtin/archive.c             |    2 +-
 builtin/upload-archive.c      |    2 +-
 t/t5000-tar-tree.sh           |   25 ++++++++++++++++++++++---
 8 files changed, 49 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index 8b0080a..1320c87 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -114,6 +114,12 @@ tar.<format>.command::
 The "tar.gz" and "tgz" formats are defined automatically and default to
 `gzip -cn`. You may override them with custom commands.
 
+tar.<format>.remote::
+	If true, enable `<format>` for use by remote clients via
+	linkgit:git-upload-archive[1]. Defaults to false for
+	user-defined formats, but true for the "tar.gz" and "tgz"
+	formats.
+
 ATTRIBUTES
 ----------
 
diff --git a/archive-tar.c b/archive-tar.c
index f470ebe..20af005 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -274,6 +274,13 @@ static int tar_filter_config(const char *var, const char *value, void *data)
 		ar->data = xstrdup(value);
 		return 0;
 	}
+	if (!strcmp(type, "remote")) {
+		if (git_config_bool(var, value))
+			ar->flags |= ARCHIVER_REMOTE;
+		else
+			ar->flags &= ~ARCHIVER_REMOTE;
+		return 0;
+	}
 
 	return 0;
 }
@@ -349,7 +356,7 @@ static int write_tar_filter_archive(const struct archiver *ar,
 static struct archiver tar_archiver = {
 	"tar",
 	write_tar_archive,
-	0
+	ARCHIVER_REMOTE
 };
 
 void init_tar_archiver(void)
@@ -358,7 +365,9 @@ void init_tar_archiver(void)
 	register_archiver(&tar_archiver);
 
 	tar_filter_config("tar.tgz.command", "gzip -cn", NULL);
+	tar_filter_config("tar.tgz.remote", "true", NULL);
 	tar_filter_config("tar.tar.gz.command", "gzip -cn", NULL);
+	tar_filter_config("tar.tar.gz.remote", "true", NULL);
 	git_config(git_tar_config, NULL);
 	for (i = 0; i < nr_tar_filters; i++) {
 		/* omit any filters that never had a command configured */
diff --git a/archive-zip.c b/archive-zip.c
index 42df660..3c102a1 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -283,7 +283,7 @@ static int write_zip_archive(const struct archiver *ar,
 static struct archiver zip_archiver = {
 	"zip",
 	write_zip_archive,
-	ARCHIVER_WANT_COMPRESSION_LEVELS
+	ARCHIVER_WANT_COMPRESSION_LEVELS|ARCHIVER_REMOTE
 };
 
 void init_zip_archiver(void)
diff --git a/archive.c b/archive.c
index 41065a8..2a7a28e 100644
--- a/archive.c
+++ b/archive.c
@@ -299,7 +299,7 @@ static void parse_treeish_arg(const char **argv,
 
 static int parse_archive_args(int argc, const char **argv,
 		const struct archiver **ar, struct archiver_args *args,
-		const char *name_hint)
+		const char *name_hint, int is_remote)
 {
 	const char *format = NULL;
 	const char *base = NULL;
@@ -356,7 +356,8 @@ static int parse_archive_args(int argc, const char **argv,
 
 	if (list) {
 		for (i = 0; i < nr_archivers; i++)
-			printf("%s\n", archivers[i]->name);
+			if (!is_remote || archivers[i]->flags & ARCHIVER_REMOTE)
+				printf("%s\n", archivers[i]->name);
 		exit(0);
 	}
 
@@ -369,7 +370,7 @@ static int parse_archive_args(int argc, const char **argv,
 	if (argc < 1)
 		usage_with_options(archive_usage, opts);
 	*ar = lookup_archiver(format);
-	if (!*ar)
+	if (!*ar || (is_remote && !((*ar)->flags & ARCHIVER_REMOTE)))
 		die("Unknown archive format '%s'", format);
 
 	args->compression_level = Z_DEFAULT_COMPRESSION;
@@ -390,7 +391,7 @@ static int parse_archive_args(int argc, const char **argv,
 }
 
 int write_archive(int argc, const char **argv, const char *prefix,
-		  int setup_prefix, const char *name_hint)
+		  int setup_prefix, const char *name_hint, int remote)
 {
 	int nongit = 0;
 	const struct archiver *ar = NULL;
@@ -403,7 +404,7 @@ int write_archive(int argc, const char **argv, const char *prefix,
 	init_tar_archiver();
 	init_zip_archiver();
 
-	argc = parse_archive_args(argc, argv, &ar, &args, name_hint);
+	argc = parse_archive_args(argc, argv, &ar, &args, name_hint, remote);
 	if (nongit) {
 		/*
 		 * We know this will die() with an error, so we could just
diff --git a/archive.h b/archive.h
index 202d528..2b0884f 100644
--- a/archive.h
+++ b/archive.h
@@ -15,6 +15,7 @@ struct archiver_args {
 };
 
 #define ARCHIVER_WANT_COMPRESSION_LEVELS 1
+#define ARCHIVER_REMOTE 2
 struct archiver {
 	const char *name;
 	int (*write_archive)(const struct archiver *, struct archiver_args *);
@@ -29,7 +30,7 @@ extern void init_zip_archiver(void);
 typedef int (*write_archive_entry_fn_t)(struct archiver_args *args, const unsigned char *sha1, const char *path, size_t pathlen, unsigned int mode, void *buffer, unsigned long size);
 
 extern int write_archive_entries(struct archiver_args *args, write_archive_entry_fn_t write_entry);
-extern int write_archive(int argc, const char **argv, const char *prefix, int setup_prefix, const char *name_hint);
+extern int write_archive(int argc, const char **argv, const char *prefix, int setup_prefix, const char *name_hint, int remote);
 
 const char *archive_format_from_filename(const char *filename);
 
diff --git a/builtin/archive.c b/builtin/archive.c
index 2578cf5..883c009 100644
--- a/builtin/archive.c
+++ b/builtin/archive.c
@@ -106,5 +106,5 @@ int cmd_archive(int argc, const char **argv, const char *prefix)
 
 	setvbuf(stderr, NULL, _IOLBF, BUFSIZ);
 
-	return write_archive(argc, argv, prefix, 1, output);
+	return write_archive(argc, argv, prefix, 1, output, 0);
 }
diff --git a/builtin/upload-archive.c b/builtin/upload-archive.c
index e6bb97d..2d0b383 100644
--- a/builtin/upload-archive.c
+++ b/builtin/upload-archive.c
@@ -64,7 +64,7 @@ static int run_upload_archive(int argc, const char **argv, const char *prefix)
 	sent_argv[sent_argc] = NULL;
 
 	/* parse all options sent by the client */
-	return write_archive(sent_argc, sent_argv, prefix, 0, NULL);
+	return write_archive(sent_argc, sent_argv, prefix, 0, NULL, 1);
 }
 
 __attribute__((format (printf, 1, 2)))
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 070250e..9e3ba98 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -256,7 +256,8 @@ test_expect_success 'git-archive --prefix=olde-' '
 
 test_expect_success 'setup tar filters' '
 	git config tar.tar.foo.command "tr ab ba" &&
-	git config tar.bar.command "tr ab ba"
+	git config tar.bar.command "tr ab ba" &&
+	git config tar.bar.remote true
 '
 
 test_expect_success 'archive --list mentions user filter' '
@@ -265,9 +266,9 @@ test_expect_success 'archive --list mentions user filter' '
 	grep "^bar\$" output
 '
 
-test_expect_success 'archive --list shows remote user filters' '
+test_expect_success 'archive --list shows only enabled remote filters' '
 	git archive --list --remote=. >output &&
-	grep "^tar\.foo\$" output &&
+	! grep "^tar\.foo\$" output &&
 	grep "^bar\$" output
 '
 
@@ -297,6 +298,13 @@ test_expect_success 'extension matching requires dot' '
 	test_cmp b.tar config-implicittar.foo
 '
 
+test_expect_success 'only enabled filters are available remotely' '
+	test_must_fail git archive --remote=. --format=tar.foo HEAD \
+		>remote.tar.foo &&
+	git archive --remote=. --format=bar >remote.bar HEAD &&
+	test_cmp remote.bar config.bar
+'
+
 if $GZIP --version >/dev/null 2>&1; then
 	test_set_prereq GZIP
 else
@@ -333,4 +341,15 @@ test_expect_success GZIP,GUNZIP 'extract tgz file' '
 	test_cmp b.tar j.tar
 '
 
+test_expect_success GZIP 'remote tar.gz is allowed by default' '
+	git archive --remote=. --format=tar.gz HEAD >remote.tar.gz &&
+	test_cmp j.tgz remote.tar.gz
+'
+
+test_expect_success GZIP 'remote tar.gz can be disabled' '
+	git config tar.tar.gz.remote false &&
+	test_must_fail git archive --remote=. --format=tar.gz HEAD \
+		>remote.tar.gz
+'
+
 test_done
-- 
1.7.5.4.44.g4b107

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCHv2 7/9] archive: implement configurable tar filters
  2011-06-22  1:26                 ` [PATCHv2 7/9] archive: implement configurable tar filters Jeff King
  2011-06-22  1:45                   ` Jeff King
@ 2011-06-22  6:09                   ` René Scharfe
  2011-06-22 14:59                     ` Jeff King
  1 sibling, 1 reply; 56+ messages in thread
From: René Scharfe @ 2011-06-22  6:09 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, J.H., git, git-dev

Hello,

just a quick comment before I drop off the net for a few days.  I like
the series a lot, especially the refactorings in patches 1 to 6.

Am 22.06.2011 03:26, schrieb Jeff King:
> diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
> index 9c750e2..726bf63 100644
> --- a/Documentation/git-archive.txt
> +++ b/Documentation/git-archive.txt
> @@ -101,6 +101,16 @@ tar.umask::
>  	details.  If `--remote` is used then only the configuration of
>  	the remote repository takes effect.
>  
> +tar.<format>.command::

Would switching around format and "command" be better?

	[tar "command"]
		tar.gz = gzip -cn
		tar.xz = xz -c

> diff --git a/archive-tar.c b/archive-tar.c
> index bed9a9b..5c30747 100644
> --- a/archive-tar.c
> +++ b/archive-tar.c
> @@ -4,6 +4,7 @@
>  #include "cache.h"
>  #include "tar.h"
>  #include "archive.h"
> +#include "run-command.h"
>  
>  #define RECORDSIZE	(512)
>  #define BLOCKSIZE	(RECORDSIZE * 20)
> @@ -13,6 +14,9 @@ static unsigned long offset;
>  
>  static int tar_umask = 002;
>  
> +static int write_tar_filter_archive(const struct archiver *ar,
> +				    struct archiver_args *args);
> +
>  /* writes out the whole block, but only if it is full */
>  static void write_if_needed(void)
>  {
> @@ -220,6 +224,60 @@ static int write_global_extended_header(struct archiver_args *args)
>  	return err;
>  }
>  
> +static struct archiver **tar_filters;
> +static int nr_tar_filters;
> +static int alloc_tar_filters;
> +
> +static struct archiver *find_tar_filter(const char *name, int len)
> +{
> +	int i;
> +	for (i = 0; i < nr_tar_filters; i++) {
> +		struct archiver *ar = tar_filters[i];
> +		if (!strncmp(ar->name, name, len) && !ar->name[len])
> +			return ar;
> +	}
> +	return NULL;
> +}
> +
> +static int tar_filter_config(const char *var, const char *value, void *data)
> +{
> +	struct archiver *ar;
> +	const char *dot;
> +	const char *name;
> +	const char *type;
> +	int namelen;
> +
> +	if (prefixcmp(var, "tar."))
> +		return 0;
> +	dot = strrchr(var, '.');
> +	if (dot == var + 9)
> +		return 0;
> +
> +	name = var + 4;
> +	namelen = dot - name;
> +	type = dot + 1;
> +
> +	ar = find_tar_filter(name, namelen);
> +	if (!ar) {
> +		ar = xcalloc(1, sizeof(*ar));
> +		ar->name = xmemdupz(name, namelen);
> +		ar->write_archive = write_tar_filter_archive;
> +		ar->flags = ARCHIVER_WANT_COMPRESSION_LEVELS;
> +		ALLOC_GROW(tar_filters, nr_tar_filters + 1, alloc_tar_filters);
> +		tar_filters[nr_tar_filters++] = ar;
> +	}
> +
> +	if (!strcmp(type, "command")) {
> +		if (!value)
> +			return config_error_nonbool(var);
> +		free(ar->data);
> +		ar->data = xstrdup(value);
> +		return 0;
> +	}

Why not register it right here instead of adding it to the intermediate
list?  And are duplicates handled properly, e.g. system has "gzip -cn"
and local wants "gzip -c"?

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCHv2 7/9] archive: implement configurable tar filters
  2011-06-22  6:09                   ` René Scharfe
@ 2011-06-22 14:59                     ` Jeff King
  0 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-22 14:59 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, J.H., git, git-dev

On Wed, Jun 22, 2011 at 08:09:51AM +0200, René Scharfe wrote:

> just a quick comment before I drop off the net for a few days.  I like
> the series a lot, especially the refactorings in patches 1 to 6.

Thanks. I didn't know if I was going overboard, but the result looked
cleaner to me, so it is good to have a second opinion.

> > +tar.<format>.command::
> 
> Would switching around format and "command" be better?
> 
> 	[tar "command"]
> 		tar.gz = gzip -cn
> 		tar.xz = xz -c

That violates our usual convention that the first and final components
are static, and the middle part can contain everything. So doing:

  git config tar.command.tar.gz "gzip -cn"

is going to end up as:

  [tar "command.tar"]
    gz = gzip -cn

Plus it doesn't leave room for any additional per-command config keys if
we want to add them in the future.

> > +	ar = find_tar_filter(name, namelen);
> > +	if (!ar) {
> > +		ar = xcalloc(1, sizeof(*ar));
> > +		ar->name = xmemdupz(name, namelen);
> > +		ar->write_archive = write_tar_filter_archive;
> > +		ar->flags = ARCHIVER_WANT_COMPRESSION_LEVELS;
> > +		ALLOC_GROW(tar_filters, nr_tar_filters + 1, alloc_tar_filters);
> > +		tar_filters[nr_tar_filters++] = ar;
> > +	}
> > +
> > +	if (!strcmp(type, "command")) {
> > +		if (!value)
> > +			return config_error_nonbool(var);
> > +		free(ar->data);
> > +		ar->data = xstrdup(value);
> > +		return 0;
> > +	}
> 
> Why not register it right here instead of adding it to the intermediate
> list?

If it were just this patch, you could do that. But as soon as you add
more keys (e.g., a later patch adds tar.*.remote), then you run into the
situation of getting only part of the config at a time, and maybe not
getting the full config for a command at all. For example:

  [tar "tar.gz"]
    remote = true

would make an archiver with no "command" set. We would need to
special-case it everywhere to ignore it when we looked at the list, or
later just remove it.  This patch takes the approach of having a
secondary list of all of the configured bits, and then only registering
those that are actually valid.

It also keeps the configured and builtin lists separate. Otherwise I
have to special-case:

  [tar "zip"]
    command = ...

to ignore the builtin zip archiver, which I think is not something we
want to be able to override in this way.

> And are duplicates handled properly, e.g. system has "gzip -cn"
> and local wants "gzip -c"?

Yes. We look up the archiver in the list of configured ones and
overwrite its command field (that's why the .tar.gz patch actually calls
the config parser as if you had those lines in your config file, instead
of registering static archiver structs).

I should probably include a test for that, though.

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCHv2 3/9] archive: refactor list of archive formats
  2011-06-22  1:23                 ` [PATCHv2 3/9] archive: refactor list of archive formats Jeff King
@ 2011-06-23 17:05                   ` Thiago Farina
  2011-06-23 17:30                     ` Jeff King
  0 siblings, 1 reply; 56+ messages in thread
From: Thiago Farina @ 2011-06-23 17:05 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, René Scharfe, J.H., git, git-dev

On Tue, Jun 21, 2011 at 10:23 PM, Jeff King <peff@peff.net> wrote:
> Most of the tar and zip code was nicely split out into two
> abstracted files which knew only about their specific
> formats. The entry point to this code was a single "write
> archive" function.
>
> However, as these basic formats grow more complex (e.g., by
> handling multiple file extensions and format names), a
> static list of the entry point functions won't be enough.
> Instead, let's provide a way for the tar and zip code to
> tell the main archive code what they support by registering
> archiver names and functions.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> New in v2. This turns archivers more into proper objects, rather than a
> hard-coded list of functions, and makes the rest of the series much
> cleaner.
>
>  archive-tar.c |   16 +++++++++++++---
>  archive-zip.c |   13 ++++++++++++-
>  archive.c     |   33 +++++++++++++++++----------------
>  archive.h     |   17 ++++++++++-------
>  4 files changed, 52 insertions(+), 27 deletions(-)
>
> diff --git a/archive-tar.c b/archive-tar.c
> index 1ab1a2c..930375b 100644
> --- a/archive-tar.c
> +++ b/archive-tar.c
> @@ -234,12 +234,10 @@ static int git_tar_config(const char *var, const char *value, void *cb)
>        return 0;
>  }
>
> -int write_tar_archive(struct archiver_args *args)
> +static int write_tar_archive(struct archiver_args *args)
>  {
>        int err = 0;
>
> -       git_config(git_tar_config, NULL);
> -
>        if (args->commit_sha1)
>                err = write_global_extended_header(args);
>        if (!err)
> @@ -248,3 +246,15 @@ int write_tar_archive(struct archiver_args *args)
>                write_trailer();
>        return err;
>  }
> +
> +static struct archiver tar_archiver = {
> +       "tar",
> +       write_tar_archive,
> +       0
A named constant instead of 0, like you did with
ARCHIVER_WANT_COMPRESSION_LEVELS, would be better? 0 here means the
archiver does not want compression?

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCHv2 3/9] archive: refactor list of archive formats
  2011-06-23 17:05                   ` Thiago Farina
@ 2011-06-23 17:30                     ` Jeff King
  0 siblings, 0 replies; 56+ messages in thread
From: Jeff King @ 2011-06-23 17:30 UTC (permalink / raw)
  To: Thiago Farina; +Cc: Junio C Hamano, René Scharfe, J.H., git, git-dev

On Thu, Jun 23, 2011 at 02:05:35PM -0300, Thiago Farina wrote:

> > +static struct archiver tar_archiver = {
> > +       "tar",
> > +       write_tar_archive,
> > +       0
> A named constant instead of 0, like you did with
> ARCHIVER_WANT_COMPRESSION_LEVELS, would be better? 0 here means the
> archiver does not want compression?

It's actually a bit-wise flag, so it is not "no compression", but "no
flags". So "0" is fairly idiomatic. Given that it's a static initializer
that will default to 0, probably a more readable version would be:

  static struct archiver tar_archiver = {
          "tar",
          write_tar_archive
  };

-Peff

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2011-06-23 17:31 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-14 18:17 [PATCH 1/2] archive: factor out write phase of tar format Jeff King
2011-06-14 18:18 ` [PATCH 2/2] archive: support gzipped tar files Jeff King
2011-06-14 19:25   ` J.H.
2011-06-14 19:30     ` Jeff King
2011-06-14 19:39   ` René Scharfe
2011-06-14 20:14     ` Jeff King
2011-06-14 20:45       ` Jeff King
2011-06-15 22:30         ` [RFC/PATCH 0/7] user-configurable git-archive output formats Jeff King
2011-06-15 22:31           ` [PATCH 1/7] archive: reorder option parsing and config reading Jeff King
2011-06-15 22:33           ` [PATCH 2/7] archive: add user-configurable tar-filter infrastructure Jeff King
2011-06-15 23:33             ` Junio C Hamano
2011-06-16  0:29               ` Jeff King
2011-06-15 22:33           ` [PATCH 3/7] archive: support user tar-filters via --format Jeff King
2011-06-15 22:33           ` [PATCH 4/7] archive: advertise user tar-filters in --list Jeff King
2011-06-15 22:34           ` [PATCH 5/7] archive: refactor format-guessing from filename Jeff King
2011-06-15 23:48             ` Junio C Hamano
2011-06-16  0:34               ` Jeff King
2011-06-15 22:34           ` [PATCH 6/7] archive: match extensions from user-configured formats Jeff King
2011-06-15 22:35           ` [PATCH 7/7] archive: provide builtin .tar.gz filter Jeff King
2011-06-15 23:55             ` Junio C Hamano
2011-06-15 23:57               ` Junio C Hamano
2011-06-16  0:38               ` Jeff King
2011-06-16  6:27                 ` Junio C Hamano
2011-06-16  6:51                   ` Jeff King
2011-06-16  7:56                     ` Chris Webb
2011-06-16 17:46                       ` Jeff King
2011-06-16 18:02                         ` Junio C Hamano
2011-06-16 18:21                           ` Jeff King
2011-06-16 18:27                             ` John Szakmeister
2011-06-16 18:42                             ` Junio C Hamano
2011-06-16 18:57                               ` Jeff King
2011-06-18 14:52           ` [RFC/PATCH 0/7] user-configurable git-archive output formats René Scharfe
2011-06-18 15:28             ` Jakub Narebski
2011-06-20 15:58             ` Junio C Hamano
2011-06-22  1:19               ` [PATCHv2 0/9] configurable tar compressors Jeff King
2011-06-22  1:20                 ` [PATCHv2 1/9] archive: reorder option parsing and config reading Jeff King
2011-06-22  1:22                 ` [PATCHv2 2/9] archive-tar: don't reload default config options Jeff King
2011-06-22  1:23                 ` [PATCHv2 3/9] archive: refactor list of archive formats Jeff King
2011-06-23 17:05                   ` Thiago Farina
2011-06-23 17:30                     ` Jeff King
2011-06-22  1:24                 ` [PATCHv2 4/9] archive: pass archiver struct to write_archive callback Jeff King
2011-06-22  1:24                 ` [PATCHv2 5/9] archive: move file extension format-guessing lower Jeff King
2011-06-22  1:25                 ` [PATCHv2 6/9] archive: refactor file extension format-guessing Jeff King
2011-06-22  1:26                 ` [PATCHv2 7/9] archive: implement configurable tar filters Jeff King
2011-06-22  1:45                   ` Jeff King
2011-06-22  6:09                   ` René Scharfe
2011-06-22 14:59                     ` Jeff King
2011-06-22  1:27                 ` [PATCHv2 8/9] archive: provide builtin .tar.gz filter Jeff King
2011-06-22  1:35                 ` [PATCHv2 9/9] upload-archive: allow user to turn off filters Jeff King
2011-06-22  3:17                   ` Jeff King
2011-06-21 16:01             ` [RFC/PATCH 0/7] user-configurable git-archive output formats Jeff King
2011-06-18 15:40           ` René Scharfe
2011-06-14 20:30   ` [PATCH 2/2] archive: support gzipped tar files Junio C Hamano
2011-06-14 20:49     ` Jeff King
2011-06-14 23:40       ` Miles Bader
2011-06-15 22:46         ` Jeff King

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.