All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] import-tars: use Archive::Tar instead of unpack()
@ 2007-02-12 14:17 Michael Loeffler
  2007-02-12 14:25 ` Johannes Schindelin
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Michael Loeffler @ 2007-02-12 14:17 UTC (permalink / raw)
  To: git

this is less obscure, does not use gzcat (which is often installed as zcat)
and it is shorter.

Signed-off-by: Michael Loeffler <zvpunry@zvpunry.de>
---

This version does no longer support bzip2 or compress which will be fixed in
an amend. I did this patch to solve 2 problems. Maybe I do another patch with
GetoptLong and bzip2/compress support.

The first is a bug with this $git_mode variable which should be 0644 of
0755, but nothing else I think.

The second problem was the usage of gzcat, I don't have this link
(Debian sid).

diff --git a/contrib/fast-import/import-tars.perl b/contrib/fast-import/import-tars.perl
index 26c42c9..c084573 100755
--- a/contrib/fast-import/import-tars.perl
+++ b/contrib/fast-import/import-tars.perl
@@ -10,6 +10,10 @@
 ##
 
 use strict;
+use Archive::Tar;
+use Archive::Tar::File;
+use Archive::Tar::Constant;
+
 die "usage: import-tars *.tar.{gz,bz2,Z}\n" unless @ARGV;
 
 my $branch_name = 'import-tars';
@@ -23,48 +27,25 @@ foreach my $tar_file (@ARGV)
 {
 	$tar_file =~ m,([^/]+)$,;
 	my $tar_name = $1;
-
-	if ($tar_name =~ s/\.(tar\.gz|tgz)$//) {
-		open(I, '-|', 'gzcat', $tar_file) or die "Unable to gzcat $tar_file: $!\n";
-	} elsif ($tar_name =~ s/\.(tar\.bz2|tbz2)$//) {
-		open(I, '-|', 'bzcat', $tar_file) or die "Unable to bzcat $tar_file: $!\n";
-	} elsif ($tar_name =~ s/\.tar\.Z$//) {
-		open(I, '-|', 'zcat', $tar_file) or die "Unable to zcat $tar_file: $!\n";
-	} elsif ($tar_name =~ s/\.tar$//) {
-		open(I, $tar_file) or die "Unable to open $tar_file: $!\n";
-	} else {
-		die "Unrecognized compression format: $tar_file\n";
-	}
+	$tar_name =~ s/\.(tar|tgz|tar\.gz)$//;
+	my $tar = new Archive::Tar($tar_file) or die "Unable to open $tar_file: $!\n";
 
 	my $commit_time = 0;
 	my $next_mark = 1;
 	my $have_top_dir = 1;
 	my ($top_dir, %files);
 
-	while (read(I, $_, 512) == 512) {
-		my ($name, $mode, $uid, $gid, $size, $mtime,
-			$chksum, $typeflag, $linkname, $magic,
-			$version, $uname, $gname, $devmajor, $devminor,
-			$prefix) = unpack 'Z100 Z8 Z8 Z8 Z12 Z12
-			Z8 Z1 Z100 Z6
-			Z2 Z32 Z32 Z8 Z8 Z*', $_;
-		last unless $name;
-		$mode = oct $mode;
-		$size = oct $size;
-		$mtime = oct $mtime;
-		next if $mode & 0040000;
-
-		print FI "blob\n", "mark :$next_mark\n", "data $size\n";
-		while ($size > 0 && read(I, $_, 512) == 512) {
-			print FI substr($_, 0, $size);
-			$size -= 512;
-		}
-		print FI "\n";
-
-		my $path = "$prefix$name";
-		$files{$path} = [$next_mark++, $mode];
-
-		$commit_time = $mtime if $mtime > $commit_time;
+	foreach my $entry ($tar->get_files()) {
+		next if $entry->type != FILE;
+
+		printf FI "blob\nmark :%s\ndata %s\n%s\n", $next_mark,
+			$entry->size, $entry->get_content();
+
+		my $path = $entry->prefix . $entry->name;
+		$files{$path} = [$next_mark++, $entry->mode];
+
+		$commit_time = $entry->mtime if $entry->mtime > $commit_time;
+
 		$path =~ m,^([^/]+)/,;
 		$top_dir = $1 unless $top_dir;
 		$have_top_dir = 0 if $top_dir ne $1;
@@ -83,10 +64,8 @@ EOF
 	foreach my $path (keys %files)
 	{
 		my ($mark, $mode) = @{$files{$path}};
-		my $git_mode = 0644;
-		$git_mode |= 0700 if $mode & 0111;
 		$path =~ s,^([^/]+)/,, if $have_top_dir;
-		printf FI "M %o :%i %s\n", $git_mode, $mark, $path;
+		printf FI "M %o :%i %s\n", $mode & 0111 ? 0755 : 0644, $mark, $path;
 	}
 	print FI "\n";
 
-- 
1.5.0.rc4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] import-tars: use Archive::Tar instead of unpack()
  2007-02-12 14:17 [PATCH] import-tars: use Archive::Tar instead of unpack() Michael Loeffler
@ 2007-02-12 14:25 ` Johannes Schindelin
  2007-02-12 17:28 ` Shawn O. Pearce
       [not found] ` <127B27FE-1F9A-4328-A87A-77B907FFEBA7@zvpunry.de>
  2 siblings, 0 replies; 7+ messages in thread
From: Johannes Schindelin @ 2007-02-12 14:25 UTC (permalink / raw)
  To: Michael Loeffler; +Cc: git

Hi,

On Mon, 12 Feb 2007, Michael Loeffler wrote:

> this is less obscure, does not use gzcat (which is often installed as 
> zcat) and it is shorter.

... and it relies on a package which is usually not installed. Whereas you 
can symbolically link zcat to gzcat.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] import-tars: use Archive::Tar instead of unpack()
  2007-02-12 14:17 [PATCH] import-tars: use Archive::Tar instead of unpack() Michael Loeffler
  2007-02-12 14:25 ` Johannes Schindelin
@ 2007-02-12 17:28 ` Shawn O. Pearce
  2007-02-14 16:03   ` Michael Loeffler
       [not found] ` <127B27FE-1F9A-4328-A87A-77B907FFEBA7@zvpunry.de>
  2 siblings, 1 reply; 7+ messages in thread
From: Shawn O. Pearce @ 2007-02-12 17:28 UTC (permalink / raw)
  To: Michael Loeffler; +Cc: git

Michael Loeffler <zvpunry@zvpunry.de> wrote:
> This version does no longer support bzip2 or compress which will be fixed in
> an amend. I did this patch to solve 2 problems. Maybe I do another patch with
> GetoptLong and bzip2/compress support.

bzip2 and compress are popular formats applied to tars.
 
> @@ -10,6 +10,10 @@
>  ##
>  
>  use strict;
> +use Archive::Tar;
> +use Archive::Tar::File;
> +use Archive::Tar::Constant;
> +

I did not apply this hunk.  Not everyone has Archive::Tar installed.
But then again, not everyone will use this example program either.
I'm debating it.  Archive::Tar's parser will certainly be much
more robust than the one I hand-crafted.  It might also let us deal
with symlinks.  ;)

I would considering applying something like this if it would also
support at least bz2.  This is an example program meant to teach
people how to use fast-import, and maybe also to help someone who
wants to quickly import one or more .tar.gz for use with git-grep.
Requiring Archive::Tar here is not the end of Git as we know it. :)

> @@ -83,10 +64,8 @@ EOF
>  	foreach my $path (keys %files)
>  	{
>  		my ($mark, $mode) = @{$files{$path}};
> -		my $git_mode = 0644;
> -		$git_mode |= 0700 if $mode & 0111;
>  		$path =~ s,^([^/]+)/,, if $have_top_dir;
> -		printf FI "M %o :%i %s\n", $git_mode, $mark, $path;
> +		printf FI "M %o :%i %s\n", $mode & 0111 ? 0755 : 0644, $mark, $path;
>  	}
>  	print FI "\n";

This hunk is completely unrelated to the Archive::Tar rewrite.
It also fixes a rather embarrassing bug on my part; I should
have been able to get the mode right!  :-)

I've applied this hunk (and only this hunk) to my fastimport tree
and pushed it out.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] import-tars: use Archive::Tar instead of unpack()
  2007-02-12 17:28 ` Shawn O. Pearce
@ 2007-02-14 16:03   ` Michael Loeffler
  2007-02-15  2:51     ` Shawn O. Pearce
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Loeffler @ 2007-02-14 16:03 UTC (permalink / raw)
  To: git

Am Montag, den 12.02.2007, 12:28 -0500 schrieb Shawn O. Pearce:
... 
> bzip2 and compress are popular formats applied to tars.
yes, and this should be supported. There is an simple example in the
Archive::Tar manpage.

So something like "$tar = new Archive::Tar(*I)" is possible to use the
original filehandle but I removed this code before reading the whole
manpage.

>  
> > @@ -10,6 +10,10 @@
> >  ##
> >  
> >  use strict;
> > +use Archive::Tar;
> > +use Archive::Tar::File;
> > +use Archive::Tar::Constant;
> > +
> 
> I did not apply this hunk.  Not everyone has Archive::Tar installed.
And Archive::Tar is a little bit slower then your unpack() and it reads
the whole file into memory which is not so good.

> But then again, not everyone will use this example program either.
I have used it to play a bit with this nice fast-import stuff.

> I'm debating it.  Archive::Tar's parser will certainly be much
> more robust than the one I hand-crafted.  It might also let us deal
> with symlinks.  ;)
Maybe, but I haven't tested it with some bigger tar files.

> I would considering applying something like this if it would also
> support at least bz2.  This is an example program meant to teach
> people how to use fast-import, and maybe also to help someone who
> wants to quickly import one or more .tar.gz for use with git-grep.
> Requiring Archive::Tar here is not the end of Git as we know it. :)
I think we should stay with your unpack() version, it works, is faster
and doesn't read the whole file into memory.

In the Archive::Tar manpage they use gunzip and uncompress instead of
zcat or gzcat, so what do you think about the following patch?

diff --git a/contrib/fast-import/import-tars.perl b/contrib/fast-import/import-tars.perl
index 990c9e7..5585a8b 100755
--- a/contrib/fast-import/import-tars.perl
+++ b/contrib/fast-import/import-tars.perl
@@ -25,11 +25,14 @@ foreach my $tar_file (@ARGV)
 	my $tar_name = $1;
 
 	if ($tar_name =~ s/\.(tar\.gz|tgz)$//) {
-		open(I, '-|', 'gzcat', $tar_file) or die "Unable to gzcat $tar_file: $!\n";
+		open(I, '-|', 'gunzip', '-c', $tar_file)
+			or die "Unable to gunzip -c $tar_file: $!\n";
 	} elsif ($tar_name =~ s/\.(tar\.bz2|tbz2)$//) {
-		open(I, '-|', 'bzcat', $tar_file) or die "Unable to bzcat $tar_file: $!\n";
+		open(I, '-|', 'bunzip2', '-c', $tar_file)
+			or die "Unable to bunzip2 -c $tar_file: $!\n";
 	} elsif ($tar_name =~ s/\.tar\.Z$//) {
-		open(I, '-|', 'zcat', $tar_file) or die "Unable to zcat $tar_file: $!\n";
+		open(I, '-|', 'uncompress', '-c', $tar_file)
+			or die "Unable to uncompress -c $tar_file: $!\n";
 	} elsif ($tar_name =~ s/\.tar$//) {
 		open(I, $tar_file) or die "Unable to open $tar_file: $!\n";
 	} else {


... the $git_mode patch ...
> 
> This hunk is completely unrelated to the Archive::Tar rewrite.
Yes, I had this change first in another patch but i forgot to mail it to
the list.

...
> I've applied this hunk (and only this hunk) to my fastimport tree
> and pushed it out.
Good, I no longer think this Archive::Tar stuff is such a good idea.

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] import-tars: use Archive::Tar instead of unpack()
  2007-02-14 16:03   ` Michael Loeffler
@ 2007-02-15  2:51     ` Shawn O. Pearce
  0 siblings, 0 replies; 7+ messages in thread
From: Shawn O. Pearce @ 2007-02-15  2:51 UTC (permalink / raw)
  To: Michael Loeffler; +Cc: git

Michael Loeffler <zvpunry@zvpunry.de> wrote:
> In the Archive::Tar manpage they use gunzip and uncompress instead of
> zcat or gzcat, so what do you think about the following patch?
> 
> diff --git a/contrib/fast-import/import-tars.perl b/contrib/fast-import/import-tars.perl
> index 990c9e7..5585a8b 100755
> --- a/contrib/fast-import/import-tars.perl
> +++ b/contrib/fast-import/import-tars.perl
> @@ -25,11 +25,14 @@ foreach my $tar_file (@ARGV)
>  	my $tar_name = $1;
>  
>  	if ($tar_name =~ s/\.(tar\.gz|tgz)$//) {
> -		open(I, '-|', 'gzcat', $tar_file) or die "Unable to gzcat $tar_file: $!\n";
> +		open(I, '-|', 'gunzip', '-c', $tar_file)
> +			or die "Unable to gunzip -c $tar_file: $!\n";
>  	} elsif ($tar_name =~ s/\.(tar\.bz2|tbz2)$//) {
> -		open(I, '-|', 'bzcat', $tar_file) or die "Unable to bzcat $tar_file: $!\n";
> +		open(I, '-|', 'bunzip2', '-c', $tar_file)
> +			or die "Unable to bunzip2 -c $tar_file: $!\n";
>  	} elsif ($tar_name =~ s/\.tar\.Z$//) {
> -		open(I, '-|', 'zcat', $tar_file) or die "Unable to zcat $tar_file: $!\n";
> +		open(I, '-|', 'uncompress', '-c', $tar_file)
> +			or die "Unable to uncompress -c $tar_file: $!\n";
>  	} elsif ($tar_name =~ s/\.tar$//) {
>  		open(I, $tar_file) or die "Unable to open $tar_file: $!\n";
>  	} else {

Yes, that's what I should have done initially.  Thanks.  How about
the following message and sbo?

commit 908387056949c0fb8153fbb84f4dbeb6695611e6
Author: Michael Loeffler <zvpunry@zvpunry.de>
Date:   Wed Feb 14 17:03:12 2007 +0100

    Use gunzip -c over gzcat in import-tars example.
    
    Not everyone has gzcat or bzcat installed on their system, but
    gunzip -c and bunzip2 -c perform the same task and are available
    if the user has installed gzip support or bzip2 support.
    
    Signed-off-by: Michael Loeffler <zvpunry@zvpunry.de>
    Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

-- 
Shawn.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] import-tars: use Archive::Tar instead of unpack()
       [not found] ` <127B27FE-1F9A-4328-A87A-77B907FFEBA7@zvpunry.de>
@ 2007-04-24 10:13   ` Karl Hasselström
  2007-04-24 10:55     ` Sam Vilain
  0 siblings, 1 reply; 7+ messages in thread
From: Karl Hasselström @ 2007-04-24 10:13 UTC (permalink / raw)
  To: Michael Loeffler; +Cc: git

On 2007-04-24 10:39:54 +0200, Michael Loeffler wrote:

> Search for this mail and try the patch.
>
> > Von: Michael Loeffler <zvpunry@zvpunry.de>
> > Datum: 12. Februar 2007 15:17:11 MEZ
> > An: git@vger.kernel.org
> > Betreff: [PATCH] import-tars: use Archive::Tar instead of unpack()

Sorry, the same objection as Shawn raised in the original thread
applies here too: I don't have Archive::Tar installed! :-(

I might try it later, but I don't really have the time to chase after
that dependency right now. Sorry.

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] import-tars: use Archive::Tar instead of unpack()
  2007-04-24 10:13   ` Karl Hasselström
@ 2007-04-24 10:55     ` Sam Vilain
  0 siblings, 0 replies; 7+ messages in thread
From: Sam Vilain @ 2007-04-24 10:55 UTC (permalink / raw)
  To: Karl Hasselström; +Cc: Michael Loeffler, git

Karl Hasselström wrote:
> Sorry, the same objection as Shawn raised in the original thread
> applies here too: I don't have Archive::Tar installed! :-(
>
> I might try it later, but I don't really have the time to chase after
> that dependency right now. Sorry.
>   

How about allowing dependencies, so long as they are included as submodules?

Sam.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2007-04-24 10:56 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-02-12 14:17 [PATCH] import-tars: use Archive::Tar instead of unpack() Michael Loeffler
2007-02-12 14:25 ` Johannes Schindelin
2007-02-12 17:28 ` Shawn O. Pearce
2007-02-14 16:03   ` Michael Loeffler
2007-02-15  2:51     ` Shawn O. Pearce
     [not found] ` <127B27FE-1F9A-4328-A87A-77B907FFEBA7@zvpunry.de>
2007-04-24 10:13   ` Karl Hasselström
2007-04-24 10:55     ` Sam Vilain

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.