* [PATCH] import-tars: use Archive::Tar instead of unpack()
@ 2007-02-12 14:17 Michael Loeffler
2007-02-12 14:25 ` Johannes Schindelin
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Michael Loeffler @ 2007-02-12 14:17 UTC (permalink / raw)
To: git
this is less obscure, does not use gzcat (which is often installed as zcat)
and it is shorter.
Signed-off-by: Michael Loeffler <zvpunry@zvpunry.de>
---
This version does no longer support bzip2 or compress which will be fixed in
an amend. I did this patch to solve 2 problems. Maybe I do another patch with
GetoptLong and bzip2/compress support.
The first is a bug with this $git_mode variable which should be 0644 of
0755, but nothing else I think.
The second problem was the usage of gzcat, I don't have this link
(Debian sid).
diff --git a/contrib/fast-import/import-tars.perl b/contrib/fast-import/import-tars.perl
index 26c42c9..c084573 100755
--- a/contrib/fast-import/import-tars.perl
+++ b/contrib/fast-import/import-tars.perl
@@ -10,6 +10,10 @@
##
use strict;
+use Archive::Tar;
+use Archive::Tar::File;
+use Archive::Tar::Constant;
+
die "usage: import-tars *.tar.{gz,bz2,Z}\n" unless @ARGV;
my $branch_name = 'import-tars';
@@ -23,48 +27,25 @@ foreach my $tar_file (@ARGV)
{
$tar_file =~ m,([^/]+)$,;
my $tar_name = $1;
-
- if ($tar_name =~ s/\.(tar\.gz|tgz)$//) {
- open(I, '-|', 'gzcat', $tar_file) or die "Unable to gzcat $tar_file: $!\n";
- } elsif ($tar_name =~ s/\.(tar\.bz2|tbz2)$//) {
- open(I, '-|', 'bzcat', $tar_file) or die "Unable to bzcat $tar_file: $!\n";
- } elsif ($tar_name =~ s/\.tar\.Z$//) {
- open(I, '-|', 'zcat', $tar_file) or die "Unable to zcat $tar_file: $!\n";
- } elsif ($tar_name =~ s/\.tar$//) {
- open(I, $tar_file) or die "Unable to open $tar_file: $!\n";
- } else {
- die "Unrecognized compression format: $tar_file\n";
- }
+ $tar_name =~ s/\.(tar|tgz|tar\.gz)$//;
+ my $tar = new Archive::Tar($tar_file) or die "Unable to open $tar_file: $!\n";
my $commit_time = 0;
my $next_mark = 1;
my $have_top_dir = 1;
my ($top_dir, %files);
- while (read(I, $_, 512) == 512) {
- my ($name, $mode, $uid, $gid, $size, $mtime,
- $chksum, $typeflag, $linkname, $magic,
- $version, $uname, $gname, $devmajor, $devminor,
- $prefix) = unpack 'Z100 Z8 Z8 Z8 Z12 Z12
- Z8 Z1 Z100 Z6
- Z2 Z32 Z32 Z8 Z8 Z*', $_;
- last unless $name;
- $mode = oct $mode;
- $size = oct $size;
- $mtime = oct $mtime;
- next if $mode & 0040000;
-
- print FI "blob\n", "mark :$next_mark\n", "data $size\n";
- while ($size > 0 && read(I, $_, 512) == 512) {
- print FI substr($_, 0, $size);
- $size -= 512;
- }
- print FI "\n";
-
- my $path = "$prefix$name";
- $files{$path} = [$next_mark++, $mode];
-
- $commit_time = $mtime if $mtime > $commit_time;
+ foreach my $entry ($tar->get_files()) {
+ next if $entry->type != FILE;
+
+ printf FI "blob\nmark :%s\ndata %s\n%s\n", $next_mark,
+ $entry->size, $entry->get_content();
+
+ my $path = $entry->prefix . $entry->name;
+ $files{$path} = [$next_mark++, $entry->mode];
+
+ $commit_time = $entry->mtime if $entry->mtime > $commit_time;
+
$path =~ m,^([^/]+)/,;
$top_dir = $1 unless $top_dir;
$have_top_dir = 0 if $top_dir ne $1;
@@ -83,10 +64,8 @@ EOF
foreach my $path (keys %files)
{
my ($mark, $mode) = @{$files{$path}};
- my $git_mode = 0644;
- $git_mode |= 0700 if $mode & 0111;
$path =~ s,^([^/]+)/,, if $have_top_dir;
- printf FI "M %o :%i %s\n", $git_mode, $mark, $path;
+ printf FI "M %o :%i %s\n", $mode & 0111 ? 0755 : 0644, $mark, $path;
}
print FI "\n";
--
1.5.0.rc4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] import-tars: use Archive::Tar instead of unpack()
2007-02-12 14:17 [PATCH] import-tars: use Archive::Tar instead of unpack() Michael Loeffler
@ 2007-02-12 14:25 ` Johannes Schindelin
2007-02-12 17:28 ` Shawn O. Pearce
[not found] ` <127B27FE-1F9A-4328-A87A-77B907FFEBA7@zvpunry.de>
2 siblings, 0 replies; 7+ messages in thread
From: Johannes Schindelin @ 2007-02-12 14:25 UTC (permalink / raw)
To: Michael Loeffler; +Cc: git
Hi,
On Mon, 12 Feb 2007, Michael Loeffler wrote:
> this is less obscure, does not use gzcat (which is often installed as
> zcat) and it is shorter.
... and it relies on a package which is usually not installed. Whereas you
can symbolically link zcat to gzcat.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] import-tars: use Archive::Tar instead of unpack()
2007-02-12 14:17 [PATCH] import-tars: use Archive::Tar instead of unpack() Michael Loeffler
2007-02-12 14:25 ` Johannes Schindelin
@ 2007-02-12 17:28 ` Shawn O. Pearce
2007-02-14 16:03 ` Michael Loeffler
[not found] ` <127B27FE-1F9A-4328-A87A-77B907FFEBA7@zvpunry.de>
2 siblings, 1 reply; 7+ messages in thread
From: Shawn O. Pearce @ 2007-02-12 17:28 UTC (permalink / raw)
To: Michael Loeffler; +Cc: git
Michael Loeffler <zvpunry@zvpunry.de> wrote:
> This version does no longer support bzip2 or compress which will be fixed in
> an amend. I did this patch to solve 2 problems. Maybe I do another patch with
> GetoptLong and bzip2/compress support.
bzip2 and compress are popular formats applied to tars.
> @@ -10,6 +10,10 @@
> ##
>
> use strict;
> +use Archive::Tar;
> +use Archive::Tar::File;
> +use Archive::Tar::Constant;
> +
I did not apply this hunk. Not everyone has Archive::Tar installed.
But then again, not everyone will use this example program either.
I'm debating it. Archive::Tar's parser will certainly be much
more robust than the one I hand-crafted. It might also let us deal
with symlinks. ;)
I would considering applying something like this if it would also
support at least bz2. This is an example program meant to teach
people how to use fast-import, and maybe also to help someone who
wants to quickly import one or more .tar.gz for use with git-grep.
Requiring Archive::Tar here is not the end of Git as we know it. :)
> @@ -83,10 +64,8 @@ EOF
> foreach my $path (keys %files)
> {
> my ($mark, $mode) = @{$files{$path}};
> - my $git_mode = 0644;
> - $git_mode |= 0700 if $mode & 0111;
> $path =~ s,^([^/]+)/,, if $have_top_dir;
> - printf FI "M %o :%i %s\n", $git_mode, $mark, $path;
> + printf FI "M %o :%i %s\n", $mode & 0111 ? 0755 : 0644, $mark, $path;
> }
> print FI "\n";
This hunk is completely unrelated to the Archive::Tar rewrite.
It also fixes a rather embarrassing bug on my part; I should
have been able to get the mode right! :-)
I've applied this hunk (and only this hunk) to my fastimport tree
and pushed it out.
--
Shawn.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] import-tars: use Archive::Tar instead of unpack()
2007-02-12 17:28 ` Shawn O. Pearce
@ 2007-02-14 16:03 ` Michael Loeffler
2007-02-15 2:51 ` Shawn O. Pearce
0 siblings, 1 reply; 7+ messages in thread
From: Michael Loeffler @ 2007-02-14 16:03 UTC (permalink / raw)
To: git
Am Montag, den 12.02.2007, 12:28 -0500 schrieb Shawn O. Pearce:
...
> bzip2 and compress are popular formats applied to tars.
yes, and this should be supported. There is an simple example in the
Archive::Tar manpage.
So something like "$tar = new Archive::Tar(*I)" is possible to use the
original filehandle but I removed this code before reading the whole
manpage.
>
> > @@ -10,6 +10,10 @@
> > ##
> >
> > use strict;
> > +use Archive::Tar;
> > +use Archive::Tar::File;
> > +use Archive::Tar::Constant;
> > +
>
> I did not apply this hunk. Not everyone has Archive::Tar installed.
And Archive::Tar is a little bit slower then your unpack() and it reads
the whole file into memory which is not so good.
> But then again, not everyone will use this example program either.
I have used it to play a bit with this nice fast-import stuff.
> I'm debating it. Archive::Tar's parser will certainly be much
> more robust than the one I hand-crafted. It might also let us deal
> with symlinks. ;)
Maybe, but I haven't tested it with some bigger tar files.
> I would considering applying something like this if it would also
> support at least bz2. This is an example program meant to teach
> people how to use fast-import, and maybe also to help someone who
> wants to quickly import one or more .tar.gz for use with git-grep.
> Requiring Archive::Tar here is not the end of Git as we know it. :)
I think we should stay with your unpack() version, it works, is faster
and doesn't read the whole file into memory.
In the Archive::Tar manpage they use gunzip and uncompress instead of
zcat or gzcat, so what do you think about the following patch?
diff --git a/contrib/fast-import/import-tars.perl b/contrib/fast-import/import-tars.perl
index 990c9e7..5585a8b 100755
--- a/contrib/fast-import/import-tars.perl
+++ b/contrib/fast-import/import-tars.perl
@@ -25,11 +25,14 @@ foreach my $tar_file (@ARGV)
my $tar_name = $1;
if ($tar_name =~ s/\.(tar\.gz|tgz)$//) {
- open(I, '-|', 'gzcat', $tar_file) or die "Unable to gzcat $tar_file: $!\n";
+ open(I, '-|', 'gunzip', '-c', $tar_file)
+ or die "Unable to gunzip -c $tar_file: $!\n";
} elsif ($tar_name =~ s/\.(tar\.bz2|tbz2)$//) {
- open(I, '-|', 'bzcat', $tar_file) or die "Unable to bzcat $tar_file: $!\n";
+ open(I, '-|', 'bunzip2', '-c', $tar_file)
+ or die "Unable to bunzip2 -c $tar_file: $!\n";
} elsif ($tar_name =~ s/\.tar\.Z$//) {
- open(I, '-|', 'zcat', $tar_file) or die "Unable to zcat $tar_file: $!\n";
+ open(I, '-|', 'uncompress', '-c', $tar_file)
+ or die "Unable to uncompress -c $tar_file: $!\n";
} elsif ($tar_name =~ s/\.tar$//) {
open(I, $tar_file) or die "Unable to open $tar_file: $!\n";
} else {
... the $git_mode patch ...
>
> This hunk is completely unrelated to the Archive::Tar rewrite.
Yes, I had this change first in another patch but i forgot to mail it to
the list.
...
> I've applied this hunk (and only this hunk) to my fastimport tree
> and pushed it out.
Good, I no longer think this Archive::Tar stuff is such a good idea.
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] import-tars: use Archive::Tar instead of unpack()
2007-02-14 16:03 ` Michael Loeffler
@ 2007-02-15 2:51 ` Shawn O. Pearce
0 siblings, 0 replies; 7+ messages in thread
From: Shawn O. Pearce @ 2007-02-15 2:51 UTC (permalink / raw)
To: Michael Loeffler; +Cc: git
Michael Loeffler <zvpunry@zvpunry.de> wrote:
> In the Archive::Tar manpage they use gunzip and uncompress instead of
> zcat or gzcat, so what do you think about the following patch?
>
> diff --git a/contrib/fast-import/import-tars.perl b/contrib/fast-import/import-tars.perl
> index 990c9e7..5585a8b 100755
> --- a/contrib/fast-import/import-tars.perl
> +++ b/contrib/fast-import/import-tars.perl
> @@ -25,11 +25,14 @@ foreach my $tar_file (@ARGV)
> my $tar_name = $1;
>
> if ($tar_name =~ s/\.(tar\.gz|tgz)$//) {
> - open(I, '-|', 'gzcat', $tar_file) or die "Unable to gzcat $tar_file: $!\n";
> + open(I, '-|', 'gunzip', '-c', $tar_file)
> + or die "Unable to gunzip -c $tar_file: $!\n";
> } elsif ($tar_name =~ s/\.(tar\.bz2|tbz2)$//) {
> - open(I, '-|', 'bzcat', $tar_file) or die "Unable to bzcat $tar_file: $!\n";
> + open(I, '-|', 'bunzip2', '-c', $tar_file)
> + or die "Unable to bunzip2 -c $tar_file: $!\n";
> } elsif ($tar_name =~ s/\.tar\.Z$//) {
> - open(I, '-|', 'zcat', $tar_file) or die "Unable to zcat $tar_file: $!\n";
> + open(I, '-|', 'uncompress', '-c', $tar_file)
> + or die "Unable to uncompress -c $tar_file: $!\n";
> } elsif ($tar_name =~ s/\.tar$//) {
> open(I, $tar_file) or die "Unable to open $tar_file: $!\n";
> } else {
Yes, that's what I should have done initially. Thanks. How about
the following message and sbo?
commit 908387056949c0fb8153fbb84f4dbeb6695611e6
Author: Michael Loeffler <zvpunry@zvpunry.de>
Date: Wed Feb 14 17:03:12 2007 +0100
Use gunzip -c over gzcat in import-tars example.
Not everyone has gzcat or bzcat installed on their system, but
gunzip -c and bunzip2 -c perform the same task and are available
if the user has installed gzip support or bzip2 support.
Signed-off-by: Michael Loeffler <zvpunry@zvpunry.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
--
Shawn.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] import-tars: use Archive::Tar instead of unpack()
[not found] ` <127B27FE-1F9A-4328-A87A-77B907FFEBA7@zvpunry.de>
@ 2007-04-24 10:13 ` Karl Hasselström
2007-04-24 10:55 ` Sam Vilain
0 siblings, 1 reply; 7+ messages in thread
From: Karl Hasselström @ 2007-04-24 10:13 UTC (permalink / raw)
To: Michael Loeffler; +Cc: git
On 2007-04-24 10:39:54 +0200, Michael Loeffler wrote:
> Search for this mail and try the patch.
>
> > Von: Michael Loeffler <zvpunry@zvpunry.de>
> > Datum: 12. Februar 2007 15:17:11 MEZ
> > An: git@vger.kernel.org
> > Betreff: [PATCH] import-tars: use Archive::Tar instead of unpack()
Sorry, the same objection as Shawn raised in the original thread
applies here too: I don't have Archive::Tar installed! :-(
I might try it later, but I don't really have the time to chase after
that dependency right now. Sorry.
--
Karl Hasselström, kha@treskal.com
www.treskal.com/kalle
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] import-tars: use Archive::Tar instead of unpack()
2007-04-24 10:13 ` Karl Hasselström
@ 2007-04-24 10:55 ` Sam Vilain
0 siblings, 0 replies; 7+ messages in thread
From: Sam Vilain @ 2007-04-24 10:55 UTC (permalink / raw)
To: Karl Hasselström; +Cc: Michael Loeffler, git
Karl Hasselström wrote:
> Sorry, the same objection as Shawn raised in the original thread
> applies here too: I don't have Archive::Tar installed! :-(
>
> I might try it later, but I don't really have the time to chase after
> that dependency right now. Sorry.
>
How about allowing dependencies, so long as they are included as submodules?
Sam.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2007-04-24 10:56 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-02-12 14:17 [PATCH] import-tars: use Archive::Tar instead of unpack() Michael Loeffler
2007-02-12 14:25 ` Johannes Schindelin
2007-02-12 17:28 ` Shawn O. Pearce
2007-02-14 16:03 ` Michael Loeffler
2007-02-15 2:51 ` Shawn O. Pearce
[not found] ` <127B27FE-1F9A-4328-A87A-77B907FFEBA7@zvpunry.de>
2007-04-24 10:13 ` Karl Hasselström
2007-04-24 10:55 ` Sam Vilain
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.