* [SCRIPT] chomp: trim trailing whitespace
@ 2006-05-27 2:27 Jeff Garzik
2006-05-27 4:17 ` H. Peter Anvin
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Jeff Garzik @ 2006-05-27 2:27 UTC (permalink / raw)
To: Git Mailing List; +Cc: Linux Kernel
[-- Attachment #1: Type: text/plain, Size: 323 bytes --]
Attached to this email is chomp.pl, a Perl script which removes trailing
whitespace from several files. I've had this for years, as trailing
whitespace is one of my pet peeves.
Now that git-applymbox complains loudly whenever a patch adds trailing
whitespace, I figured this script may be useful to others.
Jeff
[-- Attachment #2: chomp.pl --]
[-- Type: application/x-perl, Size: 1043 bytes --]
* Re: [SCRIPT] chomp: trim trailing whitespace
2006-05-27 2:27 [SCRIPT] chomp: trim trailing whitespace Jeff Garzik
@ 2006-05-27 4:17 ` H. Peter Anvin
2006-05-27 11:42 ` Jeff Garzik
2006-05-27 10:15 ` Jan Engelhardt
2006-05-27 15:28 ` Martin Langhoff
2 siblings, 1 reply; 14+ messages in thread
From: H. Peter Anvin @ 2006-05-27 4:17 UTC (permalink / raw)
To: Jeff Garzik, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 585 bytes --]
Jeff Garzik wrote:
>
> Attached to this email is chomp.pl, a Perl script which removes trailing
> whitespace from several files. I've had this for years, as trailing
> whitespace is one of my pet peeves.
>
> Now that git-applymbox complains loudly whenever a patch adds trailing
> whitespace, I figured this script may be useful to others.
>
This is the script I use for the same purpose. It's a bit more
sophisticated, in that it detects and avoids binary files, and doesn't
throw an error if it encounters a directory (which can happen if you
give it a wildcard.)
-hpa
[-- Attachment #2: cleanfile --]
[-- Type: text/plain, Size: 1126 bytes --]
#!/usr/bin/perl
#
# Clean a text file of stealth whitespace
#
use bytes;
$name = 'cleanfile';
foreach $f ( @ARGV ) {
    print STDERR "$name: $f\n";
    if (! -f $f) {
        print STDERR "$f: not a file\n";
        next;
    }
    if (!open(FILE, '+<', $f)) {
        print STDERR "$name: Cannot open file: $f: $!\n";
        next;
    }
    binmode FILE;
    # First, verify that it is not a binary file
    $is_binary = 0;
    while (read(FILE, $data, 65536) > 0) {
        if ($data =~ /\0/) {
            $is_binary = 1;
            last;
        }
    }
    if ($is_binary) {
        print STDERR "$name: $f: binary file\n";
        next;
    }
    seek(FILE, 0, 0);
    @blanks = ();
    @lines = ();
    while ( defined($line = <FILE>) ) {
        $line =~ s/[ \t\r\n]*$/\n/;
        if ( $line eq "\n" ) {
            push(@blanks, $line);
        } else {
            push(@lines, @blanks);
            push(@lines, $line);
            @blanks = ();
        }
    }
    # Any blanks at the end of the file are discarded
    seek(FILE, 0, 0);
    print FILE @lines;
    if ( !defined($where = tell(FILE)) ||
         !truncate(FILE, $where) ) {
        die "$name: Failed to truncate modified file: $f: $!\n";
    }
    close(FILE);
}
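The binary-file check in cleanfile is a simple heuristic: a file that contains a NUL byte is treated as binary. A minimal sketch of that check as a standalone routine follows; the name looks_binary and its return convention are assumptions made for illustration and are not part of the attached script.

```perl
use strict;
use warnings;

# Hypothetical helper (not from cleanfile): returns 1 if the file contains
# a NUL byte, 0 if it does not, and undef if the file cannot be opened.
sub looks_binary {
    my ($path) = @_;
    open(my $fh, '<', $path) or return;
    binmode $fh;
    my $data;
    # Scan in 64 KiB chunks, as the original script does.
    while (read($fh, $data, 65536)) {
        if ($data =~ /\0/) {
            close($fh);
            return 1;
        }
    }
    close($fh);
    return 0;
}
```

A caller could skip any path for which looks_binary returns a true value. Like the original, this is only a heuristic: UTF-16 text, for example, contains NUL bytes and would be skipped.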
* Re: [SCRIPT] chomp: trim trailing whitespace
2006-05-27 2:27 [SCRIPT] chomp: trim trailing whitespace Jeff Garzik
2006-05-27 4:17 ` H. Peter Anvin
@ 2006-05-27 10:15 ` Jan Engelhardt
2006-05-27 10:24 ` Thomas Glanzmann
2006-05-27 11:32 ` Jeff Garzik
2006-05-27 15:28 ` Martin Langhoff
2 siblings, 2 replies; 14+ messages in thread
From: Jan Engelhardt @ 2006-05-27 10:15 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Git Mailing List, Linux Kernel
> Attached to this email is chomp.pl, a Perl script which removes trailing
> whitespace from several files. I've had this for years, as trailing whitespace
> is one of my pet peeves.
>
> Now that git-applymbox complains loudly whenever a patch adds trailing
> whitespace, I figured this script may be useful to others.
>
Pretty long script. How about this two-liner? It does not show 'bytes
chomped' but it also trims trailing whitespace.
#!/usr/bin/perl -i -p
s/[ \t\r\n]+$//
Jan Engelhardt
--
* Re: [SCRIPT] chomp: trim trailing whitespace
2006-05-27 10:15 ` Jan Engelhardt
@ 2006-05-27 10:24 ` Thomas Glanzmann
2006-05-27 10:36 ` Neil Brown
2006-05-27 11:32 ` Jeff Garzik
1 sibling, 1 reply; 14+ messages in thread
From: Thomas Glanzmann @ 2006-05-27 10:24 UTC (permalink / raw)
To: Linux Kernel; +Cc: GIT, Jan Engelhardt
Hello,
> #!/usr/bin/perl -i -p
> s/[ \t\r\n]+$//
perl -p -i -e 's/\s+$//' file1 file2 file3 ...
Thomas
* Re: [SCRIPT] chomp: trim trailing whitespace
2006-05-27 10:24 ` Thomas Glanzmann
@ 2006-05-27 10:36 ` Neil Brown
0 siblings, 0 replies; 14+ messages in thread
From: Neil Brown @ 2006-05-27 10:36 UTC (permalink / raw)
To: Thomas Glanzmann; +Cc: Linux Kernel, GIT, Jan Engelhardt
On Saturday May 27, sithglan@stud.uni-erlangen.de wrote:
> Hello,
>
> > #!/usr/bin/perl -i -p
> > s/[ \t\r\n]+$//
>
> perl -p -i -e 's/\s+$//' file1 file2 file3 ...
>
Uhm... have either of you actually tried those? When I tried, I lose
all the '\n' characters :-(
perl -pi -e 's/[ \t\r]+$//' *.[ch]
seems to actually work.
NeilBrown
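The difference Neil describes comes from what the character class is allowed to consume. Under -p each line still carries its "\n", and both \s and an explicit \n in the class match it, so the newline is deleted along with the blanks. With [ \t\r] the match stops before the newline, because $ can match just before a final "\n". A small sketch, with hypothetical helper names:

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two substitutions from the thread.
# \s matches "\n", so this one also removes the line terminator.
sub strip_with_s_class { my ($l) = @_; $l =~ s/\s+$//;      return $l; }
# "\n" is not in the class, so the line terminator survives.
sub strip_keep_newline { my ($l) = @_; $l =~ s/[ \t\r]+$//; return $l; }
```

Feeding both helpers the same line, such as "int x;  \t\n", shows the first returning the text without any newline and the second returning it with exactly one.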
* Re: [SCRIPT] chomp: trim trailing whitespace
2006-05-27 10:15 ` Jan Engelhardt
2006-05-27 10:24 ` Thomas Glanzmann
@ 2006-05-27 11:32 ` Jeff Garzik
2006-05-27 11:48 ` Dmitry Fedorov
2006-05-27 12:42 ` Jan Engelhardt
1 sibling, 2 replies; 14+ messages in thread
From: Jeff Garzik @ 2006-05-27 11:32 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: Git Mailing List, Linux Kernel
Jan Engelhardt wrote:
>> Attached to this email is chomp.pl, a Perl script which removes trailing
>> whitespace from several files. I've had this for years, as trailing whitespace
>> is one of my pet peeves.
>>
>> Now that git-applymbox complains loudly whenever a patch adds trailing
>> whitespace, I figured this script may be useful to others.
>>
>
> Pretty long script. How about this two-liner? It does not show 'bytes
> chomped' but it also trims trailing whitespace.
>
> #!/usr/bin/perl -i -p
> s/[ \t\r\n]+$//
Yes, it does, but a bit too aggressive for what we need :)
Jeff
* Re: [SCRIPT] chomp: trim trailing whitespace
2006-05-27 4:17 ` H. Peter Anvin
@ 2006-05-27 11:42 ` Jeff Garzik
2006-05-28 9:24 ` H. Peter Anvin
0 siblings, 1 reply; 14+ messages in thread
From: Jeff Garzik @ 2006-05-27 11:42 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: linux-kernel
H. Peter Anvin wrote:
> Jeff Garzik wrote:
>>
>> Attached to this email is chomp.pl, a Perl script which removes
>> trailing whitespace from several files. I've had this for years, as
>> trailing whitespace is one of my pet peeves.
>>
>> Now that git-applymbox complains loudly whenever a patch adds trailing
>> whitespace, I figured this script may be useful to others.
>>
>
> This is the script I use for the same purpose. It's a bit more
> sophisticated, in that it detects and avoids binary files, and doesn't
> throw an error if it encounters a directory (which can happen if you
> give it a wildcard.)
Chewing the EOF blanks is nice. The only nit I have is that your script
rewrites the file even if nothing was changed.
Jeff
* Re: [SCRIPT] chomp: trim trailing whitespace
2006-05-27 11:32 ` Jeff Garzik
@ 2006-05-27 11:48 ` Dmitry Fedorov
2006-05-27 12:42 ` Jan Engelhardt
1 sibling, 0 replies; 14+ messages in thread
From: Dmitry Fedorov @ 2006-05-27 11:48 UTC (permalink / raw)
To: Git Mailing List, Linux Kernel
[-- Attachment #1: Type: text/plain, Size: 226 bytes --]
Jan Engelhardt wrote:
>> Attached to this email is chomp.pl, a Perl script which removes trailing
>> whitespace from several files. I've had this for years, as trailing whitespace
>> is one of my pet peeves.
And my scripts.
[-- Attachment #2: find-text-files --]
[-- Type: application/octet-stream, Size: 7833 bytes --]
#!/usr/bin/perl -w
=head1 NAME
find-text-files - traverse a file tree and guess plain text files
=head1 SYNOPSIS
find-text-files [options] dir ...
=head1 DESCRIPTION
This program traverses a file tree, guesses which files are plain text,
and prints their names to STDOUT.
=cut
require 5.004;
use strict;
use integer;
use File::Find;
use Getopt::Long;
use IPC::Open2;
sub usage {
    warn "\n".join(" ", @_)."\n" if @_;
    warn <<EOF;
Usage:
    find-text-files [-exclude='perlre' ...] [-include='perlre' ...] \
                    [-total] [-excluded] [-included] [-selectors] \
                    dir ...
EOF
    exit(1);
}
=head1 PARAMETERS
=over 4
=item dir ...
Directories list.
=back
=cut
=head1 OPTIONS
=over 4
=item -exclude='perlre' ...
Perl regular expression, case insensitive.
File names that match are excluded from the output list.
=item -include='perlre' ...
Perl regular expression, case insensitive.
File names that match are included in the output list.
=head2 Note
The directory part of the file name is stripped before matching, so
'^filename\.ext$' matches filename.ext exactly,
in any directory.
=item -total
Print statistics counters to STDERR.
=item -excluded
Print to STDERR which files are excluded and why.
=item -included
Print to STDERR which files are included and why.
=item -selectors
Print the exclude/include regular expressions and file suffixes, then exit.
=back
=head1 HOW IT WORKS
Each file name is checked in this order:
* against the exclude RE; a matching file is excluded (see the -exclude option);
if it does not match, then:
* against the include RE; a matching file is included (see the -include option);
if it does not match, then:
* against the binary suffixes table; a matching file is excluded;
if it does not match, then:
* against the text suffixes table; a matching file is included;
if it does not match, then:
* it is checked by file(1).
This avoids file(1)'s misdetection of some text files
and reduces the time spent on file(1) calls.
=head1 NOTES
Does not follow symlinks.
Zero-size files are skipped.
=cut
my $help_option      = 0;
my @include_options;
my @exclude_options;
my $total_option     = 0;
my $excluded_option  = 0;
my $included_option  = 0;
my $selectors_option = 0;
GetOptions(
    'help'      => \$help_option,
    'exclude=s' => \@exclude_options,
    'include=s' => \@include_options,
    'total'     => \$total_option,
    'excluded'  => \$excluded_option,
    'included'  => \$included_option,
    'selectors' => \$selectors_option,
) or usage;
usage if $help_option;
my %bin_suffices;
my %txt_suffices;
BEGIN
{
    map { $bin_suffices{$_} = undef }
    (
        'gif', 'tif', 'tiff', 'png', 'jpg', 'jpeg',
        'avi', 'mpg', 'mpeg',
        'o', 'obj', 'exe',
        'cab', 'a', 'rar', 'arj', 'zip', 'tar', 'cpio',
        'z', 'gz', 'bz', 'bz2', 'tgz', 'tbz', 'tbz2',
        'iso', 'bin', 'img', 'imag', 'image',
        'diff', 'patch' # diff/patch files could have EOL spaces!
    );
    map { $txt_suffices{$_} = undef }
    (
        'txt', 'text', 'html', 'htm', 'xml', 'php',
        'c', 'cpp', 'c++', 'cc', 'cxx',
        'h', 'hpp', 'h++', 'hh', 'hxx',
        'asm', 'inc', 'mod',
        'for', 'f77', 'g77',
        'java', 'jav',
        'bas', 'vb',
        'pl', 'pm', 'pod',
        'make', 'mak', 'mk',
        'awk', 'sh', 'bat', 'cmd', 'rexx', 'rex',
        'sql', 'def', 'man',
        'cvsignore'
    );
}
my $exclude_re = '(,v$)';
map { $exclude_re .= '|('.lc $_.')'; } @exclude_options;
my $include_re = '(^makefile$)';
map { $include_re .= '|('.lc $_.')'; } @include_options;
if ($selectors_option)
{
    my $bin_suffices = join(" ", sort keys %bin_suffices);
    my $txt_suffices = join(" ", sort keys %txt_suffices);
    print STDERR "\n";
    print STDERR "Exclude RE: ".$exclude_re."\n";
    print STDERR "\n";
    print STDERR "Include RE: ".$include_re."\n";
    print STDERR "\n";
    print STDERR "Exclude suffices: ".$bin_suffices."\n";
    print STDERR "\n";
    print STDERR "Include suffices: ".$txt_suffices."\n";
    print STDERR "\n";
    exit 0;
}
scalar(@ARGV) >= 1 or usage("no directory specified");
my (
    $total_files_checked,
    $total_files_empty,
    $total_files_excluded_by_re,
    $total_files_included_by_re,
    $total_files_excluded_by_suffix,
    $total_files_included_by_suffix,
    $total_files_excluded_by_file,
    $total_files_included_by_file
) = (0,0,0,0,0,0,0,0);
sub _by($$$$)
{
    my ($inex_option, $inex_str, $by, $name) = @_;
    printf(STDERR "%scluded by %13s: %s\n", $inex_str, $by, $name)
        if $inex_option;
}
sub inby($$) { _by($included_option, 'in', $_[0], $_[1]); }
sub exby($$) { _by($excluded_option, 'ex', $_[0], $_[1]); }
local *FILE_RH;
local *FILE_WH;
my $file_pid;
$SIG{PIPE} = sub
{
    close FILE_WH;
    waitpid $file_pid, 0;
    die "file(1) pipe broken"
};
$file_pid = open2(\*FILE_RH, \*FILE_WH, "file -n -f -" )
    or die "can't fork: $!";
#+ main work
$| = 1; # STDOUT autoflush
find(\&onfile, @ARGV);
#- main work
close FILE_WH;
waitpid $file_pid, 0;
format STDERR =
Total files: checked empty
------- -------
@>>>>>> @>>>>>>
$total_files_checked, $total_files_empty
suffix re file(1)
------- ------- -------
excluded by: @>>>>>> @>>>>>> @>>>>>>
$total_files_excluded_by_suffix, $total_files_excluded_by_re, $total_files_excluded_by_file
included by: @>>>>>> @>>>>>> @>>>>>>
$total_files_included_by_suffix, $total_files_included_by_re, $total_files_included_by_file
.
write STDERR if $total_option;
exit 0;
sub onfile()
{
    my $shortname = $_;
    my $fullname  = "$File::Find::name";
    return unless -f $shortname;
    $total_files_checked++;
    if ( ! -s $shortname )
    {
        $total_files_empty++;
        return;
    }
    my $lcshortname = lc $shortname;
    if ( $lcshortname =~ m/$exclude_re/o )
    {
        exby('RE', $fullname);
        $total_files_excluded_by_re++;
        return;
    }
    if ( $lcshortname =~ m/$include_re/o )
    {
        inby('RE', $fullname);
        $total_files_included_by_re++;
    }
    else # check by suffix
    {
        # List assignment leaves $suffix undefined when nothing matches;
        # the "my $x = ... if ..." form has unspecified behaviour in Perl.
        my ($suffix) = $lcshortname =~ m/\.([^\.]+)$/;
        if ( defined $suffix and length $suffix and
             exists $bin_suffices{$suffix} )
        {
            exby('binary suffix', $fullname);
            $total_files_excluded_by_suffix++;
            return;
        }
        if ( defined $suffix and length $suffix and
             exists $txt_suffices{$suffix} )
        {
            inby('text suffix', $fullname);
            $total_files_included_by_suffix++;
        }
        else # check by file(1)
        {
            print FILE_WH $fullname."\n"
                or die "bad write to file(1) pipe: $! $?";
            my $fread = <FILE_RH>;
            defined $fread or die "bad read from file(1) pipe: $! $?";
            chomp $fread;
            unless ( $fread =~ m|^(.+):\s+(.+)$| )
            {
                die "file(1) output does not match pattern:\n$fread\n";
            }
            my ($fname,$fdesc) = ($1,$2);
            die "can't parse file(1) output:\n$fread\n"
                if (! defined $fname) or (! defined $fdesc);
            die "file name after file(1) does not match the original one:\n".
                "\tbefore: $fullname\n".
                "\tafter : $fname\n"
                if $fname ne $fullname;
            # Matches the same descriptions as the original alternation
            # /^.* (text)|(source).*$/, whose anchors applied to one branch each.
            if ( $fdesc =~ m/ text|source/ )
            {
                inby('file(1)', $fullname);
                $total_files_included_by_file++;
            }
            else
            {
                exby('file(1)', $fullname);
                $total_files_excluded_by_file++;
                return;
            }
        }
    }
    print $fullname . "\n";
}
=head1 AUTHOR
Dmitry Fedorov <dm.fedorov@gmail.com>
=head1 COPYRIGHT
Copyright (C) 2003 Dmitry Fedorov <dm.fedorov@gmail.com>
=head1 LICENSE
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License,
or (at your option) any later version.
=head1 DISCLAIMER
The author disclaims any responsibility for any mangling of your system
etc, that this script may cause.
=cut
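The decision order described under HOW IT WORKS can be sketched as a pure function that stops before the file(1) step. This is an illustration only: the name classify, the return labels, and the shortened suffix tables are assumptions, not part of find-text-files.

```perl
use strict;
use warnings;

# Shortened, hypothetical tables; the real script carries longer lists.
my %bin_suffixes = map { $_ => 1 } qw(gif png jpg o exe zip gz diff patch);
my %txt_suffixes = map { $_ => 1 } qw(txt c h pl sh);
my $exclude_re   = qr/,v$/;          # RCS files, as in the script's default
my $include_re   = qr/^makefile$/;   # as in the script's default

# Returns 'exclude', 'include', or 'ask-file' (meaning: fall back to file(1)).
sub classify {
    my ($basename) = @_;
    my $name = lc $basename;
    return 'exclude' if $name =~ $exclude_re;
    return 'include' if $name =~ $include_re;
    my ($suffix) = $name =~ /\.([^.]+)$/;
    if (defined $suffix) {
        return 'exclude' if $bin_suffixes{$suffix};
        return 'include' if $txt_suffixes{$suffix};
    }
    return 'ask-file';
}
```

Keeping the cheap name-based checks ahead of the file(1) call is what lets the script avoid one pipe round trip per file for the common suffixes.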
[-- Attachment #3: truncate-eol-whitespace --]
[-- Type: application/octet-stream, Size: 4916 bytes --]
#!/usr/bin/perl -w
=head1 NAME
truncate-eol-whitespace - truncate white spaces at end of line.
=head1 SYNOPSIS
truncate-eol-whitespace [-total] [-truncated] [-nontruncated] [-dry-run] \
[file ...] [-f files-from]
=head1 DESCRIPTION
This program removes extra whitespace just before the end of each line
in the specified files. File names can be given as parameters
and/or read from a specified file, with '-' meaning STDIN.
=head1 EXAMPLE
Truncate all text files under DIR:
find-text-files DIR -total | truncate-eol-whitespace -total -f -
=cut
require 5.004;
use strict;
use integer;
use Getopt::Long;
sub usage {
    warn "\n".join(" ", @_)."\n" if @_;
    warn <<EOF;
Usage:
    truncate-eol-whitespace [-total] [-truncated] [-nontruncated] [-dry-run] \
                            [file ...] [-f files-from]
Warning: this script truncates files! Use -dry-run for test first.
EOF
    exit(1);
}
=head1 OPTIONS
=over 4
=item -total
Print statistics counters to STDERR.
=item -truncated
Print which files were truncated.
=item -nontruncated
Print which files were not truncated.
=item -dry-run
Do not write files, report only.
=item file ...
Files to truncate (optional).
=item -f files-from
Name of a file listing the files to truncate, one name per line.
Use '-' for STDIN.
=back
=cut
my $help_option         = 0;
my $dry_run_option      = 0;
my $files_from_option;
my $total_option        = 0;
my $truncated_option    = 0;
my $nontruncated_option = 0;
GetOptions(
    'help'         => \$help_option,
    'total'        => \$total_option,
    'truncated'    => \$truncated_option,
    'nontruncated' => \$nontruncated_option,
    'dry-run'      => \$dry_run_option,
    'f=s'          => \$files_from_option,
) or usage;
usage if $help_option;
usage("no files specified")
    if (! defined $files_from_option) and scalar(@ARGV) < 1;
# One initial value per counter
my (
    $total_files_checked,
    $total_files_empty,
    $total_files_truncated,
    $total_files_no_chars_truncated
) = (0,0,0,0);
my ( $total_chars_readed, $total_chars_truncated ) = (0,0);
sub truncate_file($)
{
    my $fname = shift;
    $total_files_checked++;
    if ( ! -f $fname )
    {
        print STDERR "is not a plain file: ".$fname."\n";
        return;
    }
    if ( ! -s $fname )
    {
        print STDERR "zero size file: ".$fname."\n";
        $total_files_empty++;
        return;
    }
    local $/ = undef; # no records, slurp mode
    local *IN;
    open IN, "< $fname"
        or die "Can't open $fname: $!";
    my $file = <IN>;
    defined $file or die "Can't read $fname: $!";
    close IN;
    my $length_before = length $file;
    $total_chars_readed += $length_before;
    # Remove runs of blanks and control characters (except "\n") before each newline
    $file =~ s/[\000-\011\013-\040]+\n/\n/mg;
    my $length_after = length $file;
    my $chars_truncated = $length_before - $length_after;
    die "size become greater after truncating: ".$fname
        if $chars_truncated < 0;
    if ( $chars_truncated > 0 )
    {
        $total_files_truncated++;
        $total_chars_truncated += $chars_truncated;
    }
    else
    {
        $total_files_no_chars_truncated++;
    }
    # The file name is passed as a %s argument so that a '%' in the name
    # is not interpreted as a format directive.
    if ( $chars_truncated >0 and $truncated_option )
    {
        printf(STDOUT "%6u of %6u chars truncated from %s\n",
               $chars_truncated, $length_before, $fname);
    }
    elsif ( $chars_truncated==0 and $nontruncated_option )
    {
        printf(STDOUT "%-16s chars truncated from %s\n", 'no', $fname);
    }
    if ( ! $dry_run_option and $chars_truncated > 0 )
    {
        local *OUT;
        open OUT, "> $fname" or die "Can't open $fname: $!";
        print OUT $file or die "Can't write $fname: $!";
        close OUT or die "Error on closing $fname: $!";
    }
}
#+ main work
# do process file names from the @ARGV first
truncate_file($_) while defined ($_ = shift);
if (defined $files_from_option) # do process file names from file|STDIN
{
    local *IN;
    open (IN, $files_from_option) or die "Can't open $files_from_option: $!";
    while ( my $fname = <IN> )
    {
        chomp $fname;
        next if length($fname) < 1; # skip empty lines
        truncate_file($fname);
    }
}
#- main work
format STDERR =
Total files: checked empty truncated non-truncated
------- ------- ------- -------
@>>>>>> @>>>>>> @>>>>>> @>>>>>>
$total_files_checked, $total_files_empty, $total_files_truncated, $total_files_no_chars_truncated
Total chars truncated: @>>>>>> of @<<<<<<<<<<<<<<<<<
$total_chars_truncated, $total_chars_readed
.
write STDERR if $total_option;
exit 0;
=head1 AUTHOR
Dmitry Fedorov <dm.fedorov@gmail.com>
=head1 COPYRIGHT
Copyright (C) 2003 Dmitry Fedorov <dm.fedorov@gmail.com>
=head1 LICENSE
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License,
or (at your option) any later version.
=head1 DISCLAIMER
The author disclaims any responsibility for any mangling of your system
etc, that this script may cause.
=cut
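The core of truncate-eol-whitespace is a single substitution applied to the whole file held in one string. The sketch below isolates that step so it can be tried on a string; the function name and the two-value return are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the script's substitution. The class covers
# octal 000-011 and 013-040, i.e. space, tab, CR and other control
# characters, but deliberately skips 012 ("\n").
sub strip_eol_whitespace {
    my ($text) = @_;
    (my $clean = $text) =~ s/[\000-\011\013-\040]+\n/\n/g;
    return ($clean, length($text) - length($clean));
}
```

Because the pattern requires a "\n" after the blanks, trailing whitespace on a final line that has no newline is left untouched, and empty lines are preserved rather than collapsed.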
* Re: [SCRIPT] chomp: trim trailing whitespace
2006-05-27 11:32 ` Jeff Garzik
2006-05-27 11:48 ` Dmitry Fedorov
@ 2006-05-27 12:42 ` Jan Engelhardt
2006-05-28 8:33 ` Keith Owens
1 sibling, 1 reply; 14+ messages in thread
From: Jan Engelhardt @ 2006-05-27 12:42 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Git Mailing List, Linux Kernel
>> Pretty long script. How about this two-liner? It does not show 'bytes
>> chomped' but it also trims trailing whitespace.
>>
>> #!/usr/bin/perl -i -p
>> s/[ \t\r\n]+$//
>
> Yes, it does, but a bit too aggressive for what we need :)
>
Whoops, should have been s/[ \t\r]+$//
And the CL form is
perl -i -pe '...'
Somehow, you can't group it to -ipe, but who cares.
Jan Engelhardt
--
* Re: [SCRIPT] chomp: trim trailing whitespace
2006-05-27 2:27 [SCRIPT] chomp: trim trailing whitespace Jeff Garzik
2006-05-27 4:17 ` H. Peter Anvin
2006-05-27 10:15 ` Jan Engelhardt
@ 2006-05-27 15:28 ` Martin Langhoff
2006-05-27 16:13 ` Linus Torvalds
2 siblings, 1 reply; 14+ messages in thread
From: Martin Langhoff @ 2006-05-27 15:28 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Git Mailing List, Linux Kernel
I love perl golf for this kind of stuff... but git-stripspace is part
of git already. Even then, I tend to do it with perl -pi -e ''
constructs ;-)
cheers,
m
* Re: [SCRIPT] chomp: trim trailing whitespace
2006-05-27 15:28 ` Martin Langhoff
@ 2006-05-27 16:13 ` Linus Torvalds
2006-05-28 10:00 ` Johannes Schindelin
0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2006-05-27 16:13 UTC (permalink / raw)
To: Martin Langhoff; +Cc: Jeff Garzik, Git Mailing List, Linux Kernel
On Sun, 28 May 2006, Martin Langhoff wrote:
>
> I love perl golf for this kind of stuff... but git-stripspace is part
> of git already. Even then, I tend to do it with perl -pi -e ''
> constructs ;-)
Well, git-stripspace actually does something slightly differently, in that
it also removes extraneous all-whitespace lines from the beginning, the
end, and the middle (in the middle, the rule is: two or more empty lines
are collapsed into one).
Ie it's a total hack for parsing just commit messages (and it is in C,
because I can personally write 25 lines of C in about a millionth of the
time I can write 3 lines of perl).
Linus
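The rule Linus describes (drop all-whitespace lines at the beginning and end, collapse runs of empty lines in the middle into one) can be sketched in a few lines of Perl. This follows only the description in the message; it is not a port of git's C code, and the function name is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sketch of the described rule, not git-stripspace itself.
sub stripspace_like {
    my ($text) = @_;
    my @in = split /\n/, $text;
    my @out;
    my $pending_blank = 0;
    for my $line (@in) {
        $line =~ s/\s+$//;                 # trim trailing whitespace
        if ($line eq '') {
            $pending_blank = 1 if @out;    # leading blank lines are ignored
            next;
        }
        push @out, '' if $pending_blank;   # emit at most one blank line
        $pending_blank = 0;
        push @out, $line;
    }
    # A blank still pending at the end is never emitted, so trailing
    # blank lines disappear.
    return join('', map { "$_\n" } @out);
}
```

This is noticeably more aggressive than trimming line ends, which is why it suits commit messages better than source files.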
* Re: [SCRIPT] chomp: trim trailing whitespace
2006-05-27 12:42 ` Jan Engelhardt
@ 2006-05-28 8:33 ` Keith Owens
0 siblings, 0 replies; 14+ messages in thread
From: Keith Owens @ 2006-05-28 8:33 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: Jeff Garzik, Git Mailing List, Linux Kernel
Jan Engelhardt (on Sat, 27 May 2006 14:42:02 +0200 (MEST)) wrote:
>And the CL form is
> perl -i -pe '...'
>Somehow, you can't group it to -ipe, but who cares.
-i takes an optional extension which is used to optionally create
backup files. As such, -i must be followed by space if you want no
extension (and no backup).
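Inside a script, the same switch is available as the special variable $^I: an empty string means edit in place without a backup, and any other string is used as the backup extension. A minimal sketch, assuming a hypothetical helper name and a platform where in-place editing without a backup is supported:

```perl
use strict;
use warnings;

# Hypothetical helper: strip trailing blanks from each file in place.
# $backup_ext plays the role of the optional extension after -i;
# '' means no backup file is kept.
sub strip_in_place {
    my ($backup_ext, @files) = @_;
    return unless @files;    # with an empty @ARGV, <> would read STDIN
    local ($^I, @ARGV) = ($backup_ext, @files);
    while (my $line = <>) {
        $line =~ s/[ \t\r]+$//;
        print $line;         # goes to the file being rewritten
    }
    return;
}
```

Calling strip_in_place('.bak', 'foo.c') corresponds to perl -pi.bak -e '...' foo.c, which also shows why -ipe does not behave as intended: the text "pe" would be taken as the backup extension.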
* Re: [SCRIPT] chomp: trim trailing whitespace
2006-05-27 11:42 ` Jeff Garzik
@ 2006-05-28 9:24 ` H. Peter Anvin
0 siblings, 0 replies; 14+ messages in thread
From: H. Peter Anvin @ 2006-05-28 9:24 UTC (permalink / raw)
To: Jeff Garzik; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 907 bytes --]
Jeff Garzik wrote:
> H. Peter Anvin wrote:
>> Jeff Garzik wrote:
>>>
>>> Attached to this email is chomp.pl, a Perl script which removes
>>> trailing whitespace from several files. I've had this for years, as
>>> trailing whitespace is one of my pet peeves.
>>>
>>> Now that git-applymbox complains loudly whenever a patch adds
>>> trailing whitespace, I figured this script may be useful to others.
>>>
>>
>> This is the script I use for the same purpose. It's a bit more
>> sophisticated, in that it detects and avoids binary files, and doesn't
>> throw an error if it encounters a directory (which can happen if you
>> give it a wildcard.)
>
> Chewing the EOF blanks is nice. The only nit I have is that your script
> rewrites the file even if nothing was changed.
>
Ah, good point. Attached version fixes that. It still doesn't break
hard links, which may be a desirable feature.
-hpa
[-- Attachment #2: cleanfile --]
[-- Type: text/plain, Size: 1418 bytes --]
#!/usr/bin/perl
#
# Clean a text file of stealth whitespace
#
use bytes;
$name = 'cleanfile';
foreach $f ( @ARGV ) {
    print STDERR "$name: $f\n";
    if (! -f $f) {
        print STDERR "$f: not a file\n";
        next;
    }
    if (!open(FILE, '+<', $f)) {
        print STDERR "$name: Cannot open file: $f: $!\n";
        next;
    }
    binmode FILE;
    # First, verify that it is not a binary file
    $is_binary = 0;
    while (read(FILE, $data, 65536) > 0) {
        if ($data =~ /\0/) {
            $is_binary = 1;
            last;
        }
    }
    if ($is_binary) {
        print STDERR "$name: $f: binary file\n";
        close(FILE);
        next;
    }
    seek(FILE, 0, 0);
    $in_bytes = 0;
    $out_bytes = 0;
    $blank_bytes = 0;
    $changed = 0;
    @blanks = ();
    @lines = ();
    while ( defined($line = <FILE>) ) {
        $in_bytes += length($line);
        $orig = $line;
        $line =~ s/[ \t\r\n]*$/\n/;
        # Byte totals alone can coincide, e.g. "a \nb" becomes "a\nb\n"
        $changed = 1 if $line ne $orig;
        if ( $line eq "\n" ) {
            push(@blanks, $line);
            $blank_bytes += length($line);
        } else {
            push(@lines, @blanks);
            $out_bytes += $blank_bytes;
            push(@lines, $line);
            $out_bytes += length($line);
            @blanks = ();
            $blank_bytes = 0;
        }
    }
    # Any blanks at the end of the file are discarded
    if ($changed || $in_bytes != $out_bytes) {
        # Only write to the file if changed
        seek(FILE, 0, 0);
        print FILE @lines;
        if ( !defined($where = tell(FILE)) ||
             !truncate(FILE, $where) ) {
            die "$name: Failed to truncate modified file: $f: $!\n";
        }
    }
    close(FILE);
}
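Comparing the cleaned text against the original directly is another way to decide whether a rewrite is needed. It also covers the corner case where blanks removed from one line are offset by a newline added to an unterminated last line, which leaves the byte totals equal. The sketch below applies the same per-line substitution and blank-line handling to a string; the function names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical string-level version of cleanfile's cleaning pass.
sub clean_text {
    my ($text) = @_;
    my @in = split /^/, $text;    # split into lines, keeping each "\n"
    my (@lines, @blanks);
    for my $line (@in) {
        $line =~ s/[ \t\r\n]*$/\n/;
        if ($line eq "\n") {
            push @blanks, $line;  # held back until a non-blank line follows
        } else {
            push @lines, @blanks, $line;
            @blanks = ();
        }
    }
    return join('', @lines);      # blanks still held at EOF are dropped
}

# True when cleaning would alter the content.
sub needs_rewrite {
    my ($text) = @_;
    return clean_text($text) ne $text;
}
```

The trade-off is memory: this holds two copies of the file contents, whereas byte counters need only a few integers.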
* Re: [SCRIPT] chomp: trim trailing whitespace
2006-05-27 16:13 ` Linus Torvalds
@ 2006-05-28 10:00 ` Johannes Schindelin
0 siblings, 0 replies; 14+ messages in thread
From: Johannes Schindelin @ 2006-05-28 10:00 UTC (permalink / raw)
To: Linus Torvalds
Cc: Martin Langhoff, Jeff Garzik, Git Mailing List, Linux Kernel
Hi,
On Sat, 27 May 2006, Linus Torvalds wrote:
> Well, git-stripspace actually does something slightly differently, in that
> it also removes extraneous all-whitespace lines from the beginning, the
> end, and the middle (in the middle, the rule is: two or more empty lines
> are collapsed into one).
>
> Ie it's a total hack for parsing just commit messages (and it is in C,
> because I can personally write 25 lines of C in about a millionth of the
> time I can write 3 lines of perl).
But there is no good reason not to add some code and a command line
switch, so that this tool with a very generic name actually performs what
a normal person would expect from that name.
Ciao,
Dscho