This is on top of my just-submitted [1] which in turn is on top of send-email work of mine sitting in "next". I was meaning to hold off on these patches for a bit, but given the concurrent on-list discussion about doing config discovery in send-email I wanted to send this now. This combines by not-picked-up[1] recent patches to remove the support for the "sendemail.smtpssl" variable with the later patches showing where that effort was really going. As noted in the subject this speeds up git-send-email invocations by ~2x or more, and brings the very slow t9001 test from running in ~26s on my box to ~12s. It's no longer consistently the slowest test I run. This is basically done in two ways: We lazily invoke "git config" to get config, before it's very eager, and deferring Perl compilation with s/use/require/g. 1. https://lore.kernel.org/git/patch-1.1-92571a8cf7-20210512T094803Z-avarab@gmail.com/ 2. https://lore.kernel.org/git/cover-0.2-00000000000-20210411T144128Z-avarab@gmail.com/ Ævar Arnfjörð Bjarmason (9): send-email: remove non-working support for "sendemail.smtpssl" send-email: refactor sendemail.smtpencryption config parsing send-email: lazily load config for a big speedup send-email: lazily shell out to "git var" send-email: use function syntax instead of barewords send-email: get rid of indirect object syntax send-email: lazily load modules for a big speedup perl: lazily load some common Git.pm setup code send-email: move trivial config handling to Perl Documentation/config/sendemail.txt | 3 - git-send-email.perl | 145 +++++++++++++++++------------ perl/Git.pm | 49 +++++----- 3 files changed, 111 insertions(+), 86 deletions(-) -- 2.31.1.909.g789bb6d90e
Remove the already dead code to support "sendemail.smtssl" by finally removing the dead code supporting the configuration option. In f6bebd121ac (git-send-email: add support for TLS via Net::SMTP::SSL, 2008-06-25) the --smtp-ssl command-line option was documented as deprecated, later in 65180c66186 (List send-email config options in config.txt., 2009-07-22) the "sendemail.smtpssl" configuration option was also documented as such. Then in in 3ff15040e22 (send-email: fix regression in sendemail.identity parsing, 2019-05-17) I unintentionally removed support for it by introducing a bug in read_config(). As can be seen from the diff context we've already returned unless $enc i defined, so it's not possible for us to reach the "elsif" branch here. This code was therefore already dead since Git v2.23.0. So let's just remove it. We were already 11 years into a stated deprecation period of this variable when 3ff15040e22 landed, now it's around 13. Since it hasn't worked anyway for around 2 years it looks like we can safely remove it. The --smtp-ssl option is still deprecated, if someone cares they can follow-up and remove that too, but unlike the config option that one could still be in use in the wild. I'm just removing this code that's provably unused already. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- Documentation/config/sendemail.txt | 3 --- git-send-email.perl | 6 +----- 2 files changed, 1 insertion(+), 8 deletions(-) diff --git a/Documentation/config/sendemail.txt b/Documentation/config/sendemail.txt index cbc5af42fd..50baa5d6bf 100644 --- a/Documentation/config/sendemail.txt +++ b/Documentation/config/sendemail.txt @@ -8,9 +8,6 @@ sendemail.smtpEncryption:: See linkgit:git-send-email[1] for description. Note that this setting is not subject to the 'identity' mechanism. -sendemail.smtpssl (deprecated):: - Deprecated alias for 'sendemail.smtpEncryption = ssl'. - sendemail.smtpsslcertpath:: Path to ca-certificates (either a directory or a single file). Set it to an empty string to disable certificate verification. diff --git a/git-send-email.perl b/git-send-email.perl index 175da07d94..3ef194fe2d 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -383,11 +383,7 @@ sub read_config { my $enc = Git::config(@repo, $setting); return unless defined $enc; return if $configured->{$setting}++; - if (defined $enc) { - $smtp_encryption = $enc; - } elsif (Git::config_bool(@repo, "$prefix.smtpssl")) { - $smtp_encryption = 'ssl'; - } + $smtp_encryption = $enc; } } -- 2.31.1.909.g789bb6d90e
With the removal of the support for sendemail.smtpssl in the preceding commit the parsing of sendemail.smtpencryption is no longer special, and can by moved to %config_settings. This gets us rid of an unconditional call to Git::config(), which as we'll see in subsequent commits matters for startup performance. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 3ef194fe2d..f9d859da3e 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -277,6 +277,7 @@ sub do_edit { ); my %config_settings = ( + "smtpencryption" => \$smtp_encryption, "smtpserver" => \$smtp_server, "smtpserverport" => \$smtp_server_port, "smtpserveroption" => \@smtp_server_options, @@ -377,14 +378,6 @@ sub read_config { $$target = $v; } } - - if (!defined $smtp_encryption) { - my $setting = "$prefix.smtpencryption"; - my $enc = Git::config(@repo, $setting); - return unless defined $enc; - return if $configured->{$setting}++; - $smtp_encryption = $enc; - } } # sendemail.identity yields to --identity. We must parse this -- 2.31.1.909.g789bb6d90e
Reduce the time it takes git-send-email to get to even the most trivial of tasks (such as serving up its "-h" output) by first listing config keys that exist, and only then only call e.g. "git config --bool" on them if they do. Over a lot of runs this speeds the time to "-h" up for me from ~250ms to ~150ms, and the runtime of t9001-send-email.sh goes from ~25s to ~20s. This introduces a race condition where we'll do the "wrong" thing if a config key were to be inserted between us discovering the list and calling read_config(), i.e. we won't know about the racily added key. In theory this is a change in behavior, in practice it doesn't matter. The config_regexp() function being changed here was added in dd84e528a34 (git-send-email: die if sendmail.* config is set, 2020-07-23) for use by git-send-email. So we can change its odd return value in the case where no values are found by "git config". The difference in the *.pm code would matter if it was invoked in scalar context, but now it no longer is. Arguably this caching belongs in Git.pm itself, but in lieu of modifying it for all its callers let's only do this for "git send-email". The other big potential win would be "git svn", but unlike "git send-email" it doesn't check tens of config variables one at a time at startup (in my brief testing it doesn't check any). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 35 ++++++++++++++++++++++++++--------- perl/Git.pm | 4 ++-- 2 files changed, 28 insertions(+), 11 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index f9d859da3e..5b0b7c33ec 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -337,11 +337,13 @@ sub signal_handler { # Read our sendemail.* config sub read_config { - my ($configured, $prefix) = @_; + my ($known_keys, $configured, $prefix) = @_; foreach my $setting (keys %config_bool_settings) { my $target = $config_bool_settings{$setting}; - my $v = Git::config_bool(@repo, "$prefix.$setting"); + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; + my $v = Git::config_bool(@repo, $key); next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -349,8 +351,10 @@ sub read_config { foreach my $setting (keys %config_path_settings) { my $target = $config_path_settings{$setting}; + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; if (ref($target) eq "ARRAY") { - my @values = Git::config_path(@repo, "$prefix.$setting"); + my @values = Git::config_path(@repo, $key); next unless @values; next if $configured->{$setting}++; @$target = @values; @@ -365,14 +369,16 @@ sub read_config { foreach my $setting (keys %config_settings) { my $target = $config_settings{$setting}; + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; if (ref($target) eq "ARRAY") { - my @values = Git::config(@repo, "$prefix.$setting"); + my @values = Git::config(@repo, $key); next unless @values; next if $configured->{$setting}++; @$target = @values; } else { - my $v = Git::config(@repo, "$prefix.$setting"); + my $v = Git::config(@repo, $key); next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -380,9 +386,20 @@ sub read_config { } } +# Save ourselves a lot of work of shelling out to 'git config' (it +# parses 'bool' etc.) by only doing so for config keys that exist. +my %known_config_keys; +{ + my @known_config_keys = Git::config_regexp("^sende?mail[.]"); + @known_config_keys{@known_config_keys} = (); +} + # sendemail.identity yields to --identity. We must parse this # special-case first before the rest of the config is read. -$identity = Git::config(@repo, "sendemail.identity"); +{ + my $key = "sendemail.identity"; + $identity = Git::config(@repo, $key) if exists $known_config_keys{$key}; +} my $rc = GetOptions( "identity=s" => \$identity, "no-identity" => \$no_identity, @@ -393,8 +410,8 @@ sub read_config { # Now we know enough to read the config { my %configured; - read_config(\%configured, "sendemail.$identity") if defined $identity; - read_config(\%configured, "sendemail"); + read_config(\%known_config_keys, \%configured, "sendemail.$identity") if defined $identity; + read_config(\%known_config_keys, \%configured, "sendemail"); } # Begin by accumulating all the variables (defined above), that we will end up @@ -478,7 +495,7 @@ sub read_config { usage(); } -if ($forbid_sendmail_variables && (scalar Git::config_regexp("^sendmail[.]")) != 0) { +if ($forbid_sendmail_variables && grep { /^sendmail/s } keys %known_config_keys) { die __("fatal: found configuration options for 'sendmail'\n" . "git-send-email is configured with the sendemail.* options - note the 'e'.\n" . "Set sendemail.forbidSendmailVariables to false to disable this check.\n"); diff --git a/perl/Git.pm b/perl/Git.pm index 73ebbf80cc..06a10b175a 100644 --- a/perl/Git.pm +++ b/perl/Git.pm @@ -754,8 +754,8 @@ sub config_regexp { } catch Git::Error::Command with { my $E = shift; if ($E->value() == 1) { - my @matches = (); - return @matches; + # Key(s) not found. + return; } else { throw $E; } -- 2.31.1.909.g789bb6d90e
Optimize git-send-email by only shelling out to "git var" if we need to. This is easily done by re-inventing our own small version of perl's Memoize module. I suppose I could just use Memoize itself, but in a subsequent patch I'll be micro-optimizing send-email's use of dependencies. Using Memoize is a measly extra 5-10 milliseconds, but as we'll see that'll end up mattering for us in the end. This brings the runtime of a plain "send-email" from around ~160-170ms to ~140m-150ms. The runtime of the tests is around the same, or around ~20s. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 5b0b7c33ec..9ff315f775 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -560,8 +560,18 @@ sub read_config { } my ($repoauthor, $repocommitter); -($repoauthor) = Git::ident_person(@repo, 'author'); -($repocommitter) = Git::ident_person(@repo, 'committer'); +{ + my %cache; + my ($author, $committer); + my $common = sub { + my ($what) = @_; + return $cache{$what} if exists $cache{$what}; + ($cache{$what}) = Git::ident_person(@repo, $what); + return $cache{$what}; + }; + $repoauthor = sub { $common->('author') }; + $repocommitter = sub { $common->('committer') }; +} sub parse_address_line { return map { $_->format } Mail::Address->parse($_[0]); @@ -749,7 +759,7 @@ sub get_patch_subject { or die sprintf(__("Failed to open for writing %s: %s"), $compose_filename, $!); - my $tpl_sender = $sender || $repoauthor || $repocommitter || ''; + my $tpl_sender = $sender || $repoauthor->() || $repocommitter->() || ''; my $tpl_subject = $initial_subject || ''; my $tpl_in_reply_to = $initial_in_reply_to || ''; my $tpl_reply_to = $reply_to || ''; @@ -955,7 +965,7 @@ sub file_declares_8bit_cte { $sender =~ s/^\s+|\s+$//g; ($sender) = expand_aliases($sender); } else { - $sender = $repoauthor || $repocommitter || ''; + $sender = $repoauthor->() || $repocommitter->() || ''; } # $sender could be an already sanitized address @@ -1104,7 +1114,7 @@ sub make_message_id { $uniq = "$message_id_stamp-$message_id_serial"; my $du_part; - for ($sender, $repocommitter, $repoauthor) { + for ($sender, $repocommitter->(), $repoauthor->()) { $du_part = extract_valid_address(sanitize_address($_)); last if (defined $du_part and $du_part ne ''); } -- 2.31.1.909.g789bb6d90e
Change calls like "__ 'foo'" to "__('foo')" so the Perl compiler doesn't have to guess that "__" is a function. This makes the code more readable. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 9ff315f775..da46925aa0 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -678,7 +678,7 @@ sub is_format_patch_arg { if (defined($format_patch)) { return $format_patch; } - die sprintf(__ <<EOF, $f, $f); + die sprintf(__(<<EOF), $f, $f); File '%s' exists but it could also be the range of commits to produce patches for. Please disambiguate by... @@ -764,7 +764,7 @@ sub get_patch_subject { my $tpl_in_reply_to = $initial_in_reply_to || ''; my $tpl_reply_to = $reply_to || ''; - print $c <<EOT1, Git::prefix_lines("GIT: ", __ <<EOT2), <<EOT3; + print $c <<EOT1, Git::prefix_lines("GIT: ", __(<<EOT2)), <<EOT3; From $tpl_sender # This line is ignored. EOT1 Lines beginning in "GIT:" will be removed. -- 2.31.1.909.g789bb6d90e
Change indirect object syntax such as "new X ARGS" to "X->new(ARGS)". This allows perl to see what "new" is at compile-time without having loaded Term::ReadLine. This doesn't matter now, but will in a subsequent commit when we start lazily loading it. Let's do the same for the adjacent "FakeTerm" package for consistency, even though we're not going to conditionally load it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index da46925aa0..f9c780ceed 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -194,11 +194,11 @@ sub format_2822_time { my @repo = $repo ? ($repo) : (); my $term = eval { $ENV{"GIT_SEND_EMAIL_NOTTY"} - ? new Term::ReadLine 'git-send-email', \*STDIN, \*STDOUT - : new Term::ReadLine 'git-send-email'; + ? Term::ReadLine->new('git-send-email', \*STDIN, \*STDOUT) + : Term::ReadLine->new('git-send-email'); }; if ($@) { - $term = new FakeTerm "$@: going non-interactive"; + $term = FakeTerm->new("$@: going non-interactive"); } # Behavior modification variables -- 2.31.1.909.g789bb6d90e
Optimize the time git-send-email takes to do even the simplest of things (such as serving up "-h") from around ~150ms to ~80ms-~90ms by lazily loading the modules it requires. Before this change Devel::TraceUse would report 99/97 used modules under NO_GETTEXT=[|Y], respectively. Now it's 52/37. It now takes ~15s to run t9001-send-email.sh, down from ~20s. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 70 +++++++++++++++++++++++++-------------------- 1 file changed, 39 insertions(+), 31 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index f9c780ceed..907dc1425f 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -19,20 +19,10 @@ use 5.008; use strict; use warnings; -use POSIX qw/strftime/; -use Term::ReadLine; use Getopt::Long; -use Text::ParseWords; -use Term::ANSIColor; -use File::Temp qw/ tempdir tempfile /; -use File::Spec::Functions qw(catdir catfile); use Git::LoadCPAN::Error qw(:try); -use Cwd qw(abs_path cwd); use Git; use Git::I18N; -use Net::Domain (); -use Net::SMTP (); -use Git::LoadCPAN::Mail::Address; Getopt::Long::Configure qw/ pass_through /; @@ -166,7 +156,6 @@ sub format_2822_time { ); } -my $have_email_valid = eval { require Email::Valid; 1 }; my $smtp; my $auth; my $num_sent = 0; @@ -192,14 +181,6 @@ sub format_2822_time { my $repo = eval { Git->repository() }; my @repo = $repo ? ($repo) : (); -my $term = eval { - $ENV{"GIT_SEND_EMAIL_NOTTY"} - ? Term::ReadLine->new('git-send-email', \*STDIN, \*STDOUT) - : Term::ReadLine->new('git-send-email'); -}; -if ($@) { - $term = FakeTerm->new("$@: going non-interactive"); -} # Behavior modification variables my ($quiet, $dry_run) = (0, 0); @@ -309,9 +290,10 @@ sub do_edit { # Handle Uncouth Termination sub signal_handler { + require Term::ANSIColor; # Make text normal - print color("reset"), "\n"; + print Term::ANSIColor::color("reset"), "\n"; # SMTP password masked system "stty echo"; @@ -574,11 +556,13 @@ sub read_config { } sub parse_address_line { + require Git::LoadCPAN::Mail::Address; return map { $_->format } Mail::Address->parse($_[0]); } sub split_addrs { - return quotewords('\s*,\s*', 1, @_); + require Text::ParseWords; + return Text::ParseWords::quotewords('\s*,\s*', 1, @_); } my %aliases; @@ -627,10 +611,11 @@ sub parse_sendmail_aliases { s/\\"/"/g foreach @addr; $aliases{$alias} = \@addr }}}, - mailrc => sub { my $fh = shift; while (<$fh>) { + mailrc => sub { my $fh = shift; while (<$fh>) { if (/^alias\s+(\S+)\s+(.*?)\s*$/) { + require Text::ParseWords; # spaces delimit multiple addresses - $aliases{$1} = [ quotewords('\s+', 0, $2) ]; + $aliases{$1} = [ Text::ParseWords::quotewords('\s+', 0, $2) ]; }}}, pine => sub { my $fh = shift; my $f='\t[^\t]*'; for (my $x = ''; defined($x); $x = $_) { @@ -702,7 +687,8 @@ sub is_format_patch_arg { opendir my $dh, $f or die sprintf(__("Failed to opendir %s: %s"), $f, $!); - push @files, grep { -f $_ } map { catfile($f, $_) } + require File::Spec; + push @files, grep { -f $_ } map { File::Spec->catfile($f, $_) } sort readdir $dh; closedir $dh; } elsif ((-f $f or -p $f) and !is_format_patch_arg($f)) { @@ -715,7 +701,8 @@ sub is_format_patch_arg { if (@rev_list_opts) { die __("Cannot run git format-patch from outside a repository\n") unless $repo; - push @files, $repo->command('format-patch', '-o', tempdir(CLEANUP => 1), @rev_list_opts); + require File::Temp; + push @files, $repo->command('format-patch', '-o', File::Temp::tempdir(CLEANUP => 1), @rev_list_opts); } @files = handle_backup_files(@files); @@ -752,9 +739,10 @@ sub get_patch_subject { if ($compose) { # Note that this does not need to be secure, but we will make a small # effort to have it be unique + require File::Temp; $compose_filename = ($repo ? - tempfile(".gitsendemail.msg.XXXXXX", DIR => $repo->repo_path()) : - tempfile(".gitsendemail.msg.XXXXXX", DIR => "."))[1]; + File::Temp::tempfile(".gitsendemail.msg.XXXXXX", DIR => $repo->repo_path()) : + File::Temp::tempfile(".gitsendemail.msg.XXXXXX", DIR => "."))[1]; open my $c, ">", $compose_filename or die sprintf(__("Failed to open for writing %s: %s"), $compose_filename, $!); @@ -861,6 +849,19 @@ sub get_patch_subject { do_edit(@files); } +sub term { + my $term = eval { + require Term::ReadLine; + $ENV{"GIT_SEND_EMAIL_NOTTY"} + ? Term::ReadLine->new('git-send-email', \*STDIN, \*STDOUT) + : Term::ReadLine->new('git-send-email'); + }; + if ($@) { + $term = FakeTerm->new("$@: going non-interactive"); + } + return $term; +} + sub ask { my ($prompt, %arg) = @_; my $valid_re = $arg{valid_re}; @@ -868,6 +869,7 @@ sub ask { my $confirm_only = $arg{confirm_only}; my $resp; my $i = 0; + my $term = term(); return defined $default ? $default : undef unless defined $term->IN and defined fileno($term->IN) and defined $term->OUT and defined fileno($term->OUT); @@ -1048,6 +1050,7 @@ sub extract_valid_address { return $address if ($address =~ /^($local_part_regexp)$/); $address =~ s/^\s*<(.*)>\s*$/$1/; + my $have_email_valid = eval { require Email::Valid; 1 }; if ($have_email_valid) { return scalar Email::Valid->address($address); } @@ -1107,7 +1110,8 @@ sub validate_address_list { sub make_message_id { my $uniq; if (!defined $message_id_stamp) { - $message_id_stamp = strftime("%Y%m%d%H%M%S.$$", gmtime(time)); + require POSIX; + $message_id_stamp = POSIX::strftime("%Y%m%d%H%M%S.$$", gmtime(time)); $message_id_serial = 0; } $message_id_serial++; @@ -1277,6 +1281,7 @@ sub valid_fqdn { sub maildomain_net { my $maildomain; + require Net::Domain; my $domain = Net::Domain::domainname(); $maildomain = $domain if valid_fqdn($domain); @@ -1287,6 +1292,7 @@ sub maildomain_mta { my $maildomain; for my $host (qw(mailhost localhost)) { + require Net::SMTP; my $smtp = Net::SMTP->new($host); if (defined $smtp) { my $domain = $smtp->domain; @@ -1965,13 +1971,15 @@ sub validate_patch { my ($fn, $xfer_encoding) = @_; if ($repo) { - my $validate_hook = catfile($repo->hooks_path(), + require File::Spec; + my $validate_hook = File::Spec->catfile($repo->hooks_path(), 'sendemail-validate'); my $hook_error; if (-x $validate_hook) { - my $target = abs_path($fn); + require Cwd; + my $target = Cwd::abs_path($fn); # The hook needs a correct cwd and GIT_DIR. - my $cwd_save = cwd(); + my $cwd_save = Cwd::cwd(); chdir($repo->wc_path() or $repo->repo_path()) or die("chdir: $!"); local $ENV{"GIT_DIR"} = $repo->repo_path(); -- 2.31.1.909.g789bb6d90e
Instead of unconditionally requiring modules such as File::Spec, let's only load them when needed. This speeds up code that only needs a subset of the features Git.pm provides. This brings a plain invocation of "git send-email" down from 52/37 loaded modules under NO_GETTEXT=[|Y] to 39/18, and it now takes ~60-~70ms instead of ~80-~90ms. The runtime of t9001-send-email.sh test is down to ~13s from ~15s. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- perl/Git.pm | 35 ++++++++++++++++++----------------- 1 file changed, 18 insertions(+), 17 deletions(-) diff --git a/perl/Git.pm b/perl/Git.pm index 06a10b175a..f18852fb09 100644 --- a/perl/Git.pm +++ b/perl/Git.pm @@ -11,9 +11,6 @@ package Git; use strict; use warnings $ENV{GIT_PERL_FATAL_WARNINGS} ? qw(FATAL all) : (); -use File::Temp (); -use File::Spec (); - BEGIN { our ($VERSION, @ISA, @EXPORT, @EXPORT_OK); @@ -103,12 +100,9 @@ =head1 DESCRIPTION =cut -use Carp qw(carp croak); # but croak is bad - throw instead +sub carp { require Carp; goto &Carp::carp } +sub croak { require Carp; goto &Carp::croak } use Git::LoadCPAN::Error qw(:try); -use Cwd qw(abs_path cwd); -use IPC::Open2 qw(open2); -use Fcntl qw(SEEK_SET SEEK_CUR); -use Time::Local qw(timegm); } @@ -191,13 +185,15 @@ sub repository { $dir = undef; }; + require Cwd; if ($dir) { + require File::Spec; File::Spec->file_name_is_absolute($dir) or $dir = $opts{Directory} . '/' . $dir; - $opts{Repository} = abs_path($dir); + $opts{Repository} = Cwd::abs_path($dir); # If --git-dir went ok, this shouldn't die either. my $prefix = $search->command_oneline('rev-parse', '--show-prefix'); - $dir = abs_path($opts{Directory}) . '/'; + $dir = Cwd::abs_path($opts{Directory}) . '/'; if ($prefix) { if (substr($dir, -length($prefix)) ne $prefix) { throw Error::Simple("rev-parse confused me - $dir does not have trailing $prefix"); @@ -223,7 +219,7 @@ sub repository { throw Error::Simple("fatal: Not a git repository: $dir"); } - $opts{Repository} = abs_path($dir); + $opts{Repository} = Cwd::abs_path($dir); } delete $opts{Directory}; @@ -408,10 +404,12 @@ sub command_bidi_pipe { my $cwd_save = undef; if ($self) { shift; - $cwd_save = cwd(); + require Cwd; + $cwd_save = Cwd::cwd(); _setup_git_cmd_env($self); } - $pid = open2($in, $out, 'git', @_); + require IPC::Open2; + $pid = IPC::Open2::open2($in, $out, 'git', @_); chdir($cwd_save) if $cwd_save; return ($pid, $in, $out, join(' ', @_)); } @@ -538,7 +536,8 @@ sub get_tz_offset { my $t = shift || time; my @t = localtime($t); $t[5] += 1900; - my $gm = timegm(@t); + require Time::Local; + my $gm = Time::Local::timegm(@t); my $sign = qw( + + - )[ $gm <=> $t ]; return sprintf("%s%02d%02d", $sign, (gmtime(abs($t - $gm)))[2,1]); } @@ -629,7 +628,8 @@ sub hooks_path { my ($self) = @_; my $dir = $self->command_oneline('rev-parse', '--git-path', 'hooks'); - my $abs = abs_path($dir); + require Cwd; + my $abs = Cwd::abs_path($dir); return $abs; } @@ -1353,6 +1353,7 @@ sub _temp_cache { my $n = $name; $n =~ s/\W/_/g; # no strange chars + require File::Temp; ($$temp_fd, $fname) = File::Temp::tempfile( "Git_${n}_XXXXXX", UNLINK => 1, DIR => $tmpdir, ) or throw Error::Simple("couldn't open new temp file"); @@ -1375,9 +1376,9 @@ sub temp_reset { truncate $temp_fd, 0 or throw Error::Simple("couldn't truncate file"); - sysseek($temp_fd, 0, SEEK_SET) and seek($temp_fd, 0, SEEK_SET) + sysseek($temp_fd, 0, Fcntl::SEEK_SET()) and seek($temp_fd, 0, Fcntl::SEEK_SET()) or throw Error::Simple("couldn't seek to beginning of file"); - sysseek($temp_fd, 0, SEEK_CUR) == 0 and tell($temp_fd) == 0 + sysseek($temp_fd, 0, Fcntl::SEEK_CUR()) == 0 and tell($temp_fd) == 0 or throw Error::Simple("expected file position to be reset"); } -- 2.31.1.909.g789bb6d90e
Optimize the startup time of git-send-email by using an amended config_regexp() function to retrieve the list of config keys and values we're interested in. For boolean keys we can handle the [true|false] case ourselves, and the "--get" case didn't need any parsing. Let's leave "--path" and other "--bool" cases to "git config". As noted in a preceding commit we're free to change the config_regexp() function, it's only used by "git send-email". This brings the runtime of "git send-email" from ~60-~70ms to a very steady ~40ms on my test box. We no run just one "git config" invocation on startup instead of 8, the exact number will differ based on the local sendemail.* config. I happen to have 8 of those set. This brings the runtime of t9001-send-email.sh from ~13s down to ~12s for me. The change there is less impressive as many of those tests set various config values, and we're also getting to the point of diminishing returns for optimizing "git send-email" itself. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 17 ++++++++++------- perl/Git.pm | 10 +++++----- 2 files changed, 15 insertions(+), 12 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 907dc1425f..35ba19470d 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -325,7 +325,10 @@ sub read_config { my $target = $config_bool_settings{$setting}; my $key = "$prefix.$setting"; next unless exists $known_keys->{$key}; - my $v = Git::config_bool(@repo, $key); + my $v = (@{$known_keys->{$key}} == 1 && + $known_keys->{$key}->[0] =~ /^(?:true|false)$/s) + ? $known_keys->{$key}->[0] eq 'true' + : Git::config_bool(@repo, $key); next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -354,14 +357,12 @@ sub read_config { my $key = "$prefix.$setting"; next unless exists $known_keys->{$key}; if (ref($target) eq "ARRAY") { - my @values = Git::config(@repo, $key); - next unless @values; + my @values = @{$known_keys->{$key}}; next if $configured->{$setting}++; @$target = @values; } else { - my $v = Git::config(@repo, $key); - next unless defined $v; + my $v = $known_keys->{$key}->[0]; next if $configured->{$setting}++; $$target = $v; } @@ -372,8 +373,10 @@ sub read_config { # parses 'bool' etc.) by only doing so for config keys that exist. my %known_config_keys; { - my @known_config_keys = Git::config_regexp("^sende?mail[.]"); - @known_config_keys{@known_config_keys} = (); + my @kv = Git::config_regexp("^sende?mail[.]"); + while (my ($k, $v) = splice @kv, 0, 2) { + push @{$known_config_keys{$k}} => $v; + } } # sendemail.identity yields to --identity. We must parse this diff --git a/perl/Git.pm b/perl/Git.pm index f18852fb09..a9020d0d01 100644 --- a/perl/Git.pm +++ b/perl/Git.pm @@ -739,18 +739,18 @@ sub config_int { =item config_regexp ( RE ) Retrieve the list of configuration key names matching the regular -expression C<RE>. The return value is a list of strings matching -this regex. +expression C<RE>. The return value is an ARRAY of key-value pairs. =cut sub config_regexp { my ($self, $regex) = _maybe_self(@_); try { - my @cmd = ('config', '--name-only', '--get-regexp', $regex); + my @cmd = ('config', '--null', '--get-regexp', $regex); unshift @cmd, $self if $self; - my @matches = command(@cmd); - return @matches; + my $data = command(@cmd); + my (@kv) = map { split /\n/, $_, 2 } split /\0/, $data; + return @kv; } catch Git::Error::Command with { my $E = shift; if ($E->value() == 1) { -- 2.31.1.909.g789bb6d90e
On 2021-05-12 15:48:17+0200, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> Remove the already dead code to support "sendemail.smtssl" by finally
s/smtssl/smtpssl/
--
Danh
Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > As noted in the subject this speeds up git-send-email invocations by > ~2x or more, and brings the very slow t9001 test from running in ~26s > on my box to ~12s. It's no longer consistently the slowest test I run. > > This is basically done in two ways: We lazily invoke "git config" to > get config, before it's very eager, and deferring Perl compilation > with s/use/require/g. Nice. I've been doing similar things elsewhere and hoping to find time to get around to git-svn at some point. > Ævar Arnfjörð Bjarmason (9): > send-email: remove non-working support for "sendemail.smtpssl" > send-email: refactor sendemail.smtpencryption config parsing > send-email: lazily load config for a big speedup > send-email: lazily shell out to "git var" > send-email: use function syntax instead of barewords > send-email: get rid of indirect object syntax > send-email: lazily load modules for a big speedup > perl: lazily load some common Git.pm setup code > send-email: move trivial config handling to Perl I spotted some further optimizations for 7 and 8, but otherwise consider this series: Reviewed-by: Eric Wong <e@80x24.org>
Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> @@ -408,10 +404,12 @@ sub command_bidi_pipe {
> my $cwd_save = undef;
> if ($self) {
> shift;
> - $cwd_save = cwd();
> + require Cwd;
> + $cwd_save = Cwd::cwd();
> _setup_git_cmd_env($self);
(This also applies to 7/9)
Cwd::cwd() execs /bin/pwd (at least on Perl 5.28.1 in Debian 10x).
There should be a benefit from using Cwd::getcwd() here, instead.
strace -f -e execve perl -MCwd -E 'Cwd::cwd()
...confirms the execve is happening.
getcwd() takes 25ms vs 27ms of cwd() for me using the
"schedutil" CPU governor under Linux:
time perl -MCwd -E 'Cwd::getcwd()'
time perl -MCwd -E 'Cwd::cwd()'
On Wed, May 12, 2021 at 9:48 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > Optimize the startup time of git-send-email by using an amended > config_regexp() function to retrieve the list of config keys and > values we're interested in. > > For boolean keys we can handle the [true|false] case ourselves, and > the "--get" case didn't need any parsing. Let's leave "--path" and > other "--bool" cases to "git config". As noted in a preceding commit > we're free to change the config_regexp() function, it's only used by > "git send-email". > > This brings the runtime of "git send-email" from ~60-~70ms to a very > steady ~40ms on my test box. We no run just one "git config" s/no/now/ > invocation on startup instead of 8, the exact number will differ based > on the local sendemail.* config. I happen to have 8 of those set. > > This brings the runtime of t9001-send-email.sh from ~13s down to ~12s > for me. The change there is less impressive as many of those tests set > various config values, and we're also getting to the point of > diminishing returns for optimizing "git send-email" itself. > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
On Wed, May 12, 2021 at 03:48:25PM +0200, Ævar Arnfjörð Bjarmason wrote: > Optimize the startup time of git-send-email by using an amended > config_regexp() function to retrieve the list of config keys and > values we're interested in. > > For boolean keys we can handle the [true|false] case ourselves, and > the "--get" case didn't need any parsing. Let's leave "--path" and > other "--bool" cases to "git config". As noted in a preceding commit > we're free to change the config_regexp() function, it's only used by > "git send-email". I think both of these cases should be safe. It is a bit unfortunate to have to go through these contortions, but this is definitely the best we can do for now. I think in the long run it would be nice to have a "--stdin" mode for git-config, where we could do something like: git config --stdin <<\EOF key=foo.bar type=bool default=false key=another.key type=color default=red EOF But that doesn't exist yet, and using it would probably involve rearranging send-email a bit (we would need an up-front list of all of the keys we care about and their types). So I'm perfectly content with this strategy in the meantime. > diff --git a/perl/Git.pm b/perl/Git.pm > index f18852fb09..a9020d0d01 100644 > --- a/perl/Git.pm > +++ b/perl/Git.pm > @@ -739,18 +739,18 @@ sub config_int { > =item config_regexp ( RE ) > > Retrieve the list of configuration key names matching the regular > -expression C<RE>. The return value is a list of strings matching > -this regex. > +expression C<RE>. The return value is an ARRAY of key-value pairs. The Git.pm interface is public-facing, isn't it? Are we breaking third-party callers of config_regexp here? -Peff
On Wed, May 12, 2021 at 03:48:21PM +0200, Ævar Arnfjörð Bjarmason wrote: > Change calls like "__ 'foo'" to "__('foo')" so the Perl compiler > doesn't have to guess that "__" is a function. This makes the code > more readable. I think this is a good change. And while you do qualify it as "more" readable, I think the main issue with readability here: > diff --git a/git-send-email.perl b/git-send-email.perl > index 9ff315f775..da46925aa0 100755 > --- a/git-send-email.perl > +++ b/git-send-email.perl > @@ -678,7 +678,7 @@ sub is_format_patch_arg { > if (defined($format_patch)) { > return $format_patch; > } > - die sprintf(__ <<EOF, $f, $f); > + die sprintf(__(<<EOF), $f, $f); > File '%s' exists but it could also be the range of commits > to produce patches for. Please disambiguate by... is how far the "EOF" marker is from the actual here-doc, syntactically (plus the ugly indentation). I suspect this and other places might be nicer to assign from the here-doc into a well-named variable. But I think it's fine to leave for now (and I worry a bit that it may turn into a rabbit hole, so we might be better to just leave it alone anyway). -Peff
On Wed, May 12, 2021 at 03:48:16PM +0200, Ævar Arnfjörð Bjarmason wrote:
> This combines by not-picked-up[1] recent patches to remove the support
> for the "sendemail.smtpssl" variable with the later patches showing
> where that effort was really going.
>
> As noted in the subject this speeds up git-send-email invocations by
> ~2x or more, and brings the very slow t9001 test from running in ~26s
> on my box to ~12s. It's no longer consistently the slowest test I run.
Nice. I have observed that with a decent number of cores, the running
time of the entire test suite correlates strongly with the running time
of t9001. :)
Here are timings for individual tests run with "prove --state=slow,save".
(This is on an 8-core machine using -j32, skipping cvs/svn/p4 tests,
and using a tmpfs via --root). The timings were computed with:
perl -MYAML -e '
$_ = do { local $/; <> };
# prove puts this non-YAML cruft at the end
s/\.\.\.$//s;
my $t = YAML::Load($_)->{tests};
print "$_->[1] $_->[0]\n" for
sort { $b->[1] <=> $a->[1] }
map { [$_, $t->{$_}->{elapsed}] }
keys(%$t);
' t/.prove | head
Before your patches, the whole sweet takes ~60-63s, and the top timings
(from a 63s run) are:
63.2607979774475 t9001-send-email.sh
51.742644071579 t0027-auto-crlf.sh
37.7909920215607 t3070-wildmatch.sh
27.09605717659 t7610-mergetool.sh
24.7028169631958 t7112-reset-submodule.sh
24.5535898208618 t5572-pull-submodule.sh
23.8404550552368 t9500-gitweb-standalone-no-errors.sh
22.3544380664825 t7400-submodule-basic.sh
21.7017750740051 t5510-fetch.sh
21.4575610160828 t3305-notes-fanout.sh
Now after, which takes ~54-59s (this is from a 54s run):
46.796669960022 t0027-auto-crlf.sh
32.5747599601746 t3070-wildmatch.sh
21.5069420337677 t7610-mergetool.sh
20.8392388820648 t1701-racy-split-index.sh
19.7403028011322 t5572-pull-submodule.sh
19.7386808395386 t9001-send-email.sh
19.4622302055359 t7112-reset-submodule.sh
18.9555768966675 t9500-gitweb-standalone-no-errors.sh
18.0672709941864 t7400-submodule-basic.sh
17.641391992569 t5510-fetch.sh
I have some messy patches to split t9001 into two segments. They were
waiting to get polished, but perhaps I can just discard them now. :)
Some side notes for those interested in timing the test suite:
- If I run t9001 standalone, it goes much faster, of course; the CPU
throttles down when we're running all the tests in parallel.
- Those are with "-x --verbose-log", which is nice for catching flaky
results. Dropping those seems to shave a few seconds off.
- A big chunk of time for t0027 and t3070 is spent running the sed-based
chain-linting for their huge tables of auto-generated tests (1400+
and 1800+ respectively). Dropping the sed linting for just those
tests knocks off about 30 CPU-seconds.
-Peff
On Wed, May 12, 2021 at 03:48:16PM +0200, Ævar Arnfjörð Bjarmason wrote:
> As noted in the subject this speeds up git-send-email invocations by
> ~2x or more, and brings the very slow t9001 test from running in ~26s
> on my box to ~12s. It's no longer consistently the slowest test I run.
>
> This is basically done in two ways: We lazily invoke "git config" to
> get config, before it's very eager, and deferring Perl compilation
> with s/use/require/g.
Splitting my reply, since the other one got deep into test-suite timing
details.
The techniques here look overall pretty reasonable. I think the module
lazy-loading makes the overall code a _little_ uglier, but IMHO the
speedup you're getting is worth it (I was surprised how much of the
improvement comes from that versus avoiding git-config subprocesses).
My only concern is changing the interface of Git::config_regexp() in the
final patch. Do we need to have a config_regexp_with_values() to avoid
breaking third-party users of the module?
-Peff
Jeff King wrote:
> It is a bit unfortunate to have to go through these contortions, but
> this is definitely the best we can do for now. I think in the long run
> it would be nice to have a "--stdin" mode for git-config, where we could
> do something like:
>
> git config --stdin <<\EOF
> key=foo.bar
> type=bool
> default=false
>
> key=another.key
> type=color
> default=red
> EOF
Why do we even have to specify the type? Shouldn't there be a registry
of configurations (a schema), so that all users don't have to do this?
--
Felipe Contreras
On Thu, May 13 2021, Felipe Contreras wrote:
> Jeff King wrote:
>> It is a bit unfortunate to have to go through these contortions, but
>> this is definitely the best we can do for now. I think in the long run
>> it would be nice to have a "--stdin" mode for git-config, where we could
>> do something like:
>>
>> git config --stdin <<\EOF
>> key=foo.bar
>> type=bool
>> default=false
>>
>> key=another.key
>> type=color
>> default=red
>> EOF
>
> Why do we even have to specify the type? Shouldn't there be a registry
> of configurations (a schema), so that all users don't have to do this?
Yes, we should be moving towards that. Currently we're not there, the
closest we've got is the generated config-list.h.
If we closed that loop properly you should be able to do:
git config --get --type=dwim clean.requireForce # or auto, or no --type
Or whatever, instead of:
git config --get --type=bool clean.requireForce
But we're not there yet, there's also edge cases here and there that
make a plain exhaustive registry hard, and send-email is one of them.
In its case the value of sendemail.identity=something (if any)
determines if/how e.g. sendemail.something.smtpEncryption is
interpreted.
I think we can do it with just a list of config variables where the
variable is either a string or a regex (in this case,
"^sendemail(?:[^.]+\.)?\.smtpEncryption$"), but I haven't tried. There
may be other tricky special-cases.
On Thu, May 13, 2021 at 02:04:08AM -0500, Felipe Contreras wrote:
> Jeff King wrote:
> > It is a bit unfortunate to have to go through these contortions, but
> > this is definitely the best we can do for now. I think in the long run
> > it would be nice to have a "--stdin" mode for git-config, where we could
> > do something like:
> >
> > git config --stdin <<\EOF
> > key=foo.bar
> > type=bool
> > default=false
> >
> > key=another.key
> > type=color
> > default=red
> > EOF
>
> Why do we even have to specify the type? Shouldn't there be a registry
> of configurations (a schema), so that all users don't have to do this?
One of the purposes of git-config is to serve third-party scripts that
store their own config keys that Git does not know about. So we can't
know the set of all possible types that will be asked about.
Obviously we could have git-config know internally about all of the keys
other parts of Git would ask about. But generally we have pushed that
knowledge out to the users of the keys, rather than any kind of central
registry.
-Peff
On Wed, May 12 2021, Jeff King wrote: > On Wed, May 12, 2021 at 03:48:16PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> As noted in the subject this speeds up git-send-email invocations by >> ~2x or more, and brings the very slow t9001 test from running in ~26s >> on my box to ~12s. It's no longer consistently the slowest test I run. >> >> This is basically done in two ways: We lazily invoke "git config" to >> get config, before it's very eager, and deferring Perl compilation >> with s/use/require/g. > > Splitting my reply, since the other one got deep into test-suite timing > details. > > The techniques here look overall pretty reasonable. I think the module > lazy-loading makes the overall code a _little_ uglier, but IMHO the > speedup you're getting is worth it (I was surprised how much of the > improvement comes from that versus avoiding git-config subprocesses). Yeah, it's mostly uglier, but I think it's worth it. Some parts are better afterwards though, i.e. the SOME_RARE_BARE_WORD is now Module::It::Is::In::SOME_RARE_BARE_WORD, which makes it easier to understand where it's from. > My only concern is changing the interface of Git::config_regexp() in the > final patch. Do we need to have a config_regexp_with_values() to avoid > breaking third-party users of the module? As noted in 3/9 I don't think we need to worry about it, it's recently introduced (a few months) API in Git.pm for send-email itself. I think we can just change it. In general I think it's unfortunate that we have (at least in principle) a "public by default" module like Git.pm that's mostly for our own use. This series doesn't try to deal with that in general at all, I'm somewhat of the opinion that we should just fork it at this point. I.e. have a Git.pm we freeze in time, and a Git/Ours.pm that's going to be the private API. I stopped with these optimizations at the point of refactoring away Error.pm, which is a large contributor to compilation time, but as long as it's a public API that can't be done without changing the public API. If all we needed to worry about was send-email, git-svn etc. just changing it to Perl-native exceptions would be trivial.
On Thu, May 13, 2021 at 09:37:36AM +0200, Ævar Arnfjörð Bjarmason wrote: > > My only concern is changing the interface of Git::config_regexp() in the > > final patch. Do we need to have a config_regexp_with_values() to avoid > > breaking third-party users of the module? > > As noted in 3/9 I don't think we need to worry about it, it's recently > introduced (a few months) API in Git.pm for send-email itself. I think > we can just change it. Ah, thanks for pointing that out. I _thought_ I had seen you mention it earlier, but when I went back to look I couldn't find it. I'm not entirely convinced, though. I agree it's probably not heavily used, but the existing interface was shipped in three releases already (v2.29 and up). > In general I think it's unfortunate that we have (at least in principle) > a "public by default" module like Git.pm that's mostly for our own use. I'd certainly agree with that sentiment. :) > This series doesn't try to deal with that in general at all, I'm > somewhat of the opinion that we should just fork it at this > point. I.e. have a Git.pm we freeze in time, and a Git/Ours.pm that's > going to be the private API. > > I stopped with these optimizations at the point of refactoring away > Error.pm, which is a large contributor to compilation time, but as long > as it's a public API that can't be done without changing the public > API. If all we needed to worry about was send-email, git-svn etc. just > changing it to Perl-native exceptions would be trivial. Yeah, I don't have any real problem with that, as long as we don't break third-party scripts that we've promised not to. I'd even be OK with deprecating Git.pm and eventually phasing it out, if we think it's a maintenance burden. -Peff
Jeff King wrote:
> On Thu, May 13, 2021 at 02:04:08AM -0500, Felipe Contreras wrote:
>
> > Jeff King wrote:
> > > It is a bit unfortunate to have to go through these contortions, but
> > > this is definitely the best we can do for now. I think in the long run
> > > it would be nice to have a "--stdin" mode for git-config, where we could
> > > do something like:
> > >
> > > git config --stdin <<\EOF
> > > key=foo.bar
> > > type=bool
> > > default=false
> > >
> > > key=another.key
> > > type=color
> > > default=red
> > > EOF
> >
> > Why do we even have to specify the type? Shouldn't there be a registry
> > of configurations (a schema), so that all users don't have to do this?
>
> One of the purposes of git-config is to serve third-party scripts that
> store their own config keys that Git does not know about. So we can't
> know the set of all possible types that will be asked about.
Yes, I know, I maintain several tools that have such configurations. For
those you would need to specify the type (or find some way to install
the schema so that git parses it).
But I'm talking about git.git configurations. If you don't specify the
type in --stdin it should fetch it from some database. That would be
much more user-friendly.
--
Felipe Contreras
On Thu, May 13 2021, Felipe Contreras wrote:
> Jeff King wrote:
>> On Thu, May 13, 2021 at 02:04:08AM -0500, Felipe Contreras wrote:
>>
>> > Jeff King wrote:
>> > > It is a bit unfortunate to have to go through these contortions, but
>> > > this is definitely the best we can do for now. I think in the long run
>> > > it would be nice to have a "--stdin" mode for git-config, where we could
>> > > do something like:
>> > >
>> > > git config --stdin <<\EOF
>> > > key=foo.bar
>> > > type=bool
>> > > default=false
>> > >
>> > > key=another.key
>> > > type=color
>> > > default=red
>> > > EOF
>> >
>> > Why do we even have to specify the type? Shouldn't there be a registry
>> > of configurations (a schema), so that all users don't have to do this?
>>
>> One of the purposes of git-config is to serve third-party scripts that
>> store their own config keys that Git does not know about. So we can't
>> know the set of all possible types that will be asked about.
>
> Yes, I know, I maintain several tools that have such configurations. For
> those you would need to specify the type (or find some way to install
> the schema so that git parses it).
>
> But I'm talking about git.git configurations. If you don't specify the
> type in --stdin it should fetch it from some database. That would be
> much more user-friendly.
For what it's worth my idea of hacking a plumbing thingy for
git-send-email here before ultimately deciding that my simpler caching
approach was easier and gave me 95% of the win, was to just teach it a
mode where it spews out all config variables \0-delimited with all
possible interpretations of it. I.e.:
some.variable 123 bool true path 123 string 123 [...]
It's rather cheap to do the "interpret this as --type=X for me" on the
C-level, so we might as well spew out all possible interpretations.
That means that any external tool would be guaranteed to only need one
"git config" invocation to parse any of its config, i.e. in a case where
variable X decides if variable Y is a bool or path or whatever. They'd
already have all possible values.
Something like:
git config -l -z --type=bool,path
git config -l -z --type=ALL
A re-roll of [1], the work I based v1 on top of has landed in master, so this has been rebased on master. The changes here are minor, just a typo fix / commit message clarification, moving "require" closer to where it's used, and finally a new 10/10 patch to s/cwd/getcwd/g. As noted in the commit message I don't think that'll make any difference in practice. The "time" Eric posted was for loading Cwd.pm and then doing cwd() or getcwd(), but when we run it we've already paid the cost of loading Cwd.pm. But it was an easy change to make, so let's make it anyway. 1. https://lore.kernel.org/git/cover-0.9-0000000000-20210512T132955Z-avarab@gmail.com/ Ævar Arnfjörð Bjarmason (10): send-email: remove non-working support for "sendemail.smtpssl" send-email: refactor sendemail.smtpencryption config parsing send-email: lazily load config for a big speedup send-email: lazily shell out to "git var" send-email: use function syntax instead of barewords send-email: get rid of indirect object syntax send-email: lazily load modules for a big speedup perl: lazily load some common Git.pm setup code send-email: move trivial config handling to Perl perl: nano-optimize by replacing Cwd::cwd() with Cwd::getcwd() Documentation/config/sendemail.txt | 3 - git-send-email.perl | 146 +++++++++++++++++------------ perl/Git.pm | 49 +++++----- 3 files changed, 111 insertions(+), 87 deletions(-) Range-diff against v1: 1: 92571a8cf7f < -: ----------- Makefile: make PERL_DEFINES recursively expanded 2: 85b706d43fc ! 1: 8474acae689 send-email: remove non-working support for "sendemail.smtpssl" @@ Metadata ## Commit message ## send-email: remove non-working support for "sendemail.smtpssl" - Remove the already dead code to support "sendemail.smtssl" by finally + Remove the already dead code to support "sendemail.smtpssl" by finally removing the dead code supporting the configuration option. In f6bebd121ac (git-send-email: add support for TLS via 3: c22af817f10 = 2: b87f53adbed send-email: refactor sendemail.smtpencryption config parsing 4: 1e14d322535 = 3: 1b27a393ae3 send-email: lazily load config for a big speedup 5: e1df469d5fe = 4: acee22b77d2 send-email: lazily shell out to "git var" 6: 8846d40fc02 = 5: f317cd1c01e send-email: use function syntax instead of barewords 7: 0dde0e14ef6 = 6: fc27024f838 send-email: get rid of indirect object syntax 8: 55a0b07062f ! 7: f86f5453d7a send-email: lazily load modules for a big speedup @@ git-send-email.perl: sub do_edit { # Handle Uncouth Termination sub signal_handler { -+ require Term::ANSIColor; - +- # Make text normal - print color("reset"), "\n"; ++ require Term::ANSIColor; + print Term::ANSIColor::color("reset"), "\n"; # SMTP password masked 9: 2312346f71e = 8: 86641377c0d perl: lazily load some common Git.pm setup code 10: 0d87c9a5a37 ! 9: 895c9e29a96 send-email: move trivial config handling to Perl @@ Commit message Optimize the startup time of git-send-email by using an amended config_regexp() function to retrieve the list of config keys and - values we're interested in. + values we're interested in. See the earlier "send-email: lazily load + config for a big speedup" commit for why changing its interface is OK. For boolean keys we can handle the [true|false] case ourselves, and the "--get" case didn't need any parsing. Let's leave "--path" and @@ Commit message "git send-email". This brings the runtime of "git send-email" from ~60-~70ms to a very - steady ~40ms on my test box. We no run just one "git config" + steady ~40ms on my test box. We now run just one "git config" invocation on startup instead of 8, the exact number will differ based on the local sendemail.* config. I happen to have 8 of those set. -: ----------- > 10: 97455f993d5 perl: nano-optimize by replacing Cwd::cwd() with Cwd::getcwd() -- 2.32.0.rc0.405.g5d387561bb3
Remove the already dead code to support "sendemail.smtpssl" by finally removing the dead code supporting the configuration option. In f6bebd121ac (git-send-email: add support for TLS via Net::SMTP::SSL, 2008-06-25) the --smtp-ssl command-line option was documented as deprecated, later in 65180c66186 (List send-email config options in config.txt., 2009-07-22) the "sendemail.smtpssl" configuration option was also documented as such. Then in in 3ff15040e22 (send-email: fix regression in sendemail.identity parsing, 2019-05-17) I unintentionally removed support for it by introducing a bug in read_config(). As can be seen from the diff context we've already returned unless $enc i defined, so it's not possible for us to reach the "elsif" branch here. This code was therefore already dead since Git v2.23.0. So let's just remove it. We were already 11 years into a stated deprecation period of this variable when 3ff15040e22 landed, now it's around 13. Since it hasn't worked anyway for around 2 years it looks like we can safely remove it. The --smtp-ssl option is still deprecated, if someone cares they can follow-up and remove that too, but unlike the config option that one could still be in use in the wild. I'm just removing this code that's provably unused already. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- Documentation/config/sendemail.txt | 3 --- git-send-email.perl | 6 +----- 2 files changed, 1 insertion(+), 8 deletions(-) diff --git a/Documentation/config/sendemail.txt b/Documentation/config/sendemail.txt index cbc5af42fdf..50baa5d6bfb 100644 --- a/Documentation/config/sendemail.txt +++ b/Documentation/config/sendemail.txt @@ -8,9 +8,6 @@ sendemail.smtpEncryption:: See linkgit:git-send-email[1] for description. Note that this setting is not subject to the 'identity' mechanism. -sendemail.smtpssl (deprecated):: - Deprecated alias for 'sendemail.smtpEncryption = ssl'. - sendemail.smtpsslcertpath:: Path to ca-certificates (either a directory or a single file). Set it to an empty string to disable certificate verification. diff --git a/git-send-email.perl b/git-send-email.perl index 175da07d946..3ef194fe2da 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -383,11 +383,7 @@ sub read_config { my $enc = Git::config(@repo, $setting); return unless defined $enc; return if $configured->{$setting}++; - if (defined $enc) { - $smtp_encryption = $enc; - } elsif (Git::config_bool(@repo, "$prefix.smtpssl")) { - $smtp_encryption = 'ssl'; - } + $smtp_encryption = $enc; } } -- 2.32.0.rc0.405.g5d387561bb3
With the removal of the support for sendemail.smtpssl in the preceding commit the parsing of sendemail.smtpencryption is no longer special, and can by moved to %config_settings. This gets us rid of an unconditional call to Git::config(), which as we'll see in subsequent commits matters for startup performance. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 3ef194fe2da..f9d859da3ed 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -277,6 +277,7 @@ sub do_edit { ); my %config_settings = ( + "smtpencryption" => \$smtp_encryption, "smtpserver" => \$smtp_server, "smtpserverport" => \$smtp_server_port, "smtpserveroption" => \@smtp_server_options, @@ -377,14 +378,6 @@ sub read_config { $$target = $v; } } - - if (!defined $smtp_encryption) { - my $setting = "$prefix.smtpencryption"; - my $enc = Git::config(@repo, $setting); - return unless defined $enc; - return if $configured->{$setting}++; - $smtp_encryption = $enc; - } } # sendemail.identity yields to --identity. We must parse this -- 2.32.0.rc0.405.g5d387561bb3
Reduce the time it takes git-send-email to get to even the most trivial of tasks (such as serving up its "-h" output) by first listing config keys that exist, and only then only call e.g. "git config --bool" on them if they do. Over a lot of runs this speeds the time to "-h" up for me from ~250ms to ~150ms, and the runtime of t9001-send-email.sh goes from ~25s to ~20s. This introduces a race condition where we'll do the "wrong" thing if a config key were to be inserted between us discovering the list and calling read_config(), i.e. we won't know about the racily added key. In theory this is a change in behavior, in practice it doesn't matter. The config_regexp() function being changed here was added in dd84e528a34 (git-send-email: die if sendmail.* config is set, 2020-07-23) for use by git-send-email. So we can change its odd return value in the case where no values are found by "git config". The difference in the *.pm code would matter if it was invoked in scalar context, but now it no longer is. Arguably this caching belongs in Git.pm itself, but in lieu of modifying it for all its callers let's only do this for "git send-email". The other big potential win would be "git svn", but unlike "git send-email" it doesn't check tens of config variables one at a time at startup (in my brief testing it doesn't check any). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 35 ++++++++++++++++++++++++++--------- perl/Git.pm | 4 ++-- 2 files changed, 28 insertions(+), 11 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index f9d859da3ed..5b0b7c33ec6 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -337,11 +337,13 @@ sub signal_handler { # Read our sendemail.* config sub read_config { - my ($configured, $prefix) = @_; + my ($known_keys, $configured, $prefix) = @_; foreach my $setting (keys %config_bool_settings) { my $target = $config_bool_settings{$setting}; - my $v = Git::config_bool(@repo, "$prefix.$setting"); + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; + my $v = Git::config_bool(@repo, $key); next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -349,8 +351,10 @@ sub read_config { foreach my $setting (keys %config_path_settings) { my $target = $config_path_settings{$setting}; + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; if (ref($target) eq "ARRAY") { - my @values = Git::config_path(@repo, "$prefix.$setting"); + my @values = Git::config_path(@repo, $key); next unless @values; next if $configured->{$setting}++; @$target = @values; @@ -365,14 +369,16 @@ sub read_config { foreach my $setting (keys %config_settings) { my $target = $config_settings{$setting}; + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; if (ref($target) eq "ARRAY") { - my @values = Git::config(@repo, "$prefix.$setting"); + my @values = Git::config(@repo, $key); next unless @values; next if $configured->{$setting}++; @$target = @values; } else { - my $v = Git::config(@repo, "$prefix.$setting"); + my $v = Git::config(@repo, $key); next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -380,9 +386,20 @@ sub read_config { } } +# Save ourselves a lot of work of shelling out to 'git config' (it +# parses 'bool' etc.) by only doing so for config keys that exist. +my %known_config_keys; +{ + my @known_config_keys = Git::config_regexp("^sende?mail[.]"); + @known_config_keys{@known_config_keys} = (); +} + # sendemail.identity yields to --identity. We must parse this # special-case first before the rest of the config is read. -$identity = Git::config(@repo, "sendemail.identity"); +{ + my $key = "sendemail.identity"; + $identity = Git::config(@repo, $key) if exists $known_config_keys{$key}; +} my $rc = GetOptions( "identity=s" => \$identity, "no-identity" => \$no_identity, @@ -393,8 +410,8 @@ sub read_config { # Now we know enough to read the config { my %configured; - read_config(\%configured, "sendemail.$identity") if defined $identity; - read_config(\%configured, "sendemail"); + read_config(\%known_config_keys, \%configured, "sendemail.$identity") if defined $identity; + read_config(\%known_config_keys, \%configured, "sendemail"); } # Begin by accumulating all the variables (defined above), that we will end up @@ -478,7 +495,7 @@ sub read_config { usage(); } -if ($forbid_sendmail_variables && (scalar Git::config_regexp("^sendmail[.]")) != 0) { +if ($forbid_sendmail_variables && grep { /^sendmail/s } keys %known_config_keys) { die __("fatal: found configuration options for 'sendmail'\n" . "git-send-email is configured with the sendemail.* options - note the 'e'.\n" . "Set sendemail.forbidSendmailVariables to false to disable this check.\n"); diff --git a/perl/Git.pm b/perl/Git.pm index 73ebbf80cc6..06a10b175a7 100644 --- a/perl/Git.pm +++ b/perl/Git.pm @@ -754,8 +754,8 @@ sub config_regexp { } catch Git::Error::Command with { my $E = shift; if ($E->value() == 1) { - my @matches = (); - return @matches; + # Key(s) not found. + return; } else { throw $E; } -- 2.32.0.rc0.405.g5d387561bb3
Optimize git-send-email by only shelling out to "git var" if we need to. This is easily done by re-inventing our own small version of perl's Memoize module. I suppose I could just use Memoize itself, but in a subsequent patch I'll be micro-optimizing send-email's use of dependencies. Using Memoize is a measly extra 5-10 milliseconds, but as we'll see that'll end up mattering for us in the end. This brings the runtime of a plain "send-email" from around ~160-170ms to ~140m-150ms. The runtime of the tests is around the same, or around ~20s. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 5b0b7c33ec6..9ff315f775c 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -560,8 +560,18 @@ sub read_config { } my ($repoauthor, $repocommitter); -($repoauthor) = Git::ident_person(@repo, 'author'); -($repocommitter) = Git::ident_person(@repo, 'committer'); +{ + my %cache; + my ($author, $committer); + my $common = sub { + my ($what) = @_; + return $cache{$what} if exists $cache{$what}; + ($cache{$what}) = Git::ident_person(@repo, $what); + return $cache{$what}; + }; + $repoauthor = sub { $common->('author') }; + $repocommitter = sub { $common->('committer') }; +} sub parse_address_line { return map { $_->format } Mail::Address->parse($_[0]); @@ -749,7 +759,7 @@ sub get_patch_subject { or die sprintf(__("Failed to open for writing %s: %s"), $compose_filename, $!); - my $tpl_sender = $sender || $repoauthor || $repocommitter || ''; + my $tpl_sender = $sender || $repoauthor->() || $repocommitter->() || ''; my $tpl_subject = $initial_subject || ''; my $tpl_in_reply_to = $initial_in_reply_to || ''; my $tpl_reply_to = $reply_to || ''; @@ -955,7 +965,7 @@ sub file_declares_8bit_cte { $sender =~ s/^\s+|\s+$//g; ($sender) = expand_aliases($sender); } else { - $sender = $repoauthor || $repocommitter || ''; + $sender = $repoauthor->() || $repocommitter->() || ''; } # $sender could be an already sanitized address @@ -1104,7 +1114,7 @@ sub make_message_id { $uniq = "$message_id_stamp-$message_id_serial"; my $du_part; - for ($sender, $repocommitter, $repoauthor) { + for ($sender, $repocommitter->(), $repoauthor->()) { $du_part = extract_valid_address(sanitize_address($_)); last if (defined $du_part and $du_part ne ''); } -- 2.32.0.rc0.405.g5d387561bb3
Change calls like "__ 'foo'" to "__('foo')" so the Perl compiler doesn't have to guess that "__" is a function. This makes the code more readable. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 9ff315f775c..da46925aa05 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -678,7 +678,7 @@ sub is_format_patch_arg { if (defined($format_patch)) { return $format_patch; } - die sprintf(__ <<EOF, $f, $f); + die sprintf(__(<<EOF), $f, $f); File '%s' exists but it could also be the range of commits to produce patches for. Please disambiguate by... @@ -764,7 +764,7 @@ sub get_patch_subject { my $tpl_in_reply_to = $initial_in_reply_to || ''; my $tpl_reply_to = $reply_to || ''; - print $c <<EOT1, Git::prefix_lines("GIT: ", __ <<EOT2), <<EOT3; + print $c <<EOT1, Git::prefix_lines("GIT: ", __(<<EOT2)), <<EOT3; From $tpl_sender # This line is ignored. EOT1 Lines beginning in "GIT:" will be removed. -- 2.32.0.rc0.405.g5d387561bb3
Change indirect object syntax such as "new X ARGS" to "X->new(ARGS)". This allows perl to see what "new" is at compile-time without having loaded Term::ReadLine. This doesn't matter now, but will in a subsequent commit when we start lazily loading it. Let's do the same for the adjacent "FakeTerm" package for consistency, even though we're not going to conditionally load it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index da46925aa05..f9c780ceed6 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -194,11 +194,11 @@ sub format_2822_time { my @repo = $repo ? ($repo) : (); my $term = eval { $ENV{"GIT_SEND_EMAIL_NOTTY"} - ? new Term::ReadLine 'git-send-email', \*STDIN, \*STDOUT - : new Term::ReadLine 'git-send-email'; + ? Term::ReadLine->new('git-send-email', \*STDIN, \*STDOUT) + : Term::ReadLine->new('git-send-email'); }; if ($@) { - $term = new FakeTerm "$@: going non-interactive"; + $term = FakeTerm->new("$@: going non-interactive"); } # Behavior modification variables -- 2.32.0.rc0.405.g5d387561bb3
Optimize the time git-send-email takes to do even the simplest of things (such as serving up "-h") from around ~150ms to ~80ms-~90ms by lazily loading the modules it requires. Before this change Devel::TraceUse would report 99/97 used modules under NO_GETTEXT=[|Y], respectively. Now it's 52/37. It now takes ~15s to run t9001-send-email.sh, down from ~20s. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 71 +++++++++++++++++++++++++-------------------- 1 file changed, 39 insertions(+), 32 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index f9c780ceed6..a521f37c341 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -19,20 +19,10 @@ use 5.008; use strict; use warnings; -use POSIX qw/strftime/; -use Term::ReadLine; use Getopt::Long; -use Text::ParseWords; -use Term::ANSIColor; -use File::Temp qw/ tempdir tempfile /; -use File::Spec::Functions qw(catdir catfile); use Git::LoadCPAN::Error qw(:try); -use Cwd qw(abs_path cwd); use Git; use Git::I18N; -use Net::Domain (); -use Net::SMTP (); -use Git::LoadCPAN::Mail::Address; Getopt::Long::Configure qw/ pass_through /; @@ -166,7 +156,6 @@ sub format_2822_time { ); } -my $have_email_valid = eval { require Email::Valid; 1 }; my $smtp; my $auth; my $num_sent = 0; @@ -192,14 +181,6 @@ sub format_2822_time { my $repo = eval { Git->repository() }; my @repo = $repo ? ($repo) : (); -my $term = eval { - $ENV{"GIT_SEND_EMAIL_NOTTY"} - ? Term::ReadLine->new('git-send-email', \*STDIN, \*STDOUT) - : Term::ReadLine->new('git-send-email'); -}; -if ($@) { - $term = FakeTerm->new("$@: going non-interactive"); -} # Behavior modification variables my ($quiet, $dry_run) = (0, 0); @@ -309,9 +290,9 @@ sub do_edit { # Handle Uncouth Termination sub signal_handler { - # Make text normal - print color("reset"), "\n"; + require Term::ANSIColor; + print Term::ANSIColor::color("reset"), "\n"; # SMTP password masked system "stty echo"; @@ -574,11 +555,13 @@ sub read_config { } sub parse_address_line { + require Git::LoadCPAN::Mail::Address; return map { $_->format } Mail::Address->parse($_[0]); } sub split_addrs { - return quotewords('\s*,\s*', 1, @_); + require Text::ParseWords; + return Text::ParseWords::quotewords('\s*,\s*', 1, @_); } my %aliases; @@ -627,10 +610,11 @@ sub parse_sendmail_aliases { s/\\"/"/g foreach @addr; $aliases{$alias} = \@addr }}}, - mailrc => sub { my $fh = shift; while (<$fh>) { + mailrc => sub { my $fh = shift; while (<$fh>) { if (/^alias\s+(\S+)\s+(.*?)\s*$/) { + require Text::ParseWords; # spaces delimit multiple addresses - $aliases{$1} = [ quotewords('\s+', 0, $2) ]; + $aliases{$1} = [ Text::ParseWords::quotewords('\s+', 0, $2) ]; }}}, pine => sub { my $fh = shift; my $f='\t[^\t]*'; for (my $x = ''; defined($x); $x = $_) { @@ -702,7 +686,8 @@ sub is_format_patch_arg { opendir my $dh, $f or die sprintf(__("Failed to opendir %s: %s"), $f, $!); - push @files, grep { -f $_ } map { catfile($f, $_) } + require File::Spec; + push @files, grep { -f $_ } map { File::Spec->catfile($f, $_) } sort readdir $dh; closedir $dh; } elsif ((-f $f or -p $f) and !is_format_patch_arg($f)) { @@ -715,7 +700,8 @@ sub is_format_patch_arg { if (@rev_list_opts) { die __("Cannot run git format-patch from outside a repository\n") unless $repo; - push @files, $repo->command('format-patch', '-o', tempdir(CLEANUP => 1), @rev_list_opts); + require File::Temp; + push @files, $repo->command('format-patch', '-o', File::Temp::tempdir(CLEANUP => 1), @rev_list_opts); } @files = handle_backup_files(@files); @@ -752,9 +738,10 @@ sub get_patch_subject { if ($compose) { # Note that this does not need to be secure, but we will make a small # effort to have it be unique + require File::Temp; $compose_filename = ($repo ? - tempfile(".gitsendemail.msg.XXXXXX", DIR => $repo->repo_path()) : - tempfile(".gitsendemail.msg.XXXXXX", DIR => "."))[1]; + File::Temp::tempfile(".gitsendemail.msg.XXXXXX", DIR => $repo->repo_path()) : + File::Temp::tempfile(".gitsendemail.msg.XXXXXX", DIR => "."))[1]; open my $c, ">", $compose_filename or die sprintf(__("Failed to open for writing %s: %s"), $compose_filename, $!); @@ -861,6 +848,19 @@ sub get_patch_subject { do_edit(@files); } +sub term { + my $term = eval { + require Term::ReadLine; + $ENV{"GIT_SEND_EMAIL_NOTTY"} + ? Term::ReadLine->new('git-send-email', \*STDIN, \*STDOUT) + : Term::ReadLine->new('git-send-email'); + }; + if ($@) { + $term = FakeTerm->new("$@: going non-interactive"); + } + return $term; +} + sub ask { my ($prompt, %arg) = @_; my $valid_re = $arg{valid_re}; @@ -868,6 +868,7 @@ sub ask { my $confirm_only = $arg{confirm_only}; my $resp; my $i = 0; + my $term = term(); return defined $default ? $default : undef unless defined $term->IN and defined fileno($term->IN) and defined $term->OUT and defined fileno($term->OUT); @@ -1048,6 +1049,7 @@ sub extract_valid_address { return $address if ($address =~ /^($local_part_regexp)$/); $address =~ s/^\s*<(.*)>\s*$/$1/; + my $have_email_valid = eval { require Email::Valid; 1 }; if ($have_email_valid) { return scalar Email::Valid->address($address); } @@ -1107,7 +1109,8 @@ sub validate_address_list { sub make_message_id { my $uniq; if (!defined $message_id_stamp) { - $message_id_stamp = strftime("%Y%m%d%H%M%S.$$", gmtime(time)); + require POSIX; + $message_id_stamp = POSIX::strftime("%Y%m%d%H%M%S.$$", gmtime(time)); $message_id_serial = 0; } $message_id_serial++; @@ -1277,6 +1280,7 @@ sub valid_fqdn { sub maildomain_net { my $maildomain; + require Net::Domain; my $domain = Net::Domain::domainname(); $maildomain = $domain if valid_fqdn($domain); @@ -1287,6 +1291,7 @@ sub maildomain_mta { my $maildomain; for my $host (qw(mailhost localhost)) { + require Net::SMTP; my $smtp = Net::SMTP->new($host); if (defined $smtp) { my $domain = $smtp->domain; @@ -1965,13 +1970,15 @@ sub validate_patch { my ($fn, $xfer_encoding) = @_; if ($repo) { - my $validate_hook = catfile($repo->hooks_path(), + require File::Spec; + my $validate_hook = File::Spec->catfile($repo->hooks_path(), 'sendemail-validate'); my $hook_error; if (-x $validate_hook) { - my $target = abs_path($fn); + require Cwd; + my $target = Cwd::abs_path($fn); # The hook needs a correct cwd and GIT_DIR. - my $cwd_save = cwd(); + my $cwd_save = Cwd::cwd(); chdir($repo->wc_path() or $repo->repo_path()) or die("chdir: $!"); local $ENV{"GIT_DIR"} = $repo->repo_path(); -- 2.32.0.rc0.405.g5d387561bb3
Instead of unconditionally requiring modules such as File::Spec, let's only load them when needed. This speeds up code that only needs a subset of the features Git.pm provides. This brings a plain invocation of "git send-email" down from 52/37 loaded modules under NO_GETTEXT=[|Y] to 39/18, and it now takes ~60-~70ms instead of ~80-~90ms. The runtime of t9001-send-email.sh test is down to ~13s from ~15s. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- perl/Git.pm | 35 ++++++++++++++++++----------------- 1 file changed, 18 insertions(+), 17 deletions(-) diff --git a/perl/Git.pm b/perl/Git.pm index 06a10b175a7..f18852fb09c 100644 --- a/perl/Git.pm +++ b/perl/Git.pm @@ -11,9 +11,6 @@ package Git; use strict; use warnings $ENV{GIT_PERL_FATAL_WARNINGS} ? qw(FATAL all) : (); -use File::Temp (); -use File::Spec (); - BEGIN { our ($VERSION, @ISA, @EXPORT, @EXPORT_OK); @@ -103,12 +100,9 @@ =head1 DESCRIPTION =cut -use Carp qw(carp croak); # but croak is bad - throw instead +sub carp { require Carp; goto &Carp::carp } +sub croak { require Carp; goto &Carp::croak } use Git::LoadCPAN::Error qw(:try); -use Cwd qw(abs_path cwd); -use IPC::Open2 qw(open2); -use Fcntl qw(SEEK_SET SEEK_CUR); -use Time::Local qw(timegm); } @@ -191,13 +185,15 @@ sub repository { $dir = undef; }; + require Cwd; if ($dir) { + require File::Spec; File::Spec->file_name_is_absolute($dir) or $dir = $opts{Directory} . '/' . $dir; - $opts{Repository} = abs_path($dir); + $opts{Repository} = Cwd::abs_path($dir); # If --git-dir went ok, this shouldn't die either. my $prefix = $search->command_oneline('rev-parse', '--show-prefix'); - $dir = abs_path($opts{Directory}) . '/'; + $dir = Cwd::abs_path($opts{Directory}) . '/'; if ($prefix) { if (substr($dir, -length($prefix)) ne $prefix) { throw Error::Simple("rev-parse confused me - $dir does not have trailing $prefix"); @@ -223,7 +219,7 @@ sub repository { throw Error::Simple("fatal: Not a git repository: $dir"); } - $opts{Repository} = abs_path($dir); + $opts{Repository} = Cwd::abs_path($dir); } delete $opts{Directory}; @@ -408,10 +404,12 @@ sub command_bidi_pipe { my $cwd_save = undef; if ($self) { shift; - $cwd_save = cwd(); + require Cwd; + $cwd_save = Cwd::cwd(); _setup_git_cmd_env($self); } - $pid = open2($in, $out, 'git', @_); + require IPC::Open2; + $pid = IPC::Open2::open2($in, $out, 'git', @_); chdir($cwd_save) if $cwd_save; return ($pid, $in, $out, join(' ', @_)); } @@ -538,7 +536,8 @@ sub get_tz_offset { my $t = shift || time; my @t = localtime($t); $t[5] += 1900; - my $gm = timegm(@t); + require Time::Local; + my $gm = Time::Local::timegm(@t); my $sign = qw( + + - )[ $gm <=> $t ]; return sprintf("%s%02d%02d", $sign, (gmtime(abs($t - $gm)))[2,1]); } @@ -629,7 +628,8 @@ sub hooks_path { my ($self) = @_; my $dir = $self->command_oneline('rev-parse', '--git-path', 'hooks'); - my $abs = abs_path($dir); + require Cwd; + my $abs = Cwd::abs_path($dir); return $abs; } @@ -1353,6 +1353,7 @@ sub _temp_cache { my $n = $name; $n =~ s/\W/_/g; # no strange chars + require File::Temp; ($$temp_fd, $fname) = File::Temp::tempfile( "Git_${n}_XXXXXX", UNLINK => 1, DIR => $tmpdir, ) or throw Error::Simple("couldn't open new temp file"); @@ -1375,9 +1376,9 @@ sub temp_reset { truncate $temp_fd, 0 or throw Error::Simple("couldn't truncate file"); - sysseek($temp_fd, 0, SEEK_SET) and seek($temp_fd, 0, SEEK_SET) + sysseek($temp_fd, 0, Fcntl::SEEK_SET()) and seek($temp_fd, 0, Fcntl::SEEK_SET()) or throw Error::Simple("couldn't seek to beginning of file"); - sysseek($temp_fd, 0, SEEK_CUR) == 0 and tell($temp_fd) == 0 + sysseek($temp_fd, 0, Fcntl::SEEK_CUR()) == 0 and tell($temp_fd) == 0 or throw Error::Simple("expected file position to be reset"); } -- 2.32.0.rc0.405.g5d387561bb3
Optimize the startup time of git-send-email by using an amended config_regexp() function to retrieve the list of config keys and values we're interested in. See the earlier "send-email: lazily load config for a big speedup" commit for why changing its interface is OK. For boolean keys we can handle the [true|false] case ourselves, and the "--get" case didn't need any parsing. Let's leave "--path" and other "--bool" cases to "git config". As noted in a preceding commit we're free to change the config_regexp() function, it's only used by "git send-email". This brings the runtime of "git send-email" from ~60-~70ms to a very steady ~40ms on my test box. We now run just one "git config" invocation on startup instead of 8, the exact number will differ based on the local sendemail.* config. I happen to have 8 of those set. This brings the runtime of t9001-send-email.sh from ~13s down to ~12s for me. The change there is less impressive as many of those tests set various config values, and we're also getting to the point of diminishing returns for optimizing "git send-email" itself. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 17 ++++++++++------- perl/Git.pm | 10 +++++----- 2 files changed, 15 insertions(+), 12 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index a521f37c341..28b9a20a7f2 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -324,7 +324,10 @@ sub read_config { my $target = $config_bool_settings{$setting}; my $key = "$prefix.$setting"; next unless exists $known_keys->{$key}; - my $v = Git::config_bool(@repo, $key); + my $v = (@{$known_keys->{$key}} == 1 && + $known_keys->{$key}->[0] =~ /^(?:true|false)$/s) + ? $known_keys->{$key}->[0] eq 'true' + : Git::config_bool(@repo, $key); next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -353,14 +356,12 @@ sub read_config { my $key = "$prefix.$setting"; next unless exists $known_keys->{$key}; if (ref($target) eq "ARRAY") { - my @values = Git::config(@repo, $key); - next unless @values; + my @values = @{$known_keys->{$key}}; next if $configured->{$setting}++; @$target = @values; } else { - my $v = Git::config(@repo, $key); - next unless defined $v; + my $v = $known_keys->{$key}->[0]; next if $configured->{$setting}++; $$target = $v; } @@ -371,8 +372,10 @@ sub read_config { # parses 'bool' etc.) by only doing so for config keys that exist. my %known_config_keys; { - my @known_config_keys = Git::config_regexp("^sende?mail[.]"); - @known_config_keys{@known_config_keys} = (); + my @kv = Git::config_regexp("^sende?mail[.]"); + while (my ($k, $v) = splice @kv, 0, 2) { + push @{$known_config_keys{$k}} => $v; + } } # sendemail.identity yields to --identity. We must parse this diff --git a/perl/Git.pm b/perl/Git.pm index f18852fb09c..a9020d0d01f 100644 --- a/perl/Git.pm +++ b/perl/Git.pm @@ -739,18 +739,18 @@ sub config_int { =item config_regexp ( RE ) Retrieve the list of configuration key names matching the regular -expression C<RE>. The return value is a list of strings matching -this regex. +expression C<RE>. The return value is an ARRAY of key-value pairs. =cut sub config_regexp { my ($self, $regex) = _maybe_self(@_); try { - my @cmd = ('config', '--name-only', '--get-regexp', $regex); + my @cmd = ('config', '--null', '--get-regexp', $regex); unshift @cmd, $self if $self; - my @matches = command(@cmd); - return @matches; + my $data = command(@cmd); + my (@kv) = map { split /\n/, $_, 2 } split /\0/, $data; + return @kv; } catch Git::Error::Command with { my $E = shift; if ($E->value() == 1) { -- 2.32.0.rc0.405.g5d387561bb3
It has been pointed out[1] that cwd() invokes "pwd(1)" while getcwd() is a Perl-native XS function. For what we're using these for we can use getcwd(). The performance difference is miniscule, we're saving on the order of a millisecond or so, see [2] below for the benchmark. I don't think this matters in practice for optimizing git-send-email or perl execution (unlike the patches leading up to this one). But let's do it regardless of that, if only so we don't have to think about this as a low-hanging fruit anymore. 1. https://lore.kernel.org/git/20210512180517.GA11354@dcvr/ 2. $ perl -MBenchmark=:all -MCwd -wE 'cmpthese(10000, { getcwd => sub { getcwd }, cwd => sub { cwd }, pwd => sub { system "pwd >/dev/null" }})' (warning: too few iterations for a reliable count) Rate pwd cwd getcwd pwd 982/s -- -48% -100% cwd 1890/s 92% -- -100% getcwd 10000000000000000000/s 1018000000000000000% 529000000000000064% - Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 2 +- perl/Git.pm | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 28b9a20a7f2..ff3008bfa34 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -1981,7 +1981,7 @@ sub validate_patch { require Cwd; my $target = Cwd::abs_path($fn); # The hook needs a correct cwd and GIT_DIR. - my $cwd_save = Cwd::cwd(); + my $cwd_save = Cwd::getcwd(); chdir($repo->wc_path() or $repo->repo_path()) or die("chdir: $!"); local $ENV{"GIT_DIR"} = $repo->repo_path(); diff --git a/perl/Git.pm b/perl/Git.pm index a9020d0d01f..358a5f5625e 100644 --- a/perl/Git.pm +++ b/perl/Git.pm @@ -405,7 +405,7 @@ sub command_bidi_pipe { if ($self) { shift; require Cwd; - $cwd_save = Cwd::cwd(); + $cwd_save = Cwd::getcwd(); _setup_git_cmd_env($self); } require IPC::Open2; -- 2.32.0.rc0.405.g5d387561bb3
On Thu, May 20, 2021 at 10:18:57AM +0200, Ævar Arnfjörð Bjarmason wrote:
> A re-roll of [1], the work I based v1 on top of has landed in master,
> so this has been rebased on master.
>
> The changes here are minor, just a typo fix / commit message
> clarification, moving "require" closer to where it's used, and finally
> a new 10/10 patch to s/cwd/getcwd/g.
I like all of this, except for the change in the interface of
Git::config_regexp(). You mention that it's new-ish, and probably not in
wide use. And I agree that's probably true. But it feels like violating
a principle of not breaking APIs, and we should stick to that principle
and not bend it for "well, it's not that old an API".
I'd find it more compelling if it the existing interface was broken or
hard to avoid changing. But couldn't we just add a new function with the
extra info (config_regexp_with_values() or something)?
-Peff
Jeff King <peff@peff.net> writes:
> I like all of this, except for the change in the interface of
> Git::config_regexp(). You mention that it's new-ish, and probably not in
> wide use. And I agree that's probably true. But it feels like violating
> a principle of not breaking APIs, and we should stick to that principle
> and not bend it for "well, it's not that old an API".
>
> I'd find it more compelling if it the existing interface was broken or
> hard to avoid changing. But couldn't we just add a new function with the
> extra info (config_regexp_with_values() or something)?
We seem to use it in-tree only once, but we have no control over
what out-of-tree users have been doing.
I do like your proposed name for the fact that it has _with_values
in it; the original gave us only a list of configuration variable
names, and now we are getting name-value tuples.
BUT
I am not sure if I like the new interface, and I do not know if the
the name "config_with_values" is a good match for what it does.
Namely, when a function is described as:
=item config_regexp ( RE )
Retrieve the list of configuration key names matching the regular
expression C<RE>. The return value is an ARRAY of key-value pairs.
I would expect it to return ([key1, value1], [key2, value2], ...),
but the implementation returns (key1, value1, key2, value2, ...),
i.e. a flattened list, if I am not mistaken.
my @cmd = ('config', '--null', '--get-regexp', $regex);
unshift @cmd, $self if $self;
my $data = command(@cmd);
my (@kv) = map { split /\n/, $_, 2 } split /\0/, $data;
return @kv;
We get NUL terminated records, so we first split the entire output
at NULs (to get a list of key-value records). Each key-value record
has the key followed by LF followed by value, so we split it at the
first LF (i.e. a value with an embedded LF would behave correctly)
to extract key and value out of it. But the resulting list is a
flat list with alternating key and value.
The side that works on the returned value of courese knows that it
is getting a flattened list and acts accordingly:
my @kv = Git::config_regexp("^sende?mail[.]");
while (my ($k, $v) = splice @kv, 0, 2) {
push @{$known_config_keys{$k}} => $v;
}
Perhaps it may be more performant than a more obvious and
straight-forward "list of tuples", i.e.
return map { [split /\n/, $_, 2] } split /\0/, $data;
my @kv = Git::config_regexp("^sende?mail[.]");
for my $tuple (@kv) {
push @{$known_config_keys{$tuple->[0]}}, $tuple->[1];
}
but somehow the flattened list felt unnatural at least to a naïve
reader like me.
On Fri, May 21 2021, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > >> I like all of this, except for the change in the interface of >> Git::config_regexp(). You mention that it's new-ish, and probably not in >> wide use. And I agree that's probably true. But it feels like violating >> a principle of not breaking APIs, and we should stick to that principle >> and not bend it for "well, it's not that old an API". >> >> I'd find it more compelling if it the existing interface was broken or >> hard to avoid changing. But couldn't we just add a new function with the >> extra info (config_regexp_with_values() or something)? > > We seem to use it in-tree only once, but we have no control over > what out-of-tree users have been doing. I thought people might have bought my argument in v1 that we just document these as "public" as part of some copy/pasting in Git.pm, but sure, I'll re-roll and make this a new function or something... > I do like your proposed name for the fact that it has _with_values > in it; the original gave us only a list of configuration variable > names, and now we are getting name-value tuples. > > BUT > > I am not sure if I like the new interface, and I do not know if the > the name "config_with_values" is a good match for what it does. > > Namely, when a function is described as: > > =item config_regexp ( RE ) > > Retrieve the list of configuration key names matching the regular > expression C<RE>. The return value is an ARRAY of key-value pairs. > > I would expect it to return ([key1, value1], [key2, value2], ...), > but the implementation returns (key1, value1, key2, value2, ...), > i.e. a flattened list, if I am not mistaken. ... > my @cmd = ('config', '--null', '--get-regexp', $regex); > unshift @cmd, $self if $self; > my $data = command(@cmd); > my (@kv) = map { split /\n/, $_, 2 } split /\0/, $data; > return @kv; > > We get NUL terminated records, so we first split the entire output > at NULs (to get a list of key-value records). Each key-value record > has the key followed by LF followed by value, so we split it at the > first LF (i.e. a value with an embedded LF would behave correctly) > to extract key and value out of it. But the resulting list is a > flat list with alternating key and value. > > The side that works on the returned value of courese knows that it > is getting a flattened list and acts accordingly: > > my @kv = Git::config_regexp("^sende?mail[.]"); > while (my ($k, $v) = splice @kv, 0, 2) { > push @{$known_config_keys{$k}} => $v; > } > > Perhaps it may be more performant than a more obvious and > straight-forward "list of tuples", i.e. > > return map { [split /\n/, $_, 2] } split /\0/, $data; > > my @kv = Git::config_regexp("^sende?mail[.]"); > for my $tuple (@kv) { > push @{$known_config_keys{$tuple->[0]}}, $tuple->[1]; > } > > but somehow the flattened list felt unnatural at least to a naïve > reader like me. The "performant" really doesn't matter here, we're comparing shelling out to getting a small number of config keys back. So I wasn't trying to optimize this. Returning a flattened list is idiomatic in Perl, it means that a caller can do any of: # I only care about the last value for a key, or only about # existence checks my %hash = func(); Or: # I want all key-values to iterate over my @kv = func(); Returning touples like this makes that less convenient for both, who'll need to do more work to unpack them. For what it's worth In Perl "return a list" means "flattened list", the term "flattened list" I think comes from other languages. You'd call what you're suggesting a list of arrays, or (if a top-level reference) an array of arrays, AoA for short, AoH for array (ref) of hashes etc.
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> Returning a flattened list is idiomatic in Perl, it means that a caller
> can do any of:
>
> # I only care about the last value for a key, or only about
> # existence checks
> my %hash = func();
>
> Or:
>
> # I want all key-values to iterate over
> my @kv = func();
>
> Returning touples like this makes that less convenient for both, who'll
> need to do more work to unpack them.
Thanks---that was exactly what I was missing. Following establshed
patterns is good.
So what remains is just the compatibility issues, I think.
Thanks for clarification.
On Fri, May 21, 2021 at 08:23:15AM +0200, Ævar Arnfjörð Bjarmason wrote:
> The "performant" really doesn't matter here, we're comparing shelling
> out to getting a small number of config keys back. So I wasn't trying to
> optimize this.
>
> Returning a flattened list is idiomatic in Perl, it means that a caller
> can do any of:
>
> # I only care about the last value for a key, or only about
> # existence checks
> my %hash = func();
>
> Or:
>
> # I want all key-values to iterate over
> my @kv = func();
>
> Returning touples like this makes that less convenient for both, who'll
> need to do more work to unpack them.
>
> For what it's worth In Perl "return a list" means "flattened list", the
> term "flattened list" I think comes from other languages. You'd call
> what you're suggesting a list of arrays, or (if a top-level reference)
> an array of arrays, AoA for short, AoH for array (ref) of hashes etc.
Yeah, I think that is reasonable. But it made me wonder how we handle
value-less booleans, and I think there's indeed a bug.
Try a config like this:
$ cat >foo <<\EOF
[foo]
key-with-value = string
key-with-empty =
just-bool
another-key-with-value = another
EOF
A regular config --list looks like:
$ git config --file=foo --list
foo.key-with-value=string
foo.key-with-empty=
foo.just-bool
foo.another-key-with-value=another
Note how "just-bool" drops the "=" to distinguish it from the empty
string. With "-z", it looks like this:
$ git config --file=foo --list -z
foo.key-with-value
string^@foo.key-with-empty
^@foo.just-bool^@foo.another-key-with-value
another^@
The NULs separate keys, but keys are separated from their values by a
newline. And again, just-bool omits the newline.
Your parser splits on newline, so for that entry we'll get only one
string returned in the list (the key), rather than two (the key and
value). In a flattened list, that becomes ambiguous. E.g., adapting your
parser into a stand-alone script:
$ git config --file=foo --list -z |
perl -e '
local $/;
my $data = <STDIN>;
my (@kv) = map { split /\n/, $_, 2 } split /\0/, $data;
while (@kv) {
my $k = shift @kv;
print "key: $k\n";
my $v = shift @kv;
print " value: ", (defined $v ? $v : "undef"), "\n";
}
'
key: foo.key-with-value
value: string
key: foo.key-with-empty
value:
key: foo.just-bool
value: foo.another-key-with-value
key: another
value: undef
We end up misinterpreting a key as a value, and vice versa.
Using a non-flattened structure would have prevented this (we'd sensibly
get undef when trying to access the missing second element of the
array). But I do agree the flattened structure is more perl-ish.
Probably you'd want to insert an explicit "undef" into the list. The
most perl-ish I could come up with is:
my (@kv) = map { my ($k, $v) = split /\n/, $_, 2;
($k, $v)
} split /\0/, $data;
I notice that $known_keys then becomes a non-flat representation. You'd
either want to turn that back into a zero-length array there, or store
the "undef" and handle it appropriately (it can be a synonym for "true",
though that is just an optimization at this point).
-Peff
On Fri, May 21 2021, Jeff King wrote:
> On Fri, May 21, 2021 at 08:23:15AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> The "performant" really doesn't matter here, we're comparing shelling
>> out to getting a small number of config keys back. So I wasn't trying to
>> optimize this.
>>
>> Returning a flattened list is idiomatic in Perl, it means that a caller
>> can do any of:
>>
>> # I only care about the last value for a key, or only about
>> # existence checks
>> my %hash = func();
>>
>> Or:
>>
>> # I want all key-values to iterate over
>> my @kv = func();
>>
>> Returning touples like this makes that less convenient for both, who'll
>> need to do more work to unpack them.
>>
>> For what it's worth In Perl "return a list" means "flattened list", the
>> term "flattened list" I think comes from other languages. You'd call
>> what you're suggesting a list of arrays, or (if a top-level reference)
>> an array of arrays, AoA for short, AoH for array (ref) of hashes etc.
>
> Yeah, I think that is reasonable. But it made me wonder how we handle
> value-less booleans, and I think there's indeed a bug.
>
> Try a config like this:
>
> $ cat >foo <<\EOF
> [foo]
> key-with-value = string
> key-with-empty =
> just-bool
> another-key-with-value = another
> EOF
>
> A regular config --list looks like:
>
> $ git config --file=foo --list
> foo.key-with-value=string
> foo.key-with-empty=
> foo.just-bool
> foo.another-key-with-value=another
>
> Note how "just-bool" drops the "=" to distinguish it from the empty
> string. With "-z", it looks like this:
>
> $ git config --file=foo --list -z
> foo.key-with-value
> string^@foo.key-with-empty
> ^@foo.just-bool^@foo.another-key-with-value
> another^@
>
> The NULs separate keys, but keys are separated from their values by a
> newline. And again, just-bool omits the newline.
>
> Your parser splits on newline, so for that entry we'll get only one
> string returned in the list (the key), rather than two (the key and
> value). In a flattened list, that becomes ambiguous. E.g., adapting your
> parser into a stand-alone script:
>
> $ git config --file=foo --list -z |
> perl -e '
> local $/;
> my $data = <STDIN>;
> my (@kv) = map { split /\n/, $_, 2 } split /\0/, $data;
>
> while (@kv) {
> my $k = shift @kv;
> print "key: $k\n";
> my $v = shift @kv;
> print " value: ", (defined $v ? $v : "undef"), "\n";
> }
> '
> key: foo.key-with-value
> value: string
> key: foo.key-with-empty
> value:
> key: foo.just-bool
> value: foo.another-key-with-value
> key: another
> value: undef
>
> We end up misinterpreting a key as a value, and vice versa.
>
> Using a non-flattened structure would have prevented this (we'd sensibly
> get undef when trying to access the missing second element of the
> array). But I do agree the flattened structure is more perl-ish.
> Probably you'd want to insert an explicit "undef" into the list. The
> most perl-ish I could come up with is:
>
> my (@kv) = map { my ($k, $v) = split /\n/, $_, 2;
> ($k, $v)
> } split /\0/, $data;
>
> I notice that $known_keys then becomes a non-flat representation. You'd
> either want to turn that back into a zero-length array there, or store
> the "undef" and handle it appropriately (it can be a synonym for "true",
> though that is just an optimization at this point).
>
> -Peff
Ah yes, that's indeed a bug. I'd forgetten about the empty value case.
For what it's worth you can slightly golf that as (split /\n/, $_,
2)[0,1], but I think in this case your version is better than that, it's
more obvious what we're trying to do in always returning the $v.
On Fri, May 21, 2021 at 11:24:00AM +0200, Ævar Arnfjörð Bjarmason wrote:
> > Using a non-flattened structure would have prevented this (we'd sensibly
> > get undef when trying to access the missing second element of the
> > array). But I do agree the flattened structure is more perl-ish.
> > Probably you'd want to insert an explicit "undef" into the list. The
> > most perl-ish I could come up with is:
> >
> > my (@kv) = map { my ($k, $v) = split /\n/, $_, 2;
> > ($k, $v)
> > } split /\0/, $data;
> >
> > I notice that $known_keys then becomes a non-flat representation. You'd
> > either want to turn that back into a zero-length array there, or store
> > the "undef" and handle it appropriately (it can be a synonym for "true",
> > though that is just an optimization at this point).
> >
> > -Peff
>
> Ah yes, that's indeed a bug. I'd forgetten about the empty value case.
>
> For what it's worth you can slightly golf that as (split /\n/, $_,
> 2)[0,1], but I think in this case your version is better than that, it's
> more obvious what we're trying to do in always returning the $v.
Heh. Thanks, I almost invited you to golf it because I was curious if we
could continue to do it in one line. I see I didn't need to ask. :)
Yours is clever and I'm glad to be enlightened, but I agree the
two-liner is probably more obvious (perhaps even a comment like
"bool-only values omit the newline and become undef" is worth it).
-Peff
A re-roll of v2[1] which fixes issues pointed out with v2, plus some others I noticed along the way. There are now no longer any changes to the "public" Git::config_regxp() API. Instead it's left bitrotting in our tree at the end of this, having 0 in-tree users (instead of the 1 currently). We also handle the -c foo.bar (NULL value) case correctly, per what Jeff King pointed out. I also found a related issue with GIT_TEST_PERL_FATAL_WARNINGS=true, which we can now turn on for git-send-email.perl. 1. https://lore.kernel.org/git/cover-00.10-00000000000-20210520T081826Z-avarab@gmail.com/ Ævar Arnfjörð Bjarmason (13): send-email tests: support GIT_TEST_PERL_FATAL_WARNINGS=true send-email tests: test for boolean variables without a value send-email: remove non-working support for "sendemail.smtpssl" send-email: refactor sendemail.smtpencryption config parsing send-email: copy "config_regxp" into git-send-email.perl send-email: lazily load config for a big speedup send-email: lazily shell out to "git var" send-email: use function syntax instead of barewords send-email: get rid of indirect object syntax send-email: lazily load modules for a big speedup perl: lazily load some common Git.pm setup code send-email: move trivial config handling to Perl perl: nano-optimize by replacing Cwd::cwd() with Cwd::getcwd() Documentation/config/sendemail.txt | 3 - git-send-email.perl | 174 +++++++++++++++++++---------- perl/Git.pm | 35 +++--- t/t9001-send-email.sh | 29 +++++ 4 files changed, 160 insertions(+), 81 deletions(-) Range-diff against v2: -: ---------- > 1: 71f890dc60 send-email tests: support GIT_TEST_PERL_FATAL_WARNINGS=true -: ---------- > 2: 707c2ca556 send-email tests: test for boolean variables without a value 1: 8474acae68 = 3: 3bbd48dab2 send-email: remove non-working support for "sendemail.smtpssl" 2: b87f53adbe = 4: bed0f98d68 send-email: refactor sendemail.smtpencryption config parsing -: ---------- > 5: c12f69a411 send-email: copy "config_regxp" into git-send-email.perl 3: 1b27a393ae ! 6: d1c233d251 send-email: lazily load config for a big speedup @@ git-send-email.perl: sub read_config { next unless defined $v; next if $configured->{$setting}++; $$target = $v; -@@ git-send-email.perl: sub read_config { - } +@@ git-send-email.perl: sub config_regexp { + return @ret; } +# Save ourselves a lot of work of shelling out to 'git config' (it +# parses 'bool' etc.) by only doing so for config keys that exist. +my %known_config_keys; +{ -+ my @known_config_keys = Git::config_regexp("^sende?mail[.]"); ++ my @known_config_keys = config_regexp("^sende?mail[.]"); + @known_config_keys{@known_config_keys} = (); +} + @@ git-send-email.perl: sub read_config { my $rc = GetOptions( "identity=s" => \$identity, "no-identity" => \$no_identity, -@@ git-send-email.perl: sub read_config { +@@ git-send-email.perl: sub config_regexp { # Now we know enough to read the config { my %configured; @@ git-send-email.perl: sub read_config { } # Begin by accumulating all the variables (defined above), that we will end up -@@ git-send-email.perl: sub read_config { +@@ git-send-email.perl: sub config_regexp { usage(); } --if ($forbid_sendmail_variables && (scalar Git::config_regexp("^sendmail[.]")) != 0) { +-if ($forbid_sendmail_variables && (scalar config_regexp("^sendmail[.]")) != 0) { +if ($forbid_sendmail_variables && grep { /^sendmail/s } keys %known_config_keys) { die __("fatal: found configuration options for 'sendmail'\n" . "git-send-email is configured with the sendemail.* options - note the 'e'.\n" . "Set sendemail.forbidSendmailVariables to false to disable this check.\n"); - - ## perl/Git.pm ## -@@ perl/Git.pm: sub config_regexp { - } catch Git::Error::Command with { - my $E = shift; - if ($E->value() == 1) { -- my @matches = (); -- return @matches; -+ # Key(s) not found. -+ return; - } else { - throw $E; - } 4: acee22b77d ! 7: 4326c2f99c send-email: lazily shell out to "git var" @@ Commit message Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## git-send-email.perl ## -@@ git-send-email.perl: sub read_config { +@@ git-send-email.perl: sub config_regexp { } my ($repoauthor, $repocommitter); 5: f317cd1c01 = 8: e1fc71e3f9 send-email: use function syntax instead of barewords 6: fc27024f83 = 9: a806ce06f1 send-email: get rid of indirect object syntax 7: f86f5453d7 ! 10: aa11439789 send-email: lazily load modules for a big speedup @@ git-send-email.perl @@ use 5.008; use strict; - use warnings; + use warnings $ENV{GIT_PERL_FATAL_WARNINGS} ? qw(FATAL all) : (); -use POSIX qw/strftime/; -use Term::ReadLine; use Getopt::Long; @@ git-send-email.perl: sub do_edit { # SMTP password masked system "stty echo"; -@@ git-send-email.perl: sub read_config { +@@ git-send-email.perl: sub config_regexp { } sub parse_address_line { 8: 86641377c0 = 11: b3b342b173 perl: lazily load some common Git.pm setup code 9: 895c9e29a9 ! 12: 950dc0f53d send-email: move trivial config handling to Perl @@ Commit message Optimize the startup time of git-send-email by using an amended config_regexp() function to retrieve the list of config keys and - values we're interested in. See the earlier "send-email: lazily load - config for a big speedup" commit for why changing its interface is OK. + values we're interested in. For boolean keys we can handle the [true|false] case ourselves, and the "--get" case didn't need any parsing. Let's leave "--path" and - other "--bool" cases to "git config". As noted in a preceding commit - we're free to change the config_regexp() function, it's only used by - "git send-email". + other "--bool" cases to "git config". I'm not bothering with the + "undef" or "" case (true and false, respectively), let's just punt on + those and others and have "git config --type=bool" handle it. This brings the runtime of "git send-email" from ~60-~70ms to a very steady ~40ms on my test box. We now run just one "git config" @@ Commit message Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> + diff --git a/git-send-email.perl b/git-send-email.perl + index 1e9273fd4f..1ea4d9589d 100755 + --- a/git-send-email.perl + +++ b/git-send-email.perl + @@ -324,7 +324,11 @@ sub read_config { + my $target = $config_bool_settings{$setting}; + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; + - my $v = Git::config_bool(@repo, $key); + + my $v = (@{$known_keys->{$key}} == 1 && + + (defined $known_keys->{$key}->[0] && + + $known_keys->{$key}->[0] =~ /^(?:true|false)$/s)) + + ? $known_keys->{$key}->[0] eq 'true' + + : Git::config_bool(@repo, $key); + next unless defined $v; + next if $configured->{$setting}++; + $$target = $v; + @@ -353,14 +357,12 @@ sub read_config { + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; + if (ref($target) eq "ARRAY") { + - my @values = Git::config(@repo, $key); + - next unless @values; + + my @values = @{$known_keys->{$key}}; + next if $configured->{$setting}++; + @$target = @values; + } + else { + - my $v = Git::config(@repo, $key); + - next unless defined $v; + + my $v = $known_keys->{$key}->[0]; + next if $configured->{$setting}++; + $$target = $v; + } + @@ -371,12 +373,19 @@ sub config_regexp { + my ($regex) = @_; + my @ret; + eval { + - @ret = Git::command( + + my $ret = Git::command( + 'config', + - '--name-only', + + '--null', + '--get-regexp', + $regex, + ); + + @ret = map { + + # We must always return ($k, $v) here, since + + # empty config values will be just "key\0", + + # not "key\nvalue\0". + + my ($k, $v) = split /\n/, $_, 2; + + ($k, $v); + + } split /\0/, $ret; + 1; + } or do { + # If we have no keys we're OK, otherwise re-throw + @@ -389,8 +398,10 @@ sub config_regexp { + # parses 'bool' etc.) by only doing so for config keys that exist. + my %known_config_keys; + { + - my @known_config_keys = config_regexp("^sende?mail[.]"); + - @known_config_keys{@known_config_keys} = (); + + my @kv = config_regexp("^sende?mail[.]"); + + while (my ($k, $v) = splice @kv, 0, 2) { + + push @{$known_config_keys{$k}} => $v; + + } + } + + # sendemail.identity yields to --identity. We must parse this + ## git-send-email.perl ## @@ git-send-email.perl: sub read_config { my $target = $config_bool_settings{$setting}; @@ git-send-email.perl: sub read_config { next unless exists $known_keys->{$key}; - my $v = Git::config_bool(@repo, $key); + my $v = (@{$known_keys->{$key}} == 1 && -+ $known_keys->{$key}->[0] =~ /^(?:true|false)$/s) ++ (defined $known_keys->{$key}->[0] && ++ $known_keys->{$key}->[0] =~ /^(?:true|false)$/s)) + ? $known_keys->{$key}->[0] eq 'true' + : Git::config_bool(@repo, $key); next unless defined $v; @@ git-send-email.perl: sub read_config { next if $configured->{$setting}++; $$target = $v; } -@@ git-send-email.perl: sub read_config { +@@ git-send-email.perl: sub config_regexp { + my ($regex) = @_; + my @ret; + eval { +- @ret = Git::command( ++ my $ret = Git::command( + 'config', +- '--name-only', ++ '--null', + '--get-regexp', + $regex, + ); ++ @ret = map { ++ # We must always return ($k, $v) here, since ++ # empty config values will be just "key\0", ++ # not "key\nvalue\0". ++ my ($k, $v) = split /\n/, $_, 2; ++ ($k, $v); ++ } split /\0/, $ret; + 1; + } or do { + # If we have no keys we're OK, otherwise re-throw +@@ git-send-email.perl: sub config_regexp { # parses 'bool' etc.) by only doing so for config keys that exist. my %known_config_keys; { -- my @known_config_keys = Git::config_regexp("^sende?mail[.]"); +- my @known_config_keys = config_regexp("^sende?mail[.]"); - @known_config_keys{@known_config_keys} = (); -+ my @kv = Git::config_regexp("^sende?mail[.]"); ++ my @kv = config_regexp("^sende?mail[.]"); + while (my ($k, $v) = splice @kv, 0, 2) { + push @{$known_config_keys{$k}} => $v; + } } # sendemail.identity yields to --identity. We must parse this - - ## perl/Git.pm ## -@@ perl/Git.pm: sub config_int { - =item config_regexp ( RE ) - - Retrieve the list of configuration key names matching the regular --expression C<RE>. The return value is a list of strings matching --this regex. -+expression C<RE>. The return value is an ARRAY of key-value pairs. - - =cut - - sub config_regexp { - my ($self, $regex) = _maybe_self(@_); - try { -- my @cmd = ('config', '--name-only', '--get-regexp', $regex); -+ my @cmd = ('config', '--null', '--get-regexp', $regex); - unshift @cmd, $self if $self; -- my @matches = command(@cmd); -- return @matches; -+ my $data = command(@cmd); -+ my (@kv) = map { split /\n/, $_, 2 } split /\0/, $data; -+ return @kv; - } catch Git::Error::Command with { - my $E = shift; - if ($E->value() == 1) { 10: 97455f993d = 13: c1d7ea664a perl: nano-optimize by replacing Cwd::cwd() with Cwd::getcwd() -- 2.32.0.rc0.406.g05cb3eebfc
Add support for the "GIT_TEST_PERL_FATAL_WARNINGS=true" test mode to "send-email". This was added to e.g. git-svn in 5338ed2b26 (perl: check for perl warnings while running tests, 2020-10-21), but not "send-email". Let's rectify that. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/git-send-email.perl b/git-send-email.perl index 175da07d94..cffdfdacfb 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -18,7 +18,7 @@ use 5.008; use strict; -use warnings; +use warnings $ENV{GIT_PERL_FATAL_WARNINGS} ? qw(FATAL all) : (); use POSIX qw/strftime/; use Term::ReadLine; use Getopt::Long; -- 2.32.0.rc0.406.g05cb3eebfc
The Git.pm code does its own Perl-ifying of boolean variables, let's ensure that empty values = true for boolean variables, as in the C code. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- t/t9001-send-email.sh | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh index 65b3035371..6bf79c816a 100755 --- a/t/t9001-send-email.sh +++ b/t/t9001-send-email.sh @@ -1349,6 +1349,16 @@ test_expect_success $PREREQ 'sendemail.identity: bool variable fallback' ' ! grep "X-Mailer" stdout ' +test_expect_success $PREREQ 'sendemail.identity: bool variable without a value' ' + git -c sendemail.xmailer \ + send-email \ + --dry-run \ + --from="nobody@example.com" \ + $patches >stdout && + grep "To: default@example.com" stdout && + grep "X-Mailer" stdout +' + test_expect_success $PREREQ '--no-to overrides sendemail.to' ' git send-email \ --dry-run \ @@ -2073,6 +2083,18 @@ test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer=true' ' do_xmailer_test 1 "--xmailer" ' +test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer' ' + test_when_finished "test_unconfig sendemail.xmailer" && + cat >>.git/config <<-\EOF && + [sendemail] + xmailer + EOF + test_config sendemail.xmailer true && + do_xmailer_test 1 "" && + do_xmailer_test 0 "--no-xmailer" && + do_xmailer_test 1 "--xmailer" +' + test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer=false' ' test_config sendemail.xmailer false && do_xmailer_test 0 "" && @@ -2080,6 +2102,13 @@ test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer=false' ' do_xmailer_test 1 "--xmailer" ' +test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer=' ' + test_config sendemail.xmailer "" && + do_xmailer_test 0 "" && + do_xmailer_test 0 "--no-xmailer" && + do_xmailer_test 1 "--xmailer" +' + test_expect_success $PREREQ 'setup expected-list' ' git send-email \ --dry-run \ -- 2.32.0.rc0.406.g05cb3eebfc
Remove the already dead code to support "sendemail.smtpssl" by finally removing the dead code supporting the configuration option. In f6bebd121ac (git-send-email: add support for TLS via Net::SMTP::SSL, 2008-06-25) the --smtp-ssl command-line option was documented as deprecated, later in 65180c66186 (List send-email config options in config.txt., 2009-07-22) the "sendemail.smtpssl" configuration option was also documented as such. Then in in 3ff15040e22 (send-email: fix regression in sendemail.identity parsing, 2019-05-17) I unintentionally removed support for it by introducing a bug in read_config(). As can be seen from the diff context we've already returned unless $enc i defined, so it's not possible for us to reach the "elsif" branch here. This code was therefore already dead since Git v2.23.0. So let's just remove it. We were already 11 years into a stated deprecation period of this variable when 3ff15040e22 landed, now it's around 13. Since it hasn't worked anyway for around 2 years it looks like we can safely remove it. The --smtp-ssl option is still deprecated, if someone cares they can follow-up and remove that too, but unlike the config option that one could still be in use in the wild. I'm just removing this code that's provably unused already. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- Documentation/config/sendemail.txt | 3 --- git-send-email.perl | 6 +----- 2 files changed, 1 insertion(+), 8 deletions(-) diff --git a/Documentation/config/sendemail.txt b/Documentation/config/sendemail.txt index cbc5af42fd..50baa5d6bf 100644 --- a/Documentation/config/sendemail.txt +++ b/Documentation/config/sendemail.txt @@ -8,9 +8,6 @@ sendemail.smtpEncryption:: See linkgit:git-send-email[1] for description. Note that this setting is not subject to the 'identity' mechanism. -sendemail.smtpssl (deprecated):: - Deprecated alias for 'sendemail.smtpEncryption = ssl'. - sendemail.smtpsslcertpath:: Path to ca-certificates (either a directory or a single file). Set it to an empty string to disable certificate verification. diff --git a/git-send-email.perl b/git-send-email.perl index cffdfdacfb..24d73df5a3 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -383,11 +383,7 @@ sub read_config { my $enc = Git::config(@repo, $setting); return unless defined $enc; return if $configured->{$setting}++; - if (defined $enc) { - $smtp_encryption = $enc; - } elsif (Git::config_bool(@repo, "$prefix.smtpssl")) { - $smtp_encryption = 'ssl'; - } + $smtp_encryption = $enc; } } -- 2.32.0.rc0.406.g05cb3eebfc
With the removal of the support for sendemail.smtpssl in the preceding commit the parsing of sendemail.smtpencryption is no longer special, and can by moved to %config_settings. This gets us rid of an unconditional call to Git::config(), which as we'll see in subsequent commits matters for startup performance. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 24d73df5a3..3d8362b5f0 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -277,6 +277,7 @@ sub do_edit { ); my %config_settings = ( + "smtpencryption" => \$smtp_encryption, "smtpserver" => \$smtp_server, "smtpserverport" => \$smtp_server_port, "smtpserveroption" => \@smtp_server_options, @@ -377,14 +378,6 @@ sub read_config { $$target = $v; } } - - if (!defined $smtp_encryption) { - my $setting = "$prefix.smtpencryption"; - my $enc = Git::config(@repo, $setting); - return unless defined $enc; - return if $configured->{$setting}++; - $smtp_encryption = $enc; - } } # sendemail.identity yields to --identity. We must parse this -- 2.32.0.rc0.406.g05cb3eebfc
The config_regexp() function was added in dd84e528a3 (git-send-email: die if sendmail.* config is set, 2020-07-23) for use in git-send-email, and it's the only in-tree user of it. However, the consensus is that Git.pm is a public interface, so even though it's a recently added function we can't change it. So let's copy over a minimal version of it to git-send-email.perl itself. In a subsequent commit it'll be changed further for our own use. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/git-send-email.perl b/git-send-email.perl index 3d8362b5f0..95a69d4c51 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -380,6 +380,24 @@ sub read_config { } } +sub config_regexp { + my ($regex) = @_; + my @ret; + eval { + @ret = Git::command( + 'config', + '--name-only', + '--get-regexp', + $regex, + ); + 1; + } or do { + # If we have no keys we're OK, otherwise re-throw + die $@ if $@->value != 1; + }; + return @ret; +} + # sendemail.identity yields to --identity. We must parse this # special-case first before the rest of the config is read. $identity = Git::config(@repo, "sendemail.identity"); @@ -478,7 +496,7 @@ sub read_config { usage(); } -if ($forbid_sendmail_variables && (scalar Git::config_regexp("^sendmail[.]")) != 0) { +if ($forbid_sendmail_variables && (scalar config_regexp("^sendmail[.]")) != 0) { die __("fatal: found configuration options for 'sendmail'\n" . "git-send-email is configured with the sendemail.* options - note the 'e'.\n" . "Set sendemail.forbidSendmailVariables to false to disable this check.\n"); -- 2.32.0.rc0.406.g05cb3eebfc
Reduce the time it takes git-send-email to get to even the most trivial of tasks (such as serving up its "-h" output) by first listing config keys that exist, and only then only call e.g. "git config --bool" on them if they do. Over a lot of runs this speeds the time to "-h" up for me from ~250ms to ~150ms, and the runtime of t9001-send-email.sh goes from ~25s to ~20s. This introduces a race condition where we'll do the "wrong" thing if a config key were to be inserted between us discovering the list and calling read_config(), i.e. we won't know about the racily added key. In theory this is a change in behavior, in practice it doesn't matter. The config_regexp() function being changed here was added in dd84e528a34 (git-send-email: die if sendmail.* config is set, 2020-07-23) for use by git-send-email. So we can change its odd return value in the case where no values are found by "git config". The difference in the *.pm code would matter if it was invoked in scalar context, but now it no longer is. Arguably this caching belongs in Git.pm itself, but in lieu of modifying it for all its callers let's only do this for "git send-email". The other big potential win would be "git svn", but unlike "git send-email" it doesn't check tens of config variables one at a time at startup (in my brief testing it doesn't check any). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 35 ++++++++++++++++++++++++++--------- 1 file changed, 26 insertions(+), 9 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 95a69d4c51..d294681cf3 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -337,11 +337,13 @@ sub signal_handler { # Read our sendemail.* config sub read_config { - my ($configured, $prefix) = @_; + my ($known_keys, $configured, $prefix) = @_; foreach my $setting (keys %config_bool_settings) { my $target = $config_bool_settings{$setting}; - my $v = Git::config_bool(@repo, "$prefix.$setting"); + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; + my $v = Git::config_bool(@repo, $key); next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -349,8 +351,10 @@ sub read_config { foreach my $setting (keys %config_path_settings) { my $target = $config_path_settings{$setting}; + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; if (ref($target) eq "ARRAY") { - my @values = Git::config_path(@repo, "$prefix.$setting"); + my @values = Git::config_path(@repo, $key); next unless @values; next if $configured->{$setting}++; @$target = @values; @@ -365,14 +369,16 @@ sub read_config { foreach my $setting (keys %config_settings) { my $target = $config_settings{$setting}; + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; if (ref($target) eq "ARRAY") { - my @values = Git::config(@repo, "$prefix.$setting"); + my @values = Git::config(@repo, $key); next unless @values; next if $configured->{$setting}++; @$target = @values; } else { - my $v = Git::config(@repo, "$prefix.$setting"); + my $v = Git::config(@repo, $key); next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -398,9 +404,20 @@ sub config_regexp { return @ret; } +# Save ourselves a lot of work of shelling out to 'git config' (it +# parses 'bool' etc.) by only doing so for config keys that exist. +my %known_config_keys; +{ + my @known_config_keys = config_regexp("^sende?mail[.]"); + @known_config_keys{@known_config_keys} = (); +} + # sendemail.identity yields to --identity. We must parse this # special-case first before the rest of the config is read. -$identity = Git::config(@repo, "sendemail.identity"); +{ + my $key = "sendemail.identity"; + $identity = Git::config(@repo, $key) if exists $known_config_keys{$key}; +} my $rc = GetOptions( "identity=s" => \$identity, "no-identity" => \$no_identity, @@ -411,8 +428,8 @@ sub config_regexp { # Now we know enough to read the config { my %configured; - read_config(\%configured, "sendemail.$identity") if defined $identity; - read_config(\%configured, "sendemail"); + read_config(\%known_config_keys, \%configured, "sendemail.$identity") if defined $identity; + read_config(\%known_config_keys, \%configured, "sendemail"); } # Begin by accumulating all the variables (defined above), that we will end up @@ -496,7 +513,7 @@ sub config_regexp { usage(); } -if ($forbid_sendmail_variables && (scalar config_regexp("^sendmail[.]")) != 0) { +if ($forbid_sendmail_variables && grep { /^sendmail/s } keys %known_config_keys) { die __("fatal: found configuration options for 'sendmail'\n" . "git-send-email is configured with the sendemail.* options - note the 'e'.\n" . "Set sendemail.forbidSendmailVariables to false to disable this check.\n"); -- 2.32.0.rc0.406.g05cb3eebfc
Optimize git-send-email by only shelling out to "git var" if we need to. This is easily done by re-inventing our own small version of perl's Memoize module. I suppose I could just use Memoize itself, but in a subsequent patch I'll be micro-optimizing send-email's use of dependencies. Using Memoize is a measly extra 5-10 milliseconds, but as we'll see that'll end up mattering for us in the end. This brings the runtime of a plain "send-email" from around ~160-170ms to ~140m-150ms. The runtime of the tests is around the same, or around ~20s. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index d294681cf3..56bd5b0e50 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -578,8 +578,18 @@ sub config_regexp { } my ($repoauthor, $repocommitter); -($repoauthor) = Git::ident_person(@repo, 'author'); -($repocommitter) = Git::ident_person(@repo, 'committer'); +{ + my %cache; + my ($author, $committer); + my $common = sub { + my ($what) = @_; + return $cache{$what} if exists $cache{$what}; + ($cache{$what}) = Git::ident_person(@repo, $what); + return $cache{$what}; + }; + $repoauthor = sub { $common->('author') }; + $repocommitter = sub { $common->('committer') }; +} sub parse_address_line { return map { $_->format } Mail::Address->parse($_[0]); @@ -767,7 +777,7 @@ sub get_patch_subject { or die sprintf(__("Failed to open for writing %s: %s"), $compose_filename, $!); - my $tpl_sender = $sender || $repoauthor || $repocommitter || ''; + my $tpl_sender = $sender || $repoauthor->() || $repocommitter->() || ''; my $tpl_subject = $initial_subject || ''; my $tpl_in_reply_to = $initial_in_reply_to || ''; my $tpl_reply_to = $reply_to || ''; @@ -973,7 +983,7 @@ sub file_declares_8bit_cte { $sender =~ s/^\s+|\s+$//g; ($sender) = expand_aliases($sender); } else { - $sender = $repoauthor || $repocommitter || ''; + $sender = $repoauthor->() || $repocommitter->() || ''; } # $sender could be an already sanitized address @@ -1122,7 +1132,7 @@ sub make_message_id { $uniq = "$message_id_stamp-$message_id_serial"; my $du_part; - for ($sender, $repocommitter, $repoauthor) { + for ($sender, $repocommitter->(), $repoauthor->()) { $du_part = extract_valid_address(sanitize_address($_)); last if (defined $du_part and $du_part ne ''); } -- 2.32.0.rc0.406.g05cb3eebfc
Change calls like "__ 'foo'" to "__('foo')" so the Perl compiler doesn't have to guess that "__" is a function. This makes the code more readable. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 56bd5b0e50..83f764a8bf 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -696,7 +696,7 @@ sub is_format_patch_arg { if (defined($format_patch)) { return $format_patch; } - die sprintf(__ <<EOF, $f, $f); + die sprintf(__(<<EOF), $f, $f); File '%s' exists but it could also be the range of commits to produce patches for. Please disambiguate by... @@ -782,7 +782,7 @@ sub get_patch_subject { my $tpl_in_reply_to = $initial_in_reply_to || ''; my $tpl_reply_to = $reply_to || ''; - print $c <<EOT1, Git::prefix_lines("GIT: ", __ <<EOT2), <<EOT3; + print $c <<EOT1, Git::prefix_lines("GIT: ", __(<<EOT2)), <<EOT3; From $tpl_sender # This line is ignored. EOT1 Lines beginning in "GIT:" will be removed. -- 2.32.0.rc0.406.g05cb3eebfc
Change indirect object syntax such as "new X ARGS" to "X->new(ARGS)". This allows perl to see what "new" is at compile-time without having loaded Term::ReadLine. This doesn't matter now, but will in a subsequent commit when we start lazily loading it. Let's do the same for the adjacent "FakeTerm" package for consistency, even though we're not going to conditionally load it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 83f764a8bf..f1542cb3ea 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -194,11 +194,11 @@ sub format_2822_time { my @repo = $repo ? ($repo) : (); my $term = eval { $ENV{"GIT_SEND_EMAIL_NOTTY"} - ? new Term::ReadLine 'git-send-email', \*STDIN, \*STDOUT - : new Term::ReadLine 'git-send-email'; + ? Term::ReadLine->new('git-send-email', \*STDIN, \*STDOUT) + : Term::ReadLine->new('git-send-email'); }; if ($@) { - $term = new FakeTerm "$@: going non-interactive"; + $term = FakeTerm->new("$@: going non-interactive"); } # Behavior modification variables -- 2.32.0.rc0.406.g05cb3eebfc
Optimize the time git-send-email takes to do even the simplest of things (such as serving up "-h") from around ~150ms to ~80ms-~90ms by lazily loading the modules it requires. Before this change Devel::TraceUse would report 99/97 used modules under NO_GETTEXT=[|Y], respectively. Now it's 52/37. It now takes ~15s to run t9001-send-email.sh, down from ~20s. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 71 +++++++++++++++++++++++++-------------------- 1 file changed, 39 insertions(+), 32 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index f1542cb3ea..1e9273fd4f 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -19,20 +19,10 @@ use 5.008; use strict; use warnings $ENV{GIT_PERL_FATAL_WARNINGS} ? qw(FATAL all) : (); -use POSIX qw/strftime/; -use Term::ReadLine; use Getopt::Long; -use Text::ParseWords; -use Term::ANSIColor; -use File::Temp qw/ tempdir tempfile /; -use File::Spec::Functions qw(catdir catfile); use Git::LoadCPAN::Error qw(:try); -use Cwd qw(abs_path cwd); use Git; use Git::I18N; -use Net::Domain (); -use Net::SMTP (); -use Git::LoadCPAN::Mail::Address; Getopt::Long::Configure qw/ pass_through /; @@ -166,7 +156,6 @@ sub format_2822_time { ); } -my $have_email_valid = eval { require Email::Valid; 1 }; my $smtp; my $auth; my $num_sent = 0; @@ -192,14 +181,6 @@ sub format_2822_time { my $repo = eval { Git->repository() }; my @repo = $repo ? ($repo) : (); -my $term = eval { - $ENV{"GIT_SEND_EMAIL_NOTTY"} - ? Term::ReadLine->new('git-send-email', \*STDIN, \*STDOUT) - : Term::ReadLine->new('git-send-email'); -}; -if ($@) { - $term = FakeTerm->new("$@: going non-interactive"); -} # Behavior modification variables my ($quiet, $dry_run) = (0, 0); @@ -309,9 +290,9 @@ sub do_edit { # Handle Uncouth Termination sub signal_handler { - # Make text normal - print color("reset"), "\n"; + require Term::ANSIColor; + print Term::ANSIColor::color("reset"), "\n"; # SMTP password masked system "stty echo"; @@ -592,11 +573,13 @@ sub config_regexp { } sub parse_address_line { + require Git::LoadCPAN::Mail::Address; return map { $_->format } Mail::Address->parse($_[0]); } sub split_addrs { - return quotewords('\s*,\s*', 1, @_); + require Text::ParseWords; + return Text::ParseWords::quotewords('\s*,\s*', 1, @_); } my %aliases; @@ -645,10 +628,11 @@ sub parse_sendmail_aliases { s/\\"/"/g foreach @addr; $aliases{$alias} = \@addr }}}, - mailrc => sub { my $fh = shift; while (<$fh>) { + mailrc => sub { my $fh = shift; while (<$fh>) { if (/^alias\s+(\S+)\s+(.*?)\s*$/) { + require Text::ParseWords; # spaces delimit multiple addresses - $aliases{$1} = [ quotewords('\s+', 0, $2) ]; + $aliases{$1} = [ Text::ParseWords::quotewords('\s+', 0, $2) ]; }}}, pine => sub { my $fh = shift; my $f='\t[^\t]*'; for (my $x = ''; defined($x); $x = $_) { @@ -720,7 +704,8 @@ sub is_format_patch_arg { opendir my $dh, $f or die sprintf(__("Failed to opendir %s: %s"), $f, $!); - push @files, grep { -f $_ } map { catfile($f, $_) } + require File::Spec; + push @files, grep { -f $_ } map { File::Spec->catfile($f, $_) } sort readdir $dh; closedir $dh; } elsif ((-f $f or -p $f) and !is_format_patch_arg($f)) { @@ -733,7 +718,8 @@ sub is_format_patch_arg { if (@rev_list_opts) { die __("Cannot run git format-patch from outside a repository\n") unless $repo; - push @files, $repo->command('format-patch', '-o', tempdir(CLEANUP => 1), @rev_list_opts); + require File::Temp; + push @files, $repo->command('format-patch', '-o', File::Temp::tempdir(CLEANUP => 1), @rev_list_opts); } @files = handle_backup_files(@files); @@ -770,9 +756,10 @@ sub get_patch_subject { if ($compose) { # Note that this does not need to be secure, but we will make a small # effort to have it be unique + require File::Temp; $compose_filename = ($repo ? - tempfile(".gitsendemail.msg.XXXXXX", DIR => $repo->repo_path()) : - tempfile(".gitsendemail.msg.XXXXXX", DIR => "."))[1]; + File::Temp::tempfile(".gitsendemail.msg.XXXXXX", DIR => $repo->repo_path()) : + File::Temp::tempfile(".gitsendemail.msg.XXXXXX", DIR => "."))[1]; open my $c, ">", $compose_filename or die sprintf(__("Failed to open for writing %s: %s"), $compose_filename, $!); @@ -879,6 +866,19 @@ sub get_patch_subject { do_edit(@files); } +sub term { + my $term = eval { + require Term::ReadLine; + $ENV{"GIT_SEND_EMAIL_NOTTY"} + ? Term::ReadLine->new('git-send-email', \*STDIN, \*STDOUT) + : Term::ReadLine->new('git-send-email'); + }; + if ($@) { + $term = FakeTerm->new("$@: going non-interactive"); + } + return $term; +} + sub ask { my ($prompt, %arg) = @_; my $valid_re = $arg{valid_re}; @@ -886,6 +886,7 @@ sub ask { my $confirm_only = $arg{confirm_only}; my $resp; my $i = 0; + my $term = term(); return defined $default ? $default : undef unless defined $term->IN and defined fileno($term->IN) and defined $term->OUT and defined fileno($term->OUT); @@ -1066,6 +1067,7 @@ sub extract_valid_address { return $address if ($address =~ /^($local_part_regexp)$/); $address =~ s/^\s*<(.*)>\s*$/$1/; + my $have_email_valid = eval { require Email::Valid; 1 }; if ($have_email_valid) { return scalar Email::Valid->address($address); } @@ -1125,7 +1127,8 @@ sub validate_address_list { sub make_message_id { my $uniq; if (!defined $message_id_stamp) { - $message_id_stamp = strftime("%Y%m%d%H%M%S.$$", gmtime(time)); + require POSIX; + $message_id_stamp = POSIX::strftime("%Y%m%d%H%M%S.$$", gmtime(time)); $message_id_serial = 0; } $message_id_serial++; @@ -1295,6 +1298,7 @@ sub valid_fqdn { sub maildomain_net { my $maildomain; + require Net::Domain; my $domain = Net::Domain::domainname(); $maildomain = $domain if valid_fqdn($domain); @@ -1305,6 +1309,7 @@ sub maildomain_mta { my $maildomain; for my $host (qw(mailhost localhost)) { + require Net::SMTP; my $smtp = Net::SMTP->new($host); if (defined $smtp) { my $domain = $smtp->domain; @@ -1983,13 +1988,15 @@ sub validate_patch { my ($fn, $xfer_encoding) = @_; if ($repo) { - my $validate_hook = catfile($repo->hooks_path(), + require File::Spec; + my $validate_hook = File::Spec->catfile($repo->hooks_path(), 'sendemail-validate'); my $hook_error; if (-x $validate_hook) { - my $target = abs_path($fn); + require Cwd; + my $target = Cwd::abs_path($fn); # The hook needs a correct cwd and GIT_DIR. - my $cwd_save = cwd(); + my $cwd_save = Cwd::cwd(); chdir($repo->wc_path() or $repo->repo_path()) or die("chdir: $!"); local $ENV{"GIT_DIR"} = $repo->repo_path(); -- 2.32.0.rc0.406.g05cb3eebfc
Instead of unconditionally requiring modules such as File::Spec, let's only load them when needed. This speeds up code that only needs a subset of the features Git.pm provides. This brings a plain invocation of "git send-email" down from 52/37 loaded modules under NO_GETTEXT=[|Y] to 39/18, and it now takes ~60-~70ms instead of ~80-~90ms. The runtime of t9001-send-email.sh test is down to ~13s from ~15s. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- perl/Git.pm | 35 ++++++++++++++++++----------------- 1 file changed, 18 insertions(+), 17 deletions(-) diff --git a/perl/Git.pm b/perl/Git.pm index 73ebbf80cc..4d048f307b 100644 --- a/perl/Git.pm +++ b/perl/Git.pm @@ -11,9 +11,6 @@ package Git; use strict; use warnings $ENV{GIT_PERL_FATAL_WARNINGS} ? qw(FATAL all) : (); -use File::Temp (); -use File::Spec (); - BEGIN { our ($VERSION, @ISA, @EXPORT, @EXPORT_OK); @@ -103,12 +100,9 @@ =head1 DESCRIPTION =cut -use Carp qw(carp croak); # but croak is bad - throw instead +sub carp { require Carp; goto &Carp::carp } +sub croak { require Carp; goto &Carp::croak } use Git::LoadCPAN::Error qw(:try); -use Cwd qw(abs_path cwd); -use IPC::Open2 qw(open2); -use Fcntl qw(SEEK_SET SEEK_CUR); -use Time::Local qw(timegm); } @@ -191,13 +185,15 @@ sub repository { $dir = undef; }; + require Cwd; if ($dir) { + require File::Spec; File::Spec->file_name_is_absolute($dir) or $dir = $opts{Directory} . '/' . $dir; - $opts{Repository} = abs_path($dir); + $opts{Repository} = Cwd::abs_path($dir); # If --git-dir went ok, this shouldn't die either. my $prefix = $search->command_oneline('rev-parse', '--show-prefix'); - $dir = abs_path($opts{Directory}) . '/'; + $dir = Cwd::abs_path($opts{Directory}) . '/'; if ($prefix) { if (substr($dir, -length($prefix)) ne $prefix) { throw Error::Simple("rev-parse confused me - $dir does not have trailing $prefix"); @@ -223,7 +219,7 @@ sub repository { throw Error::Simple("fatal: Not a git repository: $dir"); } - $opts{Repository} = abs_path($dir); + $opts{Repository} = Cwd::abs_path($dir); } delete $opts{Directory}; @@ -408,10 +404,12 @@ sub command_bidi_pipe { my $cwd_save = undef; if ($self) { shift; - $cwd_save = cwd(); + require Cwd; + $cwd_save = Cwd::cwd(); _setup_git_cmd_env($self); } - $pid = open2($in, $out, 'git', @_); + require IPC::Open2; + $pid = IPC::Open2::open2($in, $out, 'git', @_); chdir($cwd_save) if $cwd_save; return ($pid, $in, $out, join(' ', @_)); } @@ -538,7 +536,8 @@ sub get_tz_offset { my $t = shift || time; my @t = localtime($t); $t[5] += 1900; - my $gm = timegm(@t); + require Time::Local; + my $gm = Time::Local::timegm(@t); my $sign = qw( + + - )[ $gm <=> $t ]; return sprintf("%s%02d%02d", $sign, (gmtime(abs($t - $gm)))[2,1]); } @@ -629,7 +628,8 @@ sub hooks_path { my ($self) = @_; my $dir = $self->command_oneline('rev-parse', '--git-path', 'hooks'); - my $abs = abs_path($dir); + require Cwd; + my $abs = Cwd::abs_path($dir); return $abs; } @@ -1353,6 +1353,7 @@ sub _temp_cache { my $n = $name; $n =~ s/\W/_/g; # no strange chars + require File::Temp; ($$temp_fd, $fname) = File::Temp::tempfile( "Git_${n}_XXXXXX", UNLINK => 1, DIR => $tmpdir, ) or throw Error::Simple("couldn't open new temp file"); @@ -1375,9 +1376,9 @@ sub temp_reset { truncate $temp_fd, 0 or throw Error::Simple("couldn't truncate file"); - sysseek($temp_fd, 0, SEEK_SET) and seek($temp_fd, 0, SEEK_SET) + sysseek($temp_fd, 0, Fcntl::SEEK_SET()) and seek($temp_fd, 0, Fcntl::SEEK_SET()) or throw Error::Simple("couldn't seek to beginning of file"); - sysseek($temp_fd, 0, SEEK_CUR) == 0 and tell($temp_fd) == 0 + sysseek($temp_fd, 0, Fcntl::SEEK_CUR()) == 0 and tell($temp_fd) == 0 or throw Error::Simple("expected file position to be reset"); } -- 2.32.0.rc0.406.g05cb3eebfc
Optimize the startup time of git-send-email by using an amended config_regexp() function to retrieve the list of config keys and values we're interested in. For boolean keys we can handle the [true|false] case ourselves, and the "--get" case didn't need any parsing. Let's leave "--path" and other "--bool" cases to "git config". I'm not bothering with the "undef" or "" case (true and false, respectively), let's just punt on those and others and have "git config --type=bool" handle it. This brings the runtime of "git send-email" from ~60-~70ms to a very steady ~40ms on my test box. We now run just one "git config" invocation on startup instead of 8, the exact number will differ based on the local sendemail.* config. I happen to have 8 of those set. This brings the runtime of t9001-send-email.sh from ~13s down to ~12s for me. The change there is less impressive as many of those tests set various config values, and we're also getting to the point of diminishing returns for optimizing "git send-email" itself. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> diff --git a/git-send-email.perl b/git-send-email.perl index 1e9273fd4f..1ea4d9589d 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -324,7 +324,11 @@ sub read_config { my $target = $config_bool_settings{$setting}; my $key = "$prefix.$setting"; next unless exists $known_keys->{$key}; - my $v = Git::config_bool(@repo, $key); + my $v = (@{$known_keys->{$key}} == 1 && + (defined $known_keys->{$key}->[0] && + $known_keys->{$key}->[0] =~ /^(?:true|false)$/s)) + ? $known_keys->{$key}->[0] eq 'true' + : Git::config_bool(@repo, $key); next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -353,14 +357,12 @@ sub read_config { my $key = "$prefix.$setting"; next unless exists $known_keys->{$key}; if (ref($target) eq "ARRAY") { - my @values = Git::config(@repo, $key); - next unless @values; + my @values = @{$known_keys->{$key}}; next if $configured->{$setting}++; @$target = @values; } else { - my $v = Git::config(@repo, $key); - next unless defined $v; + my $v = $known_keys->{$key}->[0]; next if $configured->{$setting}++; $$target = $v; } @@ -371,12 +373,19 @@ sub config_regexp { my ($regex) = @_; my @ret; eval { - @ret = Git::command( + my $ret = Git::command( 'config', - '--name-only', + '--null', '--get-regexp', $regex, ); + @ret = map { + # We must always return ($k, $v) here, since + # empty config values will be just "key\0", + # not "key\nvalue\0". + my ($k, $v) = split /\n/, $_, 2; + ($k, $v); + } split /\0/, $ret; 1; } or do { # If we have no keys we're OK, otherwise re-throw @@ -389,8 +398,10 @@ sub config_regexp { # parses 'bool' etc.) by only doing so for config keys that exist. my %known_config_keys; { - my @known_config_keys = config_regexp("^sende?mail[.]"); - @known_config_keys{@known_config_keys} = (); + my @kv = config_regexp("^sende?mail[.]"); + while (my ($k, $v) = splice @kv, 0, 2) { + push @{$known_config_keys{$k}} => $v; + } } # sendemail.identity yields to --identity. We must parse this --- git-send-email.perl | 29 ++++++++++++++++++++--------- 1 file changed, 20 insertions(+), 9 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 1e9273fd4f..1ea4d9589d 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -324,7 +324,11 @@ sub read_config { my $target = $config_bool_settings{$setting}; my $key = "$prefix.$setting"; next unless exists $known_keys->{$key}; - my $v = Git::config_bool(@repo, $key); + my $v = (@{$known_keys->{$key}} == 1 && + (defined $known_keys->{$key}->[0] && + $known_keys->{$key}->[0] =~ /^(?:true|false)$/s)) + ? $known_keys->{$key}->[0] eq 'true' + : Git::config_bool(@repo, $key); next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -353,14 +357,12 @@ sub read_config { my $key = "$prefix.$setting"; next unless exists $known_keys->{$key}; if (ref($target) eq "ARRAY") { - my @values = Git::config(@repo, $key); - next unless @values; + my @values = @{$known_keys->{$key}}; next if $configured->{$setting}++; @$target = @values; } else { - my $v = Git::config(@repo, $key); - next unless defined $v; + my $v = $known_keys->{$key}->[0]; next if $configured->{$setting}++; $$target = $v; } @@ -371,12 +373,19 @@ sub config_regexp { my ($regex) = @_; my @ret; eval { - @ret = Git::command( + my $ret = Git::command( 'config', - '--name-only', + '--null', '--get-regexp', $regex, ); + @ret = map { + # We must always return ($k, $v) here, since + # empty config values will be just "key\0", + # not "key\nvalue\0". + my ($k, $v) = split /\n/, $_, 2; + ($k, $v); + } split /\0/, $ret; 1; } or do { # If we have no keys we're OK, otherwise re-throw @@ -389,8 +398,10 @@ sub config_regexp { # parses 'bool' etc.) by only doing so for config keys that exist. my %known_config_keys; { - my @known_config_keys = config_regexp("^sende?mail[.]"); - @known_config_keys{@known_config_keys} = (); + my @kv = config_regexp("^sende?mail[.]"); + while (my ($k, $v) = splice @kv, 0, 2) { + push @{$known_config_keys{$k}} => $v; + } } # sendemail.identity yields to --identity. We must parse this -- 2.32.0.rc0.406.g05cb3eebfc
It has been pointed out[1] that cwd() invokes "pwd(1)" while getcwd() is a Perl-native XS function. For what we're using these for we can use getcwd(). The performance difference is miniscule, we're saving on the order of a millisecond or so, see [2] below for the benchmark. I don't think this matters in practice for optimizing git-send-email or perl execution (unlike the patches leading up to this one). But let's do it regardless of that, if only so we don't have to think about this as a low-hanging fruit anymore. 1. https://lore.kernel.org/git/20210512180517.GA11354@dcvr/ 2. $ perl -MBenchmark=:all -MCwd -wE 'cmpthese(10000, { getcwd => sub { getcwd }, cwd => sub { cwd }, pwd => sub { system "pwd >/dev/null" }})' (warning: too few iterations for a reliable count) Rate pwd cwd getcwd pwd 982/s -- -48% -100% cwd 1890/s 92% -- -100% getcwd 10000000000000000000/s 1018000000000000000% 529000000000000064% - Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 2 +- perl/Git.pm | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 1ea4d9589d..c6af05e8a3 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -2007,7 +2007,7 @@ sub validate_patch { require Cwd; my $target = Cwd::abs_path($fn); # The hook needs a correct cwd and GIT_DIR. - my $cwd_save = Cwd::cwd(); + my $cwd_save = Cwd::getcwd(); chdir($repo->wc_path() or $repo->repo_path()) or die("chdir: $!"); local $ENV{"GIT_DIR"} = $repo->repo_path(); diff --git a/perl/Git.pm b/perl/Git.pm index 4d048f307b..863bd80694 100644 --- a/perl/Git.pm +++ b/perl/Git.pm @@ -405,7 +405,7 @@ sub command_bidi_pipe { if ($self) { shift; require Cwd; - $cwd_save = Cwd::cwd(); + $cwd_save = Cwd::getcwd(); _setup_git_cmd_env($self); } require IPC::Open2; -- 2.32.0.rc0.406.g05cb3eebfc
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> A re-roll of v2[1] which fixes issues pointed out with v2, plus some
> others I noticed along the way.
"move trivial" step does not seem to apply to master, next, or jch
and it does not seem to apply to master...ab/send-email-optim
either.
On top of what commit were these patches designed to be applied?
Thanks.
This v4 fixes an issue in v3 where 12/13 had a "diff --git" as part of the commit message (mistake during rebase/squash), which confused "git am" in trying to apply a diff twice. See <xmqqwnrplyns.fsf@gitster.g>. Ævar Arnfjörð Bjarmason (13): send-email tests: support GIT_TEST_PERL_FATAL_WARNINGS=true send-email tests: test for boolean variables without a value send-email: remove non-working support for "sendemail.smtpssl" send-email: refactor sendemail.smtpencryption config parsing send-email: copy "config_regxp" into git-send-email.perl send-email: lazily load config for a big speedup send-email: lazily shell out to "git var" send-email: use function syntax instead of barewords send-email: get rid of indirect object syntax send-email: lazily load modules for a big speedup perl: lazily load some common Git.pm setup code send-email: move trivial config handling to Perl perl: nano-optimize by replacing Cwd::cwd() with Cwd::getcwd() Documentation/config/sendemail.txt | 3 - git-send-email.perl | 174 +++++++++++++++++++---------- perl/Git.pm | 35 +++--- t/t9001-send-email.sh | 29 +++++ 4 files changed, 160 insertions(+), 81 deletions(-) Range-diff against v3: 1: 71f890dc603 = 1: 7140847367c send-email tests: support GIT_TEST_PERL_FATAL_WARNINGS=true 2: 707c2ca5563 = 2: d27f3b48f85 send-email tests: test for boolean variables without a value 3: 3bbd48dab23 = 3: a7a21b75f2e send-email: remove non-working support for "sendemail.smtpssl" 4: bed0f98d681 = 4: 7356a528589 send-email: refactor sendemail.smtpencryption config parsing 5: c12f69a4110 = 5: cce0f89143b send-email: copy "config_regxp" into git-send-email.perl 6: d1c233d2515 = 6: 8afe8661761 send-email: lazily load config for a big speedup 7: 4326c2f99c1 = 7: 491eefde6a2 send-email: lazily shell out to "git var" 8: e1fc71e3f90 = 8: 860156013f8 send-email: use function syntax instead of barewords 9: a806ce06f18 = 9: dd24f1249f5 send-email: get rid of indirect object syntax 10: aa11439789d = 10: 61e3e3c93c5 send-email: lazily load modules for a big speedup 11: b3b342b173b = 11: ada34374286 perl: lazily load some common Git.pm setup code 12: 950dc0f53dd ! 12: 3818000bfba send-email: move trivial config handling to Perl @@ Commit message Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> - diff --git a/git-send-email.perl b/git-send-email.perl - index 1e9273fd4f..1ea4d9589d 100755 - --- a/git-send-email.perl - +++ b/git-send-email.perl - @@ -324,7 +324,11 @@ sub read_config { - my $target = $config_bool_settings{$setting}; - my $key = "$prefix.$setting"; - next unless exists $known_keys->{$key}; - - my $v = Git::config_bool(@repo, $key); - + my $v = (@{$known_keys->{$key}} == 1 && - + (defined $known_keys->{$key}->[0] && - + $known_keys->{$key}->[0] =~ /^(?:true|false)$/s)) - + ? $known_keys->{$key}->[0] eq 'true' - + : Git::config_bool(@repo, $key); - next unless defined $v; - next if $configured->{$setting}++; - $$target = $v; - @@ -353,14 +357,12 @@ sub read_config { - my $key = "$prefix.$setting"; - next unless exists $known_keys->{$key}; - if (ref($target) eq "ARRAY") { - - my @values = Git::config(@repo, $key); - - next unless @values; - + my @values = @{$known_keys->{$key}}; - next if $configured->{$setting}++; - @$target = @values; - } - else { - - my $v = Git::config(@repo, $key); - - next unless defined $v; - + my $v = $known_keys->{$key}->[0]; - next if $configured->{$setting}++; - $$target = $v; - } - @@ -371,12 +373,19 @@ sub config_regexp { - my ($regex) = @_; - my @ret; - eval { - - @ret = Git::command( - + my $ret = Git::command( - 'config', - - '--name-only', - + '--null', - '--get-regexp', - $regex, - ); - + @ret = map { - + # We must always return ($k, $v) here, since - + # empty config values will be just "key\0", - + # not "key\nvalue\0". - + my ($k, $v) = split /\n/, $_, 2; - + ($k, $v); - + } split /\0/, $ret; - 1; - } or do { - # If we have no keys we're OK, otherwise re-throw - @@ -389,8 +398,10 @@ sub config_regexp { - # parses 'bool' etc.) by only doing so for config keys that exist. - my %known_config_keys; - { - - my @known_config_keys = config_regexp("^sende?mail[.]"); - - @known_config_keys{@known_config_keys} = (); - + my @kv = config_regexp("^sende?mail[.]"); - + while (my ($k, $v) = splice @kv, 0, 2) { - + push @{$known_config_keys{$k}} => $v; - + } - } - - # sendemail.identity yields to --identity. We must parse this - ## git-send-email.perl ## @@ git-send-email.perl: sub read_config { my $target = $config_bool_settings{$setting}; 13: c1d7ea664ac = 13: d36b57e429f perl: nano-optimize by replacing Cwd::cwd() with Cwd::getcwd() -- 2.32.0.rc1.385.g46e826f1e55
Add support for the "GIT_TEST_PERL_FATAL_WARNINGS=true" test mode to "send-email". This was added to e.g. git-svn in 5338ed2b26 (perl: check for perl warnings while running tests, 2020-10-21), but not "send-email". Let's rectify that. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/git-send-email.perl b/git-send-email.perl index 175da07d946..cffdfdacfb9 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -18,7 +18,7 @@ use 5.008; use strict; -use warnings; +use warnings $ENV{GIT_PERL_FATAL_WARNINGS} ? qw(FATAL all) : (); use POSIX qw/strftime/; use Term::ReadLine; use Getopt::Long; -- 2.32.0.rc1.385.g46e826f1e55
The Git.pm code does its own Perl-ifying of boolean variables, let's ensure that empty values = true for boolean variables, as in the C code. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- t/t9001-send-email.sh | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh index 65b30353719..6bf79c816a6 100755 --- a/t/t9001-send-email.sh +++ b/t/t9001-send-email.sh @@ -1349,6 +1349,16 @@ test_expect_success $PREREQ 'sendemail.identity: bool variable fallback' ' ! grep "X-Mailer" stdout ' +test_expect_success $PREREQ 'sendemail.identity: bool variable without a value' ' + git -c sendemail.xmailer \ + send-email \ + --dry-run \ + --from="nobody@example.com" \ + $patches >stdout && + grep "To: default@example.com" stdout && + grep "X-Mailer" stdout +' + test_expect_success $PREREQ '--no-to overrides sendemail.to' ' git send-email \ --dry-run \ @@ -2073,6 +2083,18 @@ test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer=true' ' do_xmailer_test 1 "--xmailer" ' +test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer' ' + test_when_finished "test_unconfig sendemail.xmailer" && + cat >>.git/config <<-\EOF && + [sendemail] + xmailer + EOF + test_config sendemail.xmailer true && + do_xmailer_test 1 "" && + do_xmailer_test 0 "--no-xmailer" && + do_xmailer_test 1 "--xmailer" +' + test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer=false' ' test_config sendemail.xmailer false && do_xmailer_test 0 "" && @@ -2080,6 +2102,13 @@ test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer=false' ' do_xmailer_test 1 "--xmailer" ' +test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer=' ' + test_config sendemail.xmailer "" && + do_xmailer_test 0 "" && + do_xmailer_test 0 "--no-xmailer" && + do_xmailer_test 1 "--xmailer" +' + test_expect_success $PREREQ 'setup expected-list' ' git send-email \ --dry-run \ -- 2.32.0.rc1.385.g46e826f1e55
Remove the already dead code to support "sendemail.smtpssl" by finally removing the dead code supporting the configuration option. In f6bebd121ac (git-send-email: add support for TLS via Net::SMTP::SSL, 2008-06-25) the --smtp-ssl command-line option was documented as deprecated, later in 65180c66186 (List send-email config options in config.txt., 2009-07-22) the "sendemail.smtpssl" configuration option was also documented as such. Then in in 3ff15040e22 (send-email: fix regression in sendemail.identity parsing, 2019-05-17) I unintentionally removed support for it by introducing a bug in read_config(). As can be seen from the diff context we've already returned unless $enc i defined, so it's not possible for us to reach the "elsif" branch here. This code was therefore already dead since Git v2.23.0. So let's just remove it. We were already 11 years into a stated deprecation period of this variable when 3ff15040e22 landed, now it's around 13. Since it hasn't worked anyway for around 2 years it looks like we can safely remove it. The --smtp-ssl option is still deprecated, if someone cares they can follow-up and remove that too, but unlike the config option that one could still be in use in the wild. I'm just removing this code that's provably unused already. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- Documentation/config/sendemail.txt | 3 --- git-send-email.perl | 6 +----- 2 files changed, 1 insertion(+), 8 deletions(-) diff --git a/Documentation/config/sendemail.txt b/Documentation/config/sendemail.txt index cbc5af42fdf..50baa5d6bfb 100644 --- a/Documentation/config/sendemail.txt +++ b/Documentation/config/sendemail.txt @@ -8,9 +8,6 @@ sendemail.smtpEncryption:: See linkgit:git-send-email[1] for description. Note that this setting is not subject to the 'identity' mechanism. -sendemail.smtpssl (deprecated):: - Deprecated alias for 'sendemail.smtpEncryption = ssl'. - sendemail.smtpsslcertpath:: Path to ca-certificates (either a directory or a single file). Set it to an empty string to disable certificate verification. diff --git a/git-send-email.perl b/git-send-email.perl index cffdfdacfb9..24d73df5a3e 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -383,11 +383,7 @@ sub read_config { my $enc = Git::config(@repo, $setting); return unless defined $enc; return if $configured->{$setting}++; - if (defined $enc) { - $smtp_encryption = $enc; - } elsif (Git::config_bool(@repo, "$prefix.smtpssl")) { - $smtp_encryption = 'ssl'; - } + $smtp_encryption = $enc; } } -- 2.32.0.rc1.385.g46e826f1e55
With the removal of the support for sendemail.smtpssl in the preceding commit the parsing of sendemail.smtpencryption is no longer special, and can by moved to %config_settings. This gets us rid of an unconditional call to Git::config(), which as we'll see in subsequent commits matters for startup performance. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 24d73df5a3e..3d8362b5f07 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -277,6 +277,7 @@ sub do_edit { ); my %config_settings = ( + "smtpencryption" => \$smtp_encryption, "smtpserver" => \$smtp_server, "smtpserverport" => \$smtp_server_port, "smtpserveroption" => \@smtp_server_options, @@ -377,14 +378,6 @@ sub read_config { $$target = $v; } } - - if (!defined $smtp_encryption) { - my $setting = "$prefix.smtpencryption"; - my $enc = Git::config(@repo, $setting); - return unless defined $enc; - return if $configured->{$setting}++; - $smtp_encryption = $enc; - } } # sendemail.identity yields to --identity. We must parse this -- 2.32.0.rc1.385.g46e826f1e55
The config_regexp() function was added in dd84e528a3 (git-send-email: die if sendmail.* config is set, 2020-07-23) for use in git-send-email, and it's the only in-tree user of it. However, the consensus is that Git.pm is a public interface, so even though it's a recently added function we can't change it. So let's copy over a minimal version of it to git-send-email.perl itself. In a subsequent commit it'll be changed further for our own use. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/git-send-email.perl b/git-send-email.perl index 3d8362b5f07..95a69d4c51f 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -380,6 +380,24 @@ sub read_config { } } +sub config_regexp { + my ($regex) = @_; + my @ret; + eval { + @ret = Git::command( + 'config', + '--name-only', + '--get-regexp', + $regex, + ); + 1; + } or do { + # If we have no keys we're OK, otherwise re-throw + die $@ if $@->value != 1; + }; + return @ret; +} + # sendemail.identity yields to --identity. We must parse this # special-case first before the rest of the config is read. $identity = Git::config(@repo, "sendemail.identity"); @@ -478,7 +496,7 @@ sub read_config { usage(); } -if ($forbid_sendmail_variables && (scalar Git::config_regexp("^sendmail[.]")) != 0) { +if ($forbid_sendmail_variables && (scalar config_regexp("^sendmail[.]")) != 0) { die __("fatal: found configuration options for 'sendmail'\n" . "git-send-email is configured with the sendemail.* options - note the 'e'.\n" . "Set sendemail.forbidSendmailVariables to false to disable this check.\n"); -- 2.32.0.rc1.385.g46e826f1e55
Reduce the time it takes git-send-email to get to even the most trivial of tasks (such as serving up its "-h" output) by first listing config keys that exist, and only then only call e.g. "git config --bool" on them if they do. Over a lot of runs this speeds the time to "-h" up for me from ~250ms to ~150ms, and the runtime of t9001-send-email.sh goes from ~25s to ~20s. This introduces a race condition where we'll do the "wrong" thing if a config key were to be inserted between us discovering the list and calling read_config(), i.e. we won't know about the racily added key. In theory this is a change in behavior, in practice it doesn't matter. The config_regexp() function being changed here was added in dd84e528a34 (git-send-email: die if sendmail.* config is set, 2020-07-23) for use by git-send-email. So we can change its odd return value in the case where no values are found by "git config". The difference in the *.pm code would matter if it was invoked in scalar context, but now it no longer is. Arguably this caching belongs in Git.pm itself, but in lieu of modifying it for all its callers let's only do this for "git send-email". The other big potential win would be "git svn", but unlike "git send-email" it doesn't check tens of config variables one at a time at startup (in my brief testing it doesn't check any). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 35 ++++++++++++++++++++++++++--------- 1 file changed, 26 insertions(+), 9 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 95a69d4c51f..d294681cf39 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -337,11 +337,13 @@ sub signal_handler { # Read our sendemail.* config sub read_config { - my ($configured, $prefix) = @_; + my ($known_keys, $configured, $prefix) = @_; foreach my $setting (keys %config_bool_settings) { my $target = $config_bool_settings{$setting}; - my $v = Git::config_bool(@repo, "$prefix.$setting"); + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; + my $v = Git::config_bool(@repo, $key); next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -349,8 +351,10 @@ sub read_config { foreach my $setting (keys %config_path_settings) { my $target = $config_path_settings{$setting}; + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; if (ref($target) eq "ARRAY") { - my @values = Git::config_path(@repo, "$prefix.$setting"); + my @values = Git::config_path(@repo, $key); next unless @values; next if $configured->{$setting}++; @$target = @values; @@ -365,14 +369,16 @@ sub read_config { foreach my $setting (keys %config_settings) { my $target = $config_settings{$setting}; + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; if (ref($target) eq "ARRAY") { - my @values = Git::config(@repo, "$prefix.$setting"); + my @values = Git::config(@repo, $key); next unless @values; next if $configured->{$setting}++; @$target = @values; } else { - my $v = Git::config(@repo, "$prefix.$setting"); + my $v = Git::config(@repo, $key); next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -398,9 +404,20 @@ sub config_regexp { return @ret; } +# Save ourselves a lot of work of shelling out to 'git config' (it +# parses 'bool' etc.) by only doing so for config keys that exist. +my %known_config_keys; +{ + my @known_config_keys = config_regexp("^sende?mail[.]"); + @known_config_keys{@known_config_keys} = (); +} + # sendemail.identity yields to --identity. We must parse this # special-case first before the rest of the config is read. -$identity = Git::config(@repo, "sendemail.identity"); +{ + my $key = "sendemail.identity"; + $identity = Git::config(@repo, $key) if exists $known_config_keys{$key}; +} my $rc = GetOptions( "identity=s" => \$identity, "no-identity" => \$no_identity, @@ -411,8 +428,8 @@ sub config_regexp { # Now we know enough to read the config { my %configured; - read_config(\%configured, "sendemail.$identity") if defined $identity; - read_config(\%configured, "sendemail"); + read_config(\%known_config_keys, \%configured, "sendemail.$identity") if defined $identity; + read_config(\%known_config_keys, \%configured, "sendemail"); } # Begin by accumulating all the variables (defined above), that we will end up @@ -496,7 +513,7 @@ sub config_regexp { usage(); } -if ($forbid_sendmail_variables && (scalar config_regexp("^sendmail[.]")) != 0) { +if ($forbid_sendmail_variables && grep { /^sendmail/s } keys %known_config_keys) { die __("fatal: found configuration options for 'sendmail'\n" . "git-send-email is configured with the sendemail.* options - note the 'e'.\n" . "Set sendemail.forbidSendmailVariables to false to disable this check.\n"); -- 2.32.0.rc1.385.g46e826f1e55
Optimize git-send-email by only shelling out to "git var" if we need to. This is easily done by re-inventing our own small version of perl's Memoize module. I suppose I could just use Memoize itself, but in a subsequent patch I'll be micro-optimizing send-email's use of dependencies. Using Memoize is a measly extra 5-10 milliseconds, but as we'll see that'll end up mattering for us in the end. This brings the runtime of a plain "send-email" from around ~160-170ms to ~140m-150ms. The runtime of the tests is around the same, or around ~20s. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index d294681cf39..56bd5b0e50c 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -578,8 +578,18 @@ sub config_regexp { } my ($repoauthor, $repocommitter); -($repoauthor) = Git::ident_person(@repo, 'author'); -($repocommitter) = Git::ident_person(@repo, 'committer'); +{ + my %cache; + my ($author, $committer); + my $common = sub { + my ($what) = @_; + return $cache{$what} if exists $cache{$what}; + ($cache{$what}) = Git::ident_person(@repo, $what); + return $cache{$what}; + }; + $repoauthor = sub { $common->('author') }; + $repocommitter = sub { $common->('committer') }; +} sub parse_address_line { return map { $_->format } Mail::Address->parse($_[0]); @@ -767,7 +777,7 @@ sub get_patch_subject { or die sprintf(__("Failed to open for writing %s: %s"), $compose_filename, $!); - my $tpl_sender = $sender || $repoauthor || $repocommitter || ''; + my $tpl_sender = $sender || $repoauthor->() || $repocommitter->() || ''; my $tpl_subject = $initial_subject || ''; my $tpl_in_reply_to = $initial_in_reply_to || ''; my $tpl_reply_to = $reply_to || ''; @@ -973,7 +983,7 @@ sub file_declares_8bit_cte { $sender =~ s/^\s+|\s+$//g; ($sender) = expand_aliases($sender); } else { - $sender = $repoauthor || $repocommitter || ''; + $sender = $repoauthor->() || $repocommitter->() || ''; } # $sender could be an already sanitized address @@ -1122,7 +1132,7 @@ sub make_message_id { $uniq = "$message_id_stamp-$message_id_serial"; my $du_part; - for ($sender, $repocommitter, $repoauthor) { + for ($sender, $repocommitter->(), $repoauthor->()) { $du_part = extract_valid_address(sanitize_address($_)); last if (defined $du_part and $du_part ne ''); } -- 2.32.0.rc1.385.g46e826f1e55
Change calls like "__ 'foo'" to "__('foo')" so the Perl compiler doesn't have to guess that "__" is a function. This makes the code more readable. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 56bd5b0e50c..83f764a8bf9 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -696,7 +696,7 @@ sub is_format_patch_arg { if (defined($format_patch)) { return $format_patch; } - die sprintf(__ <<EOF, $f, $f); + die sprintf(__(<<EOF), $f, $f); File '%s' exists but it could also be the range of commits to produce patches for. Please disambiguate by... @@ -782,7 +782,7 @@ sub get_patch_subject { my $tpl_in_reply_to = $initial_in_reply_to || ''; my $tpl_reply_to = $reply_to || ''; - print $c <<EOT1, Git::prefix_lines("GIT: ", __ <<EOT2), <<EOT3; + print $c <<EOT1, Git::prefix_lines("GIT: ", __(<<EOT2)), <<EOT3; From $tpl_sender # This line is ignored. EOT1 Lines beginning in "GIT:" will be removed. -- 2.32.0.rc1.385.g46e826f1e55
Change indirect object syntax such as "new X ARGS" to "X->new(ARGS)". This allows perl to see what "new" is at compile-time without having loaded Term::ReadLine. This doesn't matter now, but will in a subsequent commit when we start lazily loading it. Let's do the same for the adjacent "FakeTerm" package for consistency, even though we're not going to conditionally load it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 83f764a8bf9..f1542cb3ea6 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -194,11 +194,11 @@ sub format_2822_time { my @repo = $repo ? ($repo) : (); my $term = eval { $ENV{"GIT_SEND_EMAIL_NOTTY"} - ? new Term::ReadLine 'git-send-email', \*STDIN, \*STDOUT - : new Term::ReadLine 'git-send-email'; + ? Term::ReadLine->new('git-send-email', \*STDIN, \*STDOUT) + : Term::ReadLine->new('git-send-email'); }; if ($@) { - $term = new FakeTerm "$@: going non-interactive"; + $term = FakeTerm->new("$@: going non-interactive"); } # Behavior modification variables -- 2.32.0.rc1.385.g46e826f1e55
Optimize the time git-send-email takes to do even the simplest of things (such as serving up "-h") from around ~150ms to ~80ms-~90ms by lazily loading the modules it requires. Before this change Devel::TraceUse would report 99/97 used modules under NO_GETTEXT=[|Y], respectively. Now it's 52/37. It now takes ~15s to run t9001-send-email.sh, down from ~20s. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 71 +++++++++++++++++++++++++-------------------- 1 file changed, 39 insertions(+), 32 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index f1542cb3ea6..1e9273fd4f5 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -19,20 +19,10 @@ use 5.008; use strict; use warnings $ENV{GIT_PERL_FATAL_WARNINGS} ? qw(FATAL all) : (); -use POSIX qw/strftime/; -use Term::ReadLine; use Getopt::Long; -use Text::ParseWords; -use Term::ANSIColor; -use File::Temp qw/ tempdir tempfile /; -use File::Spec::Functions qw(catdir catfile); use Git::LoadCPAN::Error qw(:try); -use Cwd qw(abs_path cwd); use Git; use Git::I18N; -use Net::Domain (); -use Net::SMTP (); -use Git::LoadCPAN::Mail::Address; Getopt::Long::Configure qw/ pass_through /; @@ -166,7 +156,6 @@ sub format_2822_time { ); } -my $have_email_valid = eval { require Email::Valid; 1 }; my $smtp; my $auth; my $num_sent = 0; @@ -192,14 +181,6 @@ sub format_2822_time { my $repo = eval { Git->repository() }; my @repo = $repo ? ($repo) : (); -my $term = eval { - $ENV{"GIT_SEND_EMAIL_NOTTY"} - ? Term::ReadLine->new('git-send-email', \*STDIN, \*STDOUT) - : Term::ReadLine->new('git-send-email'); -}; -if ($@) { - $term = FakeTerm->new("$@: going non-interactive"); -} # Behavior modification variables my ($quiet, $dry_run) = (0, 0); @@ -309,9 +290,9 @@ sub do_edit { # Handle Uncouth Termination sub signal_handler { - # Make text normal - print color("reset"), "\n"; + require Term::ANSIColor; + print Term::ANSIColor::color("reset"), "\n"; # SMTP password masked system "stty echo"; @@ -592,11 +573,13 @@ sub config_regexp { } sub parse_address_line { + require Git::LoadCPAN::Mail::Address; return map { $_->format } Mail::Address->parse($_[0]); } sub split_addrs { - return quotewords('\s*,\s*', 1, @_); + require Text::ParseWords; + return Text::ParseWords::quotewords('\s*,\s*', 1, @_); } my %aliases; @@ -645,10 +628,11 @@ sub parse_sendmail_aliases { s/\\"/"/g foreach @addr; $aliases{$alias} = \@addr }}}, - mailrc => sub { my $fh = shift; while (<$fh>) { + mailrc => sub { my $fh = shift; while (<$fh>) { if (/^alias\s+(\S+)\s+(.*?)\s*$/) { + require Text::ParseWords; # spaces delimit multiple addresses - $aliases{$1} = [ quotewords('\s+', 0, $2) ]; + $aliases{$1} = [ Text::ParseWords::quotewords('\s+', 0, $2) ]; }}}, pine => sub { my $fh = shift; my $f='\t[^\t]*'; for (my $x = ''; defined($x); $x = $_) { @@ -720,7 +704,8 @@ sub is_format_patch_arg { opendir my $dh, $f or die sprintf(__("Failed to opendir %s: %s"), $f, $!); - push @files, grep { -f $_ } map { catfile($f, $_) } + require File::Spec; + push @files, grep { -f $_ } map { File::Spec->catfile($f, $_) } sort readdir $dh; closedir $dh; } elsif ((-f $f or -p $f) and !is_format_patch_arg($f)) { @@ -733,7 +718,8 @@ sub is_format_patch_arg { if (@rev_list_opts) { die __("Cannot run git format-patch from outside a repository\n") unless $repo; - push @files, $repo->command('format-patch', '-o', tempdir(CLEANUP => 1), @rev_list_opts); + require File::Temp; + push @files, $repo->command('format-patch', '-o', File::Temp::tempdir(CLEANUP => 1), @rev_list_opts); } @files = handle_backup_files(@files); @@ -770,9 +756,10 @@ sub get_patch_subject { if ($compose) { # Note that this does not need to be secure, but we will make a small # effort to have it be unique + require File::Temp; $compose_filename = ($repo ? - tempfile(".gitsendemail.msg.XXXXXX", DIR => $repo->repo_path()) : - tempfile(".gitsendemail.msg.XXXXXX", DIR => "."))[1]; + File::Temp::tempfile(".gitsendemail.msg.XXXXXX", DIR => $repo->repo_path()) : + File::Temp::tempfile(".gitsendemail.msg.XXXXXX", DIR => "."))[1]; open my $c, ">", $compose_filename or die sprintf(__("Failed to open for writing %s: %s"), $compose_filename, $!); @@ -879,6 +866,19 @@ sub get_patch_subject { do_edit(@files); } +sub term { + my $term = eval { + require Term::ReadLine; + $ENV{"GIT_SEND_EMAIL_NOTTY"} + ? Term::ReadLine->new('git-send-email', \*STDIN, \*STDOUT) + : Term::ReadLine->new('git-send-email'); + }; + if ($@) { + $term = FakeTerm->new("$@: going non-interactive"); + } + return $term; +} + sub ask { my ($prompt, %arg) = @_; my $valid_re = $arg{valid_re}; @@ -886,6 +886,7 @@ sub ask { my $confirm_only = $arg{confirm_only}; my $resp; my $i = 0; + my $term = term(); return defined $default ? $default : undef unless defined $term->IN and defined fileno($term->IN) and defined $term->OUT and defined fileno($term->OUT); @@ -1066,6 +1067,7 @@ sub extract_valid_address { return $address if ($address =~ /^($local_part_regexp)$/); $address =~ s/^\s*<(.*)>\s*$/$1/; + my $have_email_valid = eval { require Email::Valid; 1 }; if ($have_email_valid) { return scalar Email::Valid->address($address); } @@ -1125,7 +1127,8 @@ sub validate_address_list { sub make_message_id { my $uniq; if (!defined $message_id_stamp) { - $message_id_stamp = strftime("%Y%m%d%H%M%S.$$", gmtime(time)); + require POSIX; + $message_id_stamp = POSIX::strftime("%Y%m%d%H%M%S.$$", gmtime(time)); $message_id_serial = 0; } $message_id_serial++; @@ -1295,6 +1298,7 @@ sub valid_fqdn { sub maildomain_net { my $maildomain; + require Net::Domain; my $domain = Net::Domain::domainname(); $maildomain = $domain if valid_fqdn($domain); @@ -1305,6 +1309,7 @@ sub maildomain_mta { my $maildomain; for my $host (qw(mailhost localhost)) { + require Net::SMTP; my $smtp = Net::SMTP->new($host); if (defined $smtp) { my $domain = $smtp->domain; @@ -1983,13 +1988,15 @@ sub validate_patch { my ($fn, $xfer_encoding) = @_; if ($repo) { - my $validate_hook = catfile($repo->hooks_path(), + require File::Spec; + my $validate_hook = File::Spec->catfile($repo->hooks_path(), 'sendemail-validate'); my $hook_error; if (-x $validate_hook) { - my $target = abs_path($fn); + require Cwd; + my $target = Cwd::abs_path($fn); # The hook needs a correct cwd and GIT_DIR. - my $cwd_save = cwd(); + my $cwd_save = Cwd::cwd(); chdir($repo->wc_path() or $repo->repo_path()) or die("chdir: $!"); local $ENV{"GIT_DIR"} = $repo->repo_path(); -- 2.32.0.rc1.385.g46e826f1e55
Instead of unconditionally requiring modules such as File::Spec, let's only load them when needed. This speeds up code that only needs a subset of the features Git.pm provides. This brings a plain invocation of "git send-email" down from 52/37 loaded modules under NO_GETTEXT=[|Y] to 39/18, and it now takes ~60-~70ms instead of ~80-~90ms. The runtime of t9001-send-email.sh test is down to ~13s from ~15s. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- perl/Git.pm | 35 ++++++++++++++++++----------------- 1 file changed, 18 insertions(+), 17 deletions(-) diff --git a/perl/Git.pm b/perl/Git.pm index 73ebbf80cc6..4d048f307b3 100644 --- a/perl/Git.pm +++ b/perl/Git.pm @@ -11,9 +11,6 @@ package Git; use strict; use warnings $ENV{GIT_PERL_FATAL_WARNINGS} ? qw(FATAL all) : (); -use File::Temp (); -use File::Spec (); - BEGIN { our ($VERSION, @ISA, @EXPORT, @EXPORT_OK); @@ -103,12 +100,9 @@ =head1 DESCRIPTION =cut -use Carp qw(carp croak); # but croak is bad - throw instead +sub carp { require Carp; goto &Carp::carp } +sub croak { require Carp; goto &Carp::croak } use Git::LoadCPAN::Error qw(:try); -use Cwd qw(abs_path cwd); -use IPC::Open2 qw(open2); -use Fcntl qw(SEEK_SET SEEK_CUR); -use Time::Local qw(timegm); } @@ -191,13 +185,15 @@ sub repository { $dir = undef; }; + require Cwd; if ($dir) { + require File::Spec; File::Spec->file_name_is_absolute($dir) or $dir = $opts{Directory} . '/' . $dir; - $opts{Repository} = abs_path($dir); + $opts{Repository} = Cwd::abs_path($dir); # If --git-dir went ok, this shouldn't die either. my $prefix = $search->command_oneline('rev-parse', '--show-prefix'); - $dir = abs_path($opts{Directory}) . '/'; + $dir = Cwd::abs_path($opts{Directory}) . '/'; if ($prefix) { if (substr($dir, -length($prefix)) ne $prefix) { throw Error::Simple("rev-parse confused me - $dir does not have trailing $prefix"); @@ -223,7 +219,7 @@ sub repository { throw Error::Simple("fatal: Not a git repository: $dir"); } - $opts{Repository} = abs_path($dir); + $opts{Repository} = Cwd::abs_path($dir); } delete $opts{Directory}; @@ -408,10 +404,12 @@ sub command_bidi_pipe { my $cwd_save = undef; if ($self) { shift; - $cwd_save = cwd(); + require Cwd; + $cwd_save = Cwd::cwd(); _setup_git_cmd_env($self); } - $pid = open2($in, $out, 'git', @_); + require IPC::Open2; + $pid = IPC::Open2::open2($in, $out, 'git', @_); chdir($cwd_save) if $cwd_save; return ($pid, $in, $out, join(' ', @_)); } @@ -538,7 +536,8 @@ sub get_tz_offset { my $t = shift || time; my @t = localtime($t); $t[5] += 1900; - my $gm = timegm(@t); + require Time::Local; + my $gm = Time::Local::timegm(@t); my $sign = qw( + + - )[ $gm <=> $t ]; return sprintf("%s%02d%02d", $sign, (gmtime(abs($t - $gm)))[2,1]); } @@ -629,7 +628,8 @@ sub hooks_path { my ($self) = @_; my $dir = $self->command_oneline('rev-parse', '--git-path', 'hooks'); - my $abs = abs_path($dir); + require Cwd; + my $abs = Cwd::abs_path($dir); return $abs; } @@ -1353,6 +1353,7 @@ sub _temp_cache { my $n = $name; $n =~ s/\W/_/g; # no strange chars + require File::Temp; ($$temp_fd, $fname) = File::Temp::tempfile( "Git_${n}_XXXXXX", UNLINK => 1, DIR => $tmpdir, ) or throw Error::Simple("couldn't open new temp file"); @@ -1375,9 +1376,9 @@ sub temp_reset { truncate $temp_fd, 0 or throw Error::Simple("couldn't truncate file"); - sysseek($temp_fd, 0, SEEK_SET) and seek($temp_fd, 0, SEEK_SET) + sysseek($temp_fd, 0, Fcntl::SEEK_SET()) and seek($temp_fd, 0, Fcntl::SEEK_SET()) or throw Error::Simple("couldn't seek to beginning of file"); - sysseek($temp_fd, 0, SEEK_CUR) == 0 and tell($temp_fd) == 0 + sysseek($temp_fd, 0, Fcntl::SEEK_CUR()) == 0 and tell($temp_fd) == 0 or throw Error::Simple("expected file position to be reset"); } -- 2.32.0.rc1.385.g46e826f1e55
Optimize the startup time of git-send-email by using an amended config_regexp() function to retrieve the list of config keys and values we're interested in. For boolean keys we can handle the [true|false] case ourselves, and the "--get" case didn't need any parsing. Let's leave "--path" and other "--bool" cases to "git config". I'm not bothering with the "undef" or "" case (true and false, respectively), let's just punt on those and others and have "git config --type=bool" handle it. This brings the runtime of "git send-email" from ~60-~70ms to a very steady ~40ms on my test box. We now run just one "git config" invocation on startup instead of 8, the exact number will differ based on the local sendemail.* config. I happen to have 8 of those set. This brings the runtime of t9001-send-email.sh from ~13s down to ~12s for me. The change there is less impressive as many of those tests set various config values, and we're also getting to the point of diminishing returns for optimizing "git send-email" itself. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 29 ++++++++++++++++++++--------- 1 file changed, 20 insertions(+), 9 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 1e9273fd4f5..1ea4d9589d8 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -324,7 +324,11 @@ sub read_config { my $target = $config_bool_settings{$setting}; my $key = "$prefix.$setting"; next unless exists $known_keys->{$key}; - my $v = Git::config_bool(@repo, $key); + my $v = (@{$known_keys->{$key}} == 1 && + (defined $known_keys->{$key}->[0] && + $known_keys->{$key}->[0] =~ /^(?:true|false)$/s)) + ? $known_keys->{$key}->[0] eq 'true' + : Git::config_bool(@repo, $key); next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -353,14 +357,12 @@ sub read_config { my $key = "$prefix.$setting"; next unless exists $known_keys->{$key}; if (ref($target) eq "ARRAY") { - my @values = Git::config(@repo, $key); - next unless @values; + my @values = @{$known_keys->{$key}}; next if $configured->{$setting}++; @$target = @values; } else { - my $v = Git::config(@repo, $key); - next unless defined $v; + my $v = $known_keys->{$key}->[0]; next if $configured->{$setting}++; $$target = $v; } @@ -371,12 +373,19 @@ sub config_regexp { my ($regex) = @_; my @ret; eval { - @ret = Git::command( + my $ret = Git::command( 'config', - '--name-only', + '--null', '--get-regexp', $regex, ); + @ret = map { + # We must always return ($k, $v) here, since + # empty config values will be just "key\0", + # not "key\nvalue\0". + my ($k, $v) = split /\n/, $_, 2; + ($k, $v); + } split /\0/, $ret; 1; } or do { # If we have no keys we're OK, otherwise re-throw @@ -389,8 +398,10 @@ sub config_regexp { # parses 'bool' etc.) by only doing so for config keys that exist. my %known_config_keys; { - my @known_config_keys = config_regexp("^sende?mail[.]"); - @known_config_keys{@known_config_keys} = (); + my @kv = config_regexp("^sende?mail[.]"); + while (my ($k, $v) = splice @kv, 0, 2) { + push @{$known_config_keys{$k}} => $v; + } } # sendemail.identity yields to --identity. We must parse this -- 2.32.0.rc1.385.g46e826f1e55
It has been pointed out[1] that cwd() invokes "pwd(1)" while getcwd() is a Perl-native XS function. For what we're using these for we can use getcwd(). The performance difference is miniscule, we're saving on the order of a millisecond or so, see [2] below for the benchmark. I don't think this matters in practice for optimizing git-send-email or perl execution (unlike the patches leading up to this one). But let's do it regardless of that, if only so we don't have to think about this as a low-hanging fruit anymore. 1. https://lore.kernel.org/git/20210512180517.GA11354@dcvr/ 2. $ perl -MBenchmark=:all -MCwd -wE 'cmpthese(10000, { getcwd => sub { getcwd }, cwd => sub { cwd }, pwd => sub { system "pwd >/dev/null" }})' (warning: too few iterations for a reliable count) Rate pwd cwd getcwd pwd 982/s -- -48% -100% cwd 1890/s 92% -- -100% getcwd 10000000000000000000/s 1018000000000000000% 529000000000000064% - Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 2 +- perl/Git.pm | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 1ea4d9589d8..c6af05e8a31 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -2007,7 +2007,7 @@ sub validate_patch { require Cwd; my $target = Cwd::abs_path($fn); # The hook needs a correct cwd and GIT_DIR. - my $cwd_save = Cwd::cwd(); + my $cwd_save = Cwd::getcwd(); chdir($repo->wc_path() or $repo->repo_path()) or die("chdir: $!"); local $ENV{"GIT_DIR"} = $repo->repo_path(); diff --git a/perl/Git.pm b/perl/Git.pm index 4d048f307b3..863bd80694c 100644 --- a/perl/Git.pm +++ b/perl/Git.pm @@ -405,7 +405,7 @@ sub command_bidi_pipe { if ($self) { shift; require Cwd; - $cwd_save = Cwd::cwd(); + $cwd_save = Cwd::getcwd(); _setup_git_cmd_env($self); } require IPC::Open2; -- 2.32.0.rc1.385.g46e826f1e55
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> -use File::Temp qw/ tempdir tempfile /;
> -use File::Spec::Functions qw(catdir catfile);
> ...
> + require File::Spec;
> + push @files, grep { -f $_ } map { File::Spec->catfile($f, $_) }
> sort readdir $dh;
> ...
> - push @files, $repo->command('format-patch', '-o', tempdir(CLEANUP => 1), @rev_list_opts);
> + require File::Temp;
> + push @files, $repo->command('format-patch', '-o', File::Temp::tempdir(CLEANUP => 1), @rev_list_opts);
I suspect I would not be the only one who finds it curious to have
the distinction between the use of "->" in File::Spec->catfile() and
"::" in File::Temp::tempdir (and all the other function calls
touched by this patch).
Wouldn't it be more faithful to the original to do:
require File::Spec::Functions;
push @files, ... map { File::Spec::Functions::catfile($f, $_) }
in the first hunk? It would save the time readers of "git log -p"
has to spend on wondering why "catfile" is an oddball.
Of course "man File::Spec" would educate readers where the
difference comes from, but it is not immediately obvious to me why
we want to switch in a "lazily load for speedup" that is supposed to
be optimization-only, even if the class method ends up calling the
same underlying thing.
Thanks.
Hi,
On Wed, May 12, 2021 at 6:50 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> As noted in the subject this speeds up git-send-email invocations by
> ~2x or more, and brings the very slow t9001 test from running in ~26s
> on my box to ~12s. It's no longer consistently the slowest test I run.
>
> This is basically done in two ways: We lazily invoke "git config" to
> get config, before it's very eager, and deferring Perl compilation
> with s/use/require/g.
I know I'm very late to the party, but I just wanted to comment that
this is super cool. Thanks for speeding this up; some really good
finds here.
On Thu, May 27 2021, Junio C Hamano wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> -use File::Temp qw/ tempdir tempfile /;
>> -use File::Spec::Functions qw(catdir catfile);
>> ...
>> + require File::Spec;
>> + push @files, grep { -f $_ } map { File::Spec->catfile($f, $_) }
>> sort readdir $dh;
>> ...
>> - push @files, $repo->command('format-patch', '-o', tempdir(CLEANUP => 1), @rev_list_opts);
>> + require File::Temp;
>> + push @files, $repo->command('format-patch', '-o', File::Temp::tempdir(CLEANUP => 1), @rev_list_opts);
>
> I suspect I would not be the only one who finds it curious to have
> the distinction between the use of "->" in File::Spec->catfile() and
> "::" in File::Temp::tempdir (and all the other function calls
> touched by this patch).
>
> Wouldn't it be more faithful to the original to do:
>
> require File::Spec::Functions;
> push @files, ... map { File::Spec::Functions::catfile($f, $_) }
>
> in the first hunk? It would save the time readers of "git log -p"
> has to spend on wondering why "catfile" is an oddball.
>
> Of course "man File::Spec" would educate readers where the
> difference comes from, but it is not immediately obvious to me why
> we want to switch in a "lazily load for speedup" that is supposed to
> be optimization-only, even if the class method ends up calling the
> same underlying thing.
It was idiomatic to me so I didn't think of explaining it. Yes it looks
odd, but it's the most straightforward and correct thing to do in this
s/use/require/ change.
The reason to use the File::Spec::Functions wrapper is because you want
File::Spec class methods in your namespace. I.e. to do catfile() instead
of File::Spec->catfile().
To do that requires importing the "catfile" symbol. Since we're doing a
s/use/requires/ here we won't do that (symbols like that are
compile-time), so we'd need to call File::Spec::Functions::catfile() as
you point out.
Except the whole point of that Functions.pm package is to not need that
fully-qualified name, so once we're doing that we might as well call
File::Spec->catfile() instead.
The implementation of File::Spec::Functions is to literally give you a
generaler boilerplate function in your namespace that calls
File::Spec::catfile("File::Spec", <your argument here>), which is what
the "->" class method syntax does implicitly. Calling that wrapper
doesn't make sense here.
On Mon, May 24, 2021 at 09:53:01AM +0200, Ævar Arnfjörð Bjarmason wrote: > diff --git a/git-send-email.perl b/git-send-email.perl > index 1e9273fd4f5..1ea4d9589d8 100755 > --- a/git-send-email.perl > +++ b/git-send-email.perl > @@ -324,7 +324,11 @@ sub read_config { > my $target = $config_bool_settings{$setting}; > my $key = "$prefix.$setting"; > next unless exists $known_keys->{$key}; > - my $v = Git::config_bool(@repo, $key); > + my $v = (@{$known_keys->{$key}} == 1 && > + (defined $known_keys->{$key}->[0] && > + $known_keys->{$key}->[0] =~ /^(?:true|false)$/s)) > + ? $known_keys->{$key}->[0] eq 'true' > + : Git::config_bool(@repo, $key); Thanks for addressing this. It looks like an undefined value will kick back to the Git::config_bool() case. That's probably fine, as I'd think it's relatively rare (and this is all just optimization anyway). > @@ -353,14 +357,12 @@ sub read_config { > my $key = "$prefix.$setting"; > next unless exists $known_keys->{$key}; > if (ref($target) eq "ARRAY") { > - my @values = Git::config(@repo, $key); > - next unless @values; > + my @values = @{$known_keys->{$key}}; > next if $configured->{$setting}++; > @$target = @values; > } I do wonder what happens for non-bool values here. I.e., would we return a list that contains undef? Before your change, the value comes from Git::config(), and now it comes from our pre-read $known_keys. It seems fine either way: $ git -c sendemail.smtpsslcertpath send-email -1 error: missing value for 'sendemail.smtpsslcertpath' fatal: unable to parse command-line config config --path --get sendemail.smtpsslcertpath: command returned error: 128 but I didn't carefully follow all the paths that config can take. So I raise it only as a potential issue. Your response is hopefully "yes, I thought of that, and it is fine." :) -Peff
On Mon, May 24, 2021 at 09:52:49AM +0200, Ævar Arnfjörð Bjarmason wrote:
> This v4 fixes an issue in v3 where 12/13 had a "diff --git" as part of
> the commit message (mistake during rebase/squash), which confused "git
> am" in trying to apply a diff twice. See <xmqqwnrplyns.fsf@gitster.g>.
I raised a probably-not-a-problem question in patch 12. Assuming it's
indeed not-a-problem, this all looks good to me.
Thanks for addressing my concerns about Git.pm stability.
-Peff
PS I hit some mild conflicts applying this on top of master because
7cbc0455cc (send-email: move "hooks_path" invocation to
git-send-email.perl, 2021-05-26) graduated in the meantime. I expect
Junio prefers it as you have it here, since he tends to use the
original base from previous rounds, but just a hint to any other
reviewers.
Hopefully the final iteration. Updates a commit message to explain why I moved away from File::Spec::Functions, rebases on master, and explains and deals with the "undef in config" issue Jeff King noted. Ævar Arnfjörð Bjarmason (13): send-email tests: support GIT_TEST_PERL_FATAL_WARNINGS=true send-email tests: test for boolean variables without a value send-email: remove non-working support for "sendemail.smtpssl" send-email: refactor sendemail.smtpencryption config parsing send-email: copy "config_regxp" into git-send-email.perl send-email: lazily load config for a big speedup send-email: lazily shell out to "git var" send-email: use function syntax instead of barewords send-email: get rid of indirect object syntax send-email: lazily load modules for a big speedup perl: lazily load some common Git.pm setup code send-email: move trivial config handling to Perl perl: nano-optimize by replacing Cwd::cwd() with Cwd::getcwd() Documentation/config/sendemail.txt | 3 - git-send-email.perl | 174 +++++++++++++++++++---------- perl/Git.pm | 32 +++--- t/t9001-send-email.sh | 29 +++++ 4 files changed, 159 insertions(+), 79 deletions(-) Range-diff against v4: 1: 7140847367c = 1: 81025b48f1c send-email tests: support GIT_TEST_PERL_FATAL_WARNINGS=true 2: d27f3b48f85 = 2: 16277bd1082 send-email tests: test for boolean variables without a value 3: a7a21b75f2e = 3: e3e3e6415d2 send-email: remove non-working support for "sendemail.smtpssl" 4: 7356a528589 = 4: 961ca4c2b2a send-email: refactor sendemail.smtpencryption config parsing 5: cce0f89143b = 5: f2bd12728a1 send-email: copy "config_regxp" into git-send-email.perl 6: 8afe8661761 = 6: 4cf70c6f97e send-email: lazily load config for a big speedup 7: 491eefde6a2 = 7: bd0d9535718 send-email: lazily shell out to "git var" 8: 860156013f8 = 8: f1a879a8ae9 send-email: use function syntax instead of barewords 9: dd24f1249f5 = 9: 881b1093409 send-email: get rid of indirect object syntax 10: 61e3e3c93c5 ! 10: 9f21bc6e6f2 send-email: lazily load modules for a big speedup @@ Commit message under NO_GETTEXT=[|Y], respectively. Now it's 52/37. It now takes ~15s to run t9001-send-email.sh, down from ~20s. + Changing File::Spec::Functions::{catdir,catfile} to invoking class + methods on File::Spec itself is idiomatic. See [1] for a more + elaborate explanation, the resulting code behaves the same way, just + without the now-pointless function wrapper. + + 1. http://lore.kernel.org/git/8735u8mmj9.fsf@evledraar.gmail.com + Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## git-send-email.perl ## @@ git-send-email.perl: sub maildomain_mta { if (defined $smtp) { my $domain = $smtp->domain; @@ git-send-email.perl: sub validate_patch { - my ($fn, $xfer_encoding) = @_; if ($repo) { -- my $validate_hook = catfile($repo->hooks_path(), + my $hooks_path = $repo->command_oneline('rev-parse', '--git-path', 'hooks'); +- my $validate_hook = catfile($hooks_path, + require File::Spec; -+ my $validate_hook = File::Spec->catfile($repo->hooks_path(), ++ my $validate_hook = File::Spec->catfile($hooks_path, 'sendemail-validate'); my $hook_error; if (-x $validate_hook) { 11: ada34374286 ! 11: 66f68e38c16 perl: lazily load some common Git.pm setup code @@ perl/Git.pm: sub get_tz_offset { my $sign = qw( + + - )[ $gm <=> $t ]; return sprintf("%s%02d%02d", $sign, (gmtime(abs($t - $gm)))[2,1]); } -@@ perl/Git.pm: sub hooks_path { - my ($self) = @_; - - my $dir = $self->command_oneline('rev-parse', '--git-path', 'hooks'); -- my $abs = abs_path($dir); -+ require Cwd; -+ my $abs = Cwd::abs_path($dir); - return $abs; - } - @@ perl/Git.pm: sub _temp_cache { my $n = $name; $n =~ s/\W/_/g; # no strange chars 12: 3818000bfba ! 12: f605b5ae49f send-email: move trivial config handling to Perl @@ Commit message "undef" or "" case (true and false, respectively), let's just punt on those and others and have "git config --type=bool" handle it. + The "grep { defined } @values" here covers a rather subtle case. For + list values such as sendemail.to it is possible as with any other + config key to provide a plain "-c sendemail.to", i.e. to set the key + as a boolean true. In that case the Git::config() API will return an + empty string, but this new parser will correctly return "undef". + + However, that means we can end up with "undef" in the middle of a + list. E.g. for sendemail.smtpserveroption in conjuction with + sendemail.smtpserver as a path this would have produce a warning. For + most of the other keys we'd behave the same despite the subtle change + in the value, e.g. sendemail.to would behave the same because + Mail::Address->parse() happens to return an empty list if fed + "undef". For the boolean values we were already prepared to handle + these variables being initialized as undef anyway. + This brings the runtime of "git send-email" from ~60-~70ms to a very steady ~40ms on my test box. We now run just one "git config" invocation on startup instead of 8, the exact number will differ based @@ git-send-email.perl: sub read_config { - my @values = Git::config(@repo, $key); - next unless @values; + my @values = @{$known_keys->{$key}}; ++ @values = grep { defined } @values; next if $configured->{$setting}++; @$target = @values; } else { - my $v = Git::config(@repo, $key); -- next unless defined $v; + my $v = $known_keys->{$key}->[0]; + next unless defined $v; next if $configured->{$setting}++; $$target = $v; - } @@ git-send-email.perl: sub config_regexp { my ($regex) = @_; my @ret; 13: d36b57e429f = 13: aa3a2de7047 perl: nano-optimize by replacing Cwd::cwd() with Cwd::getcwd() -- 2.32.0.rc1.458.gd885d4f985c
Add support for the "GIT_TEST_PERL_FATAL_WARNINGS=true" test mode to "send-email". This was added to e.g. git-svn in 5338ed2b26 (perl: check for perl warnings while running tests, 2020-10-21), but not "send-email". Let's rectify that. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/git-send-email.perl b/git-send-email.perl index 25be2ebd2af..1630e87cd4d 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -18,7 +18,7 @@ use 5.008; use strict; -use warnings; +use warnings $ENV{GIT_PERL_FATAL_WARNINGS} ? qw(FATAL all) : (); use POSIX qw/strftime/; use Term::ReadLine; use Getopt::Long; -- 2.32.0.rc1.458.gd885d4f985c
The Git.pm code does its own Perl-ifying of boolean variables, let's ensure that empty values = true for boolean variables, as in the C code. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- t/t9001-send-email.sh | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh index 3b7540050ca..612de095fd7 100755 --- a/t/t9001-send-email.sh +++ b/t/t9001-send-email.sh @@ -1368,6 +1368,16 @@ test_expect_success $PREREQ 'sendemail.identity: bool variable fallback' ' ! grep "X-Mailer" stdout ' +test_expect_success $PREREQ 'sendemail.identity: bool variable without a value' ' + git -c sendemail.xmailer \ + send-email \ + --dry-run \ + --from="nobody@example.com" \ + $patches >stdout && + grep "To: default@example.com" stdout && + grep "X-Mailer" stdout +' + test_expect_success $PREREQ '--no-to overrides sendemail.to' ' git send-email \ --dry-run \ @@ -2092,6 +2102,18 @@ test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer=true' ' do_xmailer_test 1 "--xmailer" ' +test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer' ' + test_when_finished "test_unconfig sendemail.xmailer" && + cat >>.git/config <<-\EOF && + [sendemail] + xmailer + EOF + test_config sendemail.xmailer true && + do_xmailer_test 1 "" && + do_xmailer_test 0 "--no-xmailer" && + do_xmailer_test 1 "--xmailer" +' + test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer=false' ' test_config sendemail.xmailer false && do_xmailer_test 0 "" && @@ -2099,6 +2121,13 @@ test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer=false' ' do_xmailer_test 1 "--xmailer" ' +test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer=' ' + test_config sendemail.xmailer "" && + do_xmailer_test 0 "" && + do_xmailer_test 0 "--no-xmailer" && + do_xmailer_test 1 "--xmailer" +' + test_expect_success $PREREQ 'setup expected-list' ' git send-email \ --dry-run \ -- 2.32.0.rc1.458.gd885d4f985c
Remove the already dead code to support "sendemail.smtpssl" by finally removing the dead code supporting the configuration option. In f6bebd121ac (git-send-email: add support for TLS via Net::SMTP::SSL, 2008-06-25) the --smtp-ssl command-line option was documented as deprecated, later in 65180c66186 (List send-email config options in config.txt., 2009-07-22) the "sendemail.smtpssl" configuration option was also documented as such. Then in in 3ff15040e22 (send-email: fix regression in sendemail.identity parsing, 2019-05-17) I unintentionally removed support for it by introducing a bug in read_config(). As can be seen from the diff context we've already returned unless $enc i defined, so it's not possible for us to reach the "elsif" branch here. This code was therefore already dead since Git v2.23.0. So let's just remove it. We were already 11 years into a stated deprecation period of this variable when 3ff15040e22 landed, now it's around 13. Since it hasn't worked anyway for around 2 years it looks like we can safely remove it. The --smtp-ssl option is still deprecated, if someone cares they can follow-up and remove that too, but unlike the config option that one could still be in use in the wild. I'm just removing this code that's provably unused already. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- Documentation/config/sendemail.txt | 3 --- git-send-email.perl | 6 +----- 2 files changed, 1 insertion(+), 8 deletions(-) diff --git a/Documentation/config/sendemail.txt b/Documentation/config/sendemail.txt index cbc5af42fdf..50baa5d6bfb 100644 --- a/Documentation/config/sendemail.txt +++ b/Documentation/config/sendemail.txt @@ -8,9 +8,6 @@ sendemail.smtpEncryption:: See linkgit:git-send-email[1] for description. Note that this setting is not subject to the 'identity' mechanism. -sendemail.smtpssl (deprecated):: - Deprecated alias for 'sendemail.smtpEncryption = ssl'. - sendemail.smtpsslcertpath:: Path to ca-certificates (either a directory or a single file). Set it to an empty string to disable certificate verification. diff --git a/git-send-email.perl b/git-send-email.perl index 1630e87cd4d..9a1a4898e36 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -393,11 +393,7 @@ sub read_config { my $enc = Git::config(@repo, $setting); return unless defined $enc; return if $configured->{$setting}++; - if (defined $enc) { - $smtp_encryption = $enc; - } elsif (Git::config_bool(@repo, "$prefix.smtpssl")) { - $smtp_encryption = 'ssl'; - } + $smtp_encryption = $enc; } } -- 2.32.0.rc1.458.gd885d4f985c
With the removal of the support for sendemail.smtpssl in the preceding commit the parsing of sendemail.smtpencryption is no longer special, and can by moved to %config_settings. This gets us rid of an unconditional call to Git::config(), which as we'll see in subsequent commits matters for startup performance. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 9a1a4898e36..e2fe112aa50 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -287,6 +287,7 @@ sub do_edit { ); my %config_settings = ( + "smtpencryption" => \$smtp_encryption, "smtpserver" => \$smtp_server, "smtpserverport" => \$smtp_server_port, "smtpserveroption" => \@smtp_server_options, @@ -387,14 +388,6 @@ sub read_config { $$target = $v; } } - - if (!defined $smtp_encryption) { - my $setting = "$prefix.smtpencryption"; - my $enc = Git::config(@repo, $setting); - return unless defined $enc; - return if $configured->{$setting}++; - $smtp_encryption = $enc; - } } # sendemail.identity yields to --identity. We must parse this -- 2.32.0.rc1.458.gd885d4f985c
The config_regexp() function was added in dd84e528a3 (git-send-email: die if sendmail.* config is set, 2020-07-23) for use in git-send-email, and it's the only in-tree user of it. However, the consensus is that Git.pm is a public interface, so even though it's a recently added function we can't change it. So let's copy over a minimal version of it to git-send-email.perl itself. In a subsequent commit it'll be changed further for our own use. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/git-send-email.perl b/git-send-email.perl index e2fe112aa50..73e3d3fd26e 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -390,6 +390,24 @@ sub read_config { } } +sub config_regexp { + my ($regex) = @_; + my @ret; + eval { + @ret = Git::command( + 'config', + '--name-only', + '--get-regexp', + $regex, + ); + 1; + } or do { + # If we have no keys we're OK, otherwise re-throw + die $@ if $@->value != 1; + }; + return @ret; +} + # sendemail.identity yields to --identity. We must parse this # special-case first before the rest of the config is read. $identity = Git::config(@repo, "sendemail.identity"); @@ -488,7 +506,7 @@ sub read_config { usage(); } -if ($forbid_sendmail_variables && (scalar Git::config_regexp("^sendmail[.]")) != 0) { +if ($forbid_sendmail_variables && (scalar config_regexp("^sendmail[.]")) != 0) { die __("fatal: found configuration options for 'sendmail'\n" . "git-send-email is configured with the sendemail.* options - note the 'e'.\n" . "Set sendemail.forbidSendmailVariables to false to disable this check.\n"); -- 2.32.0.rc1.458.gd885d4f985c
Reduce the time it takes git-send-email to get to even the most trivial of tasks (such as serving up its "-h" output) by first listing config keys that exist, and only then only call e.g. "git config --bool" on them if they do. Over a lot of runs this speeds the time to "-h" up for me from ~250ms to ~150ms, and the runtime of t9001-send-email.sh goes from ~25s to ~20s. This introduces a race condition where we'll do the "wrong" thing if a config key were to be inserted between us discovering the list and calling read_config(), i.e. we won't know about the racily added key. In theory this is a change in behavior, in practice it doesn't matter. The config_regexp() function being changed here was added in dd84e528a34 (git-send-email: die if sendmail.* config is set, 2020-07-23) for use by git-send-email. So we can change its odd return value in the case where no values are found by "git config". The difference in the *.pm code would matter if it was invoked in scalar context, but now it no longer is. Arguably this caching belongs in Git.pm itself, but in lieu of modifying it for all its callers let's only do this for "git send-email". The other big potential win would be "git svn", but unlike "git send-email" it doesn't check tens of config variables one at a time at startup (in my brief testing it doesn't check any). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 35 ++++++++++++++++++++++++++--------- 1 file changed, 26 insertions(+), 9 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 73e3d3fd26e..de62cbf2506 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -347,11 +347,13 @@ sub signal_handler { # Read our sendemail.* config sub read_config { - my ($configured, $prefix) = @_; + my ($known_keys, $configured, $prefix) = @_; foreach my $setting (keys %config_bool_settings) { my $target = $config_bool_settings{$setting}; - my $v = Git::config_bool(@repo, "$prefix.$setting"); + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; + my $v = Git::config_bool(@repo, $key); next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -359,8 +361,10 @@ sub read_config { foreach my $setting (keys %config_path_settings) { my $target = $config_path_settings{$setting}; + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; if (ref($target) eq "ARRAY") { - my @values = Git::config_path(@repo, "$prefix.$setting"); + my @values = Git::config_path(@repo, $key); next unless @values; next if $configured->{$setting}++; @$target = @values; @@ -375,14 +379,16 @@ sub read_config { foreach my $setting (keys %config_settings) { my $target = $config_settings{$setting}; + my $key = "$prefix.$setting"; + next unless exists $known_keys->{$key}; if (ref($target) eq "ARRAY") { - my @values = Git::config(@repo, "$prefix.$setting"); + my @values = Git::config(@repo, $key); next unless @values; next if $configured->{$setting}++; @$target = @values; } else { - my $v = Git::config(@repo, "$prefix.$setting"); + my $v = Git::config(@repo, $key); next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -408,9 +414,20 @@ sub config_regexp { return @ret; } +# Save ourselves a lot of work of shelling out to 'git config' (it +# parses 'bool' etc.) by only doing so for config keys that exist. +my %known_config_keys; +{ + my @known_config_keys = config_regexp("^sende?mail[.]"); + @known_config_keys{@known_config_keys} = (); +} + # sendemail.identity yields to --identity. We must parse this # special-case first before the rest of the config is read. -$identity = Git::config(@repo, "sendemail.identity"); +{ + my $key = "sendemail.identity"; + $identity = Git::config(@repo, $key) if exists $known_config_keys{$key}; +} my $rc = GetOptions( "identity=s" => \$identity, "no-identity" => \$no_identity, @@ -421,8 +438,8 @@ sub config_regexp { # Now we know enough to read the config { my %configured; - read_config(\%configured, "sendemail.$identity") if defined $identity; - read_config(\%configured, "sendemail"); + read_config(\%known_config_keys, \%configured, "sendemail.$identity") if defined $identity; + read_config(\%known_config_keys, \%configured, "sendemail"); } # Begin by accumulating all the variables (defined above), that we will end up @@ -506,7 +523,7 @@ sub config_regexp { usage(); } -if ($forbid_sendmail_variables && (scalar config_regexp("^sendmail[.]")) != 0) { +if ($forbid_sendmail_variables && grep { /^sendmail/s } keys %known_config_keys) { die __("fatal: found configuration options for 'sendmail'\n" . "git-send-email is configured with the sendemail.* options - note the 'e'.\n" . "Set sendemail.forbidSendmailVariables to false to disable this check.\n"); -- 2.32.0.rc1.458.gd885d4f985c
Optimize git-send-email by only shelling out to "git var" if we need to. This is easily done by re-inventing our own small version of perl's Memoize module. I suppose I could just use Memoize itself, but in a subsequent patch I'll be micro-optimizing send-email's use of dependencies. Using Memoize is a measly extra 5-10 milliseconds, but as we'll see that'll end up mattering for us in the end. This brings the runtime of a plain "send-email" from around ~160-170ms to ~140m-150ms. The runtime of the tests is around the same, or around ~20s. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index de62cbf2506..38408ec75a8 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -588,8 +588,18 @@ sub config_regexp { } my ($repoauthor, $repocommitter); -($repoauthor) = Git::ident_person(@repo, 'author'); -($repocommitter) = Git::ident_person(@repo, 'committer'); +{ + my %cache; + my ($author, $committer); + my $common = sub { + my ($what) = @_; + return $cache{$what} if exists $cache{$what}; + ($cache{$what}) = Git::ident_person(@repo, $what); + return $cache{$what}; + }; + $repoauthor = sub { $common->('author') }; + $repocommitter = sub { $common->('committer') }; +} sub parse_address_line { return map { $_->format } Mail::Address->parse($_[0]); @@ -777,7 +787,7 @@ sub get_patch_subject { or die sprintf(__("Failed to open for writing %s: %s"), $compose_filename, $!); - my $tpl_sender = $sender || $repoauthor || $repocommitter || ''; + my $tpl_sender = $sender || $repoauthor->() || $repocommitter->() || ''; my $tpl_subject = $initial_subject || ''; my $tpl_in_reply_to = $initial_in_reply_to || ''; my $tpl_reply_to = $reply_to || ''; @@ -983,7 +993,7 @@ sub file_declares_8bit_cte { $sender =~ s/^\s+|\s+$//g; ($sender) = expand_aliases($sender); } else { - $sender = $repoauthor || $repocommitter || ''; + $sender = $repoauthor->() || $repocommitter->() || ''; } # $sender could be an already sanitized address @@ -1132,7 +1142,7 @@ sub make_message_id { $uniq = "$message_id_stamp-$message_id_serial"; my $du_part; - for ($sender, $repocommitter, $repoauthor) { + for ($sender, $repocommitter->(), $repoauthor->()) { $du_part = extract_valid_address(sanitize_address($_)); last if (defined $du_part and $du_part ne ''); } -- 2.32.0.rc1.458.gd885d4f985c
Change calls like "__ 'foo'" to "__('foo')" so the Perl compiler doesn't have to guess that "__" is a function. This makes the code more readable. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 38408ec75a8..44dc3f6eb10 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -706,7 +706,7 @@ sub is_format_patch_arg { if (defined($format_patch)) { return $format_patch; } - die sprintf(__ <<EOF, $f, $f); + die sprintf(__(<<EOF), $f, $f); File '%s' exists but it could also be the range of commits to produce patches for. Please disambiguate by... @@ -792,7 +792,7 @@ sub get_patch_subject { my $tpl_in_reply_to = $initial_in_reply_to || ''; my $tpl_reply_to = $reply_to || ''; - print $c <<EOT1, Git::prefix_lines("GIT: ", __ <<EOT2), <<EOT3; + print $c <<EOT1, Git::prefix_lines("GIT: ", __(<<EOT2)), <<EOT3; From $tpl_sender # This line is ignored. EOT1 Lines beginning in "GIT:" will be removed. -- 2.32.0.rc1.458.gd885d4f985c
Change indirect object syntax such as "new X ARGS" to "X->new(ARGS)". This allows perl to see what "new" is at compile-time without having loaded Term::ReadLine. This doesn't matter now, but will in a subsequent commit when we start lazily loading it. Let's do the same for the adjacent "FakeTerm" package for consistency, even though we're not going to conditionally load it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 44dc3f6eb10..cc1027d8774 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -194,11 +194,11 @@ sub format_2822_time { my @repo = $repo ? ($repo) : (); my $term = eval { $ENV{"GIT_SEND_EMAIL_NOTTY"} - ? new Term::ReadLine 'git-send-email', \*STDIN, \*STDOUT - : new Term::ReadLine 'git-send-email'; + ? Term::ReadLine->new('git-send-email', \*STDIN, \*STDOUT) + : Term::ReadLine->new('git-send-email'); }; if ($@) { - $term = new FakeTerm "$@: going non-interactive"; + $term = FakeTerm->new("$@: going non-interactive"); } # Behavior modification variables -- 2.32.0.rc1.458.gd885d4f985c
Optimize the time git-send-email takes to do even the simplest of things (such as serving up "-h") from around ~150ms to ~80ms-~90ms by lazily loading the modules it requires. Before this change Devel::TraceUse would report 99/97 used modules under NO_GETTEXT=[|Y], respectively. Now it's 52/37. It now takes ~15s to run t9001-send-email.sh, down from ~20s. Changing File::Spec::Functions::{catdir,catfile} to invoking class methods on File::Spec itself is idiomatic. See [1] for a more elaborate explanation, the resulting code behaves the same way, just without the now-pointless function wrapper. 1. http://lore.kernel.org/git/8735u8mmj9.fsf@evledraar.gmail.com Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 71 +++++++++++++++++++++++++-------------------- 1 file changed, 39 insertions(+), 32 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index cc1027d8774..a8949c9d313 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -19,20 +19,10 @@ use 5.008; use strict; use warnings $ENV{GIT_PERL_FATAL_WARNINGS} ? qw(FATAL all) : (); -use POSIX qw/strftime/; -use Term::ReadLine; use Getopt::Long; -use Text::ParseWords; -use Term::ANSIColor; -use File::Temp qw/ tempdir tempfile /; -use File::Spec::Functions qw(catdir catfile); use Git::LoadCPAN::Error qw(:try); -use Cwd qw(abs_path cwd); use Git; use Git::I18N; -use Net::Domain (); -use Net::SMTP (); -use Git::LoadCPAN::Mail::Address; Getopt::Long::Configure qw/ pass_through /; @@ -166,7 +156,6 @@ sub format_2822_time { ); } -my $have_email_valid = eval { require Email::Valid; 1 }; my $smtp; my $auth; my $num_sent = 0; @@ -192,14 +181,6 @@ sub format_2822_time { my $repo = eval { Git->repository() }; my @repo = $repo ? ($repo) : (); -my $term = eval { - $ENV{"GIT_SEND_EMAIL_NOTTY"} - ? Term::ReadLine->new('git-send-email', \*STDIN, \*STDOUT) - : Term::ReadLine->new('git-send-email'); -}; -if ($@) { - $term = FakeTerm->new("$@: going non-interactive"); -} # Behavior modification variables my ($quiet, $dry_run) = (0, 0); @@ -319,9 +300,9 @@ sub do_edit { # Handle Uncouth Termination sub signal_handler { - # Make text normal - print color("reset"), "\n"; + require Term::ANSIColor; + print Term::ANSIColor::color("reset"), "\n"; # SMTP password masked system "stty echo"; @@ -602,11 +583,13 @@ sub config_regexp { } sub parse_address_line { + require Git::LoadCPAN::Mail::Address; return map { $_->format } Mail::Address->parse($_[0]); } sub split_addrs { - return quotewords('\s*,\s*', 1, @_); + require Text::ParseWords; + return Text::ParseWords::quotewords('\s*,\s*', 1, @_); } my %aliases; @@ -655,10 +638,11 @@ sub parse_sendmail_aliases { s/\\"/"/g foreach @addr; $aliases{$alias} = \@addr }}}, - mailrc => sub { my $fh = shift; while (<$fh>) { + mailrc => sub { my $fh = shift; while (<$fh>) { if (/^alias\s+(\S+)\s+(.*?)\s*$/) { + require Text::ParseWords; # spaces delimit multiple addresses - $aliases{$1} = [ quotewords('\s+', 0, $2) ]; + $aliases{$1} = [ Text::ParseWords::quotewords('\s+', 0, $2) ]; }}}, pine => sub { my $fh = shift; my $f='\t[^\t]*'; for (my $x = ''; defined($x); $x = $_) { @@ -730,7 +714,8 @@ sub is_format_patch_arg { opendir my $dh, $f or die sprintf(__("Failed to opendir %s: %s"), $f, $!); - push @files, grep { -f $_ } map { catfile($f, $_) } + require File::Spec; + push @files, grep { -f $_ } map { File::Spec->catfile($f, $_) } sort readdir $dh; closedir $dh; } elsif ((-f $f or -p $f) and !is_format_patch_arg($f)) { @@ -743,7 +728,8 @@ sub is_format_patch_arg { if (@rev_list_opts) { die __("Cannot run git format-patch from outside a repository\n") unless $repo; - push @files, $repo->command('format-patch', '-o', tempdir(CLEANUP => 1), @rev_list_opts); + require File::Temp; + push @files, $repo->command('format-patch', '-o', File::Temp::tempdir(CLEANUP => 1), @rev_list_opts); } @files = handle_backup_files(@files); @@ -780,9 +766,10 @@ sub get_patch_subject { if ($compose) { # Note that this does not need to be secure, but we will make a small # effort to have it be unique + require File::Temp; $compose_filename = ($repo ? - tempfile(".gitsendemail.msg.XXXXXX", DIR => $repo->repo_path()) : - tempfile(".gitsendemail.msg.XXXXXX", DIR => "."))[1]; + File::Temp::tempfile(".gitsendemail.msg.XXXXXX", DIR => $repo->repo_path()) : + File::Temp::tempfile(".gitsendemail.msg.XXXXXX", DIR => "."))[1]; open my $c, ">", $compose_filename or die sprintf(__("Failed to open for writing %s: %s"), $compose_filename, $!); @@ -889,6 +876,19 @@ sub get_patch_subject { do_edit(@files); } +sub term { + my $term = eval { + require Term::ReadLine; + $ENV{"GIT_SEND_EMAIL_NOTTY"} + ? Term::ReadLine->new('git-send-email', \*STDIN, \*STDOUT) + : Term::ReadLine->new('git-send-email'); + }; + if ($@) { + $term = FakeTerm->new("$@: going non-interactive"); + } + return $term; +} + sub ask { my ($prompt, %arg) = @_; my $valid_re = $arg{valid_re}; @@ -896,6 +896,7 @@ sub ask { my $confirm_only = $arg{confirm_only}; my $resp; my $i = 0; + my $term = term(); return defined $default ? $default : undef unless defined $term->IN and defined fileno($term->IN) and defined $term->OUT and defined fileno($term->OUT); @@ -1076,6 +1077,7 @@ sub extract_valid_address { return $address if ($address =~ /^($local_part_regexp)$/); $address =~ s/^\s*<(.*)>\s*$/$1/; + my $have_email_valid = eval { require Email::Valid; 1 }; if ($have_email_valid) { return scalar Email::Valid->address($address); } @@ -1135,7 +1137,8 @@ sub validate_address_list { sub make_message_id { my $uniq; if (!defined $message_id_stamp) { - $message_id_stamp = strftime("%Y%m%d%H%M%S.$$", gmtime(time)); + require POSIX; + $message_id_stamp = POSIX::strftime("%Y%m%d%H%M%S.$$", gmtime(time)); $message_id_serial = 0; } $message_id_serial++; @@ -1305,6 +1308,7 @@ sub valid_fqdn { sub maildomain_net { my $maildomain; + require Net::Domain; my $domain = Net::Domain::domainname(); $maildomain = $domain if valid_fqdn($domain); @@ -1315,6 +1319,7 @@ sub maildomain_mta { my $maildomain; for my $host (qw(mailhost localhost)) { + require Net::SMTP; my $smtp = Net::SMTP->new($host); if (defined $smtp) { my $domain = $smtp->domain; @@ -1994,13 +1999,15 @@ sub validate_patch { if ($repo) { my $hooks_path = $repo->command_oneline('rev-parse', '--git-path', 'hooks'); - my $validate_hook = catfile($hooks_path, + require File::Spec; + my $validate_hook = File::Spec->catfile($hooks_path, 'sendemail-validate'); my $hook_error; if (-x $validate_hook) { - my $target = abs_path($fn); + require Cwd; + my $target = Cwd::abs_path($fn); # The hook needs a correct cwd and GIT_DIR. - my $cwd_save = cwd(); + my $cwd_save = Cwd::cwd(); chdir($repo->wc_path() or $repo->repo_path()) or die("chdir: $!"); local $ENV{"GIT_DIR"} = $repo->repo_path(); -- 2.32.0.rc1.458.gd885d4f985c
Instead of unconditionally requiring modules such as File::Spec, let's only load them when needed. This speeds up code that only needs a subset of the features Git.pm provides. This brings a plain invocation of "git send-email" down from 52/37 loaded modules under NO_GETTEXT=[|Y] to 39/18, and it now takes ~60-~70ms instead of ~80-~90ms. The runtime of t9001-send-email.sh test is down to ~13s from ~15s. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- perl/Git.pm | 32 ++++++++++++++++---------------- 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/perl/Git.pm b/perl/Git.pm index 02eacef0c2a..5562c0cede2 100644 --- a/perl/Git.pm +++ b/perl/Git.pm @@ -11,9 +11,6 @@ package Git; use strict; use warnings $ENV{GIT_PERL_FATAL_WARNINGS} ? qw(FATAL all) : (); -use File::Temp (); -use File::Spec (); - BEGIN { our ($VERSION, @ISA, @EXPORT, @EXPORT_OK); @@ -103,12 +100,9 @@ =head1 DESCRIPTION =cut -use Carp qw(carp croak); # but croak is bad - throw instead +sub carp { require Carp; goto &Carp::carp } +sub croak { require Carp; goto &Carp::croak } use Git::LoadCPAN::Error qw(:try); -use Cwd qw(abs_path cwd); -use IPC::Open2 qw(open2); -use Fcntl qw(SEEK_SET SEEK_CUR); -use Time::Local qw(timegm); } @@ -191,13 +185,15 @@ sub repository { $dir = undef; }; + require Cwd; if ($dir) { + require File::Spec; File::Spec->file_name_is_absolute($dir) or $dir = $opts{Directory} . '/' . $dir; - $opts{Repository} = abs_path($dir); + $opts{Repository} = Cwd::abs_path($dir); # If --git-dir went ok, this shouldn't die either. my $prefix = $search->command_oneline('rev-parse', '--show-prefix'); - $dir = abs_path($opts{Directory}) . '/'; + $dir = Cwd::abs_path($opts{Directory}) . '/'; if ($prefix) { if (substr($dir, -length($prefix)) ne $prefix) { throw Error::Simple("rev-parse confused me - $dir does not have trailing $prefix"); @@ -223,7 +219,7 @@ sub repository { throw Error::Simple("fatal: Not a git repository: $dir"); } - $opts{Repository} = abs_path($dir); + $opts{Repository} = Cwd::abs_path($dir); } delete $opts{Directory}; @@ -408,10 +404,12 @@ sub command_bidi_pipe { my $cwd_save = undef; if ($self) { shift; - $cwd_save = cwd(); + require Cwd; + $cwd_save = Cwd::cwd(); _setup_git_cmd_env($self); } - $pid = open2($in, $out, 'git', @_); + require IPC::Open2; + $pid = IPC::Open2::open2($in, $out, 'git', @_); chdir($cwd_save) if $cwd_save; return ($pid, $in, $out, join(' ', @_)); } @@ -538,7 +536,8 @@ sub get_tz_offset { my $t = shift || time; my @t = localtime($t); $t[5] += 1900; - my $gm = timegm(@t); + require Time::Local; + my $gm = Time::Local::timegm(@t); my $sign = qw( + + - )[ $gm <=> $t ]; return sprintf("%s%02d%02d", $sign, (gmtime(abs($t - $gm)))[2,1]); } @@ -1340,6 +1339,7 @@ sub _temp_cache { my $n = $name; $n =~ s/\W/_/g; # no strange chars + require File::Temp; ($$temp_fd, $fname) = File::Temp::tempfile( "Git_${n}_XXXXXX", UNLINK => 1, DIR => $tmpdir, ) or throw Error::Simple("couldn't open new temp file"); @@ -1362,9 +1362,9 @@ sub temp_reset { truncate $temp_fd, 0 or throw Error::Simple("couldn't truncate file"); - sysseek($temp_fd, 0, SEEK_SET) and seek($temp_fd, 0, SEEK_SET) + sysseek($temp_fd, 0, Fcntl::SEEK_SET()) and seek($temp_fd, 0, Fcntl::SEEK_SET()) or throw Error::Simple("couldn't seek to beginning of file"); - sysseek($temp_fd, 0, SEEK_CUR) == 0 and tell($temp_fd) == 0 + sysseek($temp_fd, 0, Fcntl::SEEK_CUR()) == 0 and tell($temp_fd) == 0 or throw Error::Simple("expected file position to be reset"); } -- 2.32.0.rc1.458.gd885d4f985c
Optimize the startup time of git-send-email by using an amended config_regexp() function to retrieve the list of config keys and values we're interested in. For boolean keys we can handle the [true|false] case ourselves, and the "--get" case didn't need any parsing. Let's leave "--path" and other "--bool" cases to "git config". I'm not bothering with the "undef" or "" case (true and false, respectively), let's just punt on those and others and have "git config --type=bool" handle it. The "grep { defined } @values" here covers a rather subtle case. For list values such as sendemail.to it is possible as with any other config key to provide a plain "-c sendemail.to", i.e. to set the key as a boolean true. In that case the Git::config() API will return an empty string, but this new parser will correctly return "undef". However, that means we can end up with "undef" in the middle of a list. E.g. for sendemail.smtpserveroption in conjuction with sendemail.smtpserver as a path this would have produce a warning. For most of the other keys we'd behave the same despite the subtle change in the value, e.g. sendemail.to would behave the same because Mail::Address->parse() happens to return an empty list if fed "undef". For the boolean values we were already prepared to handle these variables being initialized as undef anyway. This brings the runtime of "git send-email" from ~60-~70ms to a very steady ~40ms on my test box. We now run just one "git config" invocation on startup instead of 8, the exact number will differ based on the local sendemail.* config. I happen to have 8 of those set. This brings the runtime of t9001-send-email.sh from ~13s down to ~12s for me. The change there is less impressive as many of those tests set various config values, and we're also getting to the point of diminishing returns for optimizing "git send-email" itself. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 29 +++++++++++++++++++++-------- 1 file changed, 21 insertions(+), 8 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index a8949c9d313..57911386835 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -334,7 +334,11 @@ sub read_config { my $target = $config_bool_settings{$setting}; my $key = "$prefix.$setting"; next unless exists $known_keys->{$key}; - my $v = Git::config_bool(@repo, $key); + my $v = (@{$known_keys->{$key}} == 1 && + (defined $known_keys->{$key}->[0] && + $known_keys->{$key}->[0] =~ /^(?:true|false)$/s)) + ? $known_keys->{$key}->[0] eq 'true' + : Git::config_bool(@repo, $key); next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -363,13 +367,13 @@ sub read_config { my $key = "$prefix.$setting"; next unless exists $known_keys->{$key}; if (ref($target) eq "ARRAY") { - my @values = Git::config(@repo, $key); - next unless @values; + my @values = @{$known_keys->{$key}}; + @values = grep { defined } @values; next if $configured->{$setting}++; @$target = @values; } else { - my $v = Git::config(@repo, $key); + my $v = $known_keys->{$key}->[0]; next unless defined $v; next if $configured->{$setting}++; $$target = $v; @@ -381,12 +385,19 @@ sub config_regexp { my ($regex) = @_; my @ret; eval { - @ret = Git::command( + my $ret = Git::command( 'config', - '--name-only', + '--null', '--get-regexp', $regex, ); + @ret = map { + # We must always return ($k, $v) here, since + # empty config values will be just "key\0", + # not "key\nvalue\0". + my ($k, $v) = split /\n/, $_, 2; + ($k, $v); + } split /\0/, $ret; 1; } or do { # If we have no keys we're OK, otherwise re-throw @@ -399,8 +410,10 @@ sub config_regexp { # parses 'bool' etc.) by only doing so for config keys that exist. my %known_config_keys; { - my @known_config_keys = config_regexp("^sende?mail[.]"); - @known_config_keys{@known_config_keys} = (); + my @kv = config_regexp("^sende?mail[.]"); + while (my ($k, $v) = splice @kv, 0, 2) { + push @{$known_config_keys{$k}} => $v; + } } # sendemail.identity yields to --identity. We must parse this -- 2.32.0.rc1.458.gd885d4f985c
It has been pointed out[1] that cwd() invokes "pwd(1)" while getcwd() is a Perl-native XS function. For what we're using these for we can use getcwd(). The performance difference is miniscule, we're saving on the order of a millisecond or so, see [2] below for the benchmark. I don't think this matters in practice for optimizing git-send-email or perl execution (unlike the patches leading up to this one). But let's do it regardless of that, if only so we don't have to think about this as a low-hanging fruit anymore. 1. https://lore.kernel.org/git/20210512180517.GA11354@dcvr/ 2. $ perl -MBenchmark=:all -MCwd -wE 'cmpthese(10000, { getcwd => sub { getcwd }, cwd => sub { cwd }, pwd => sub { system "pwd >/dev/null" }})' (warning: too few iterations for a reliable count) Rate pwd cwd getcwd pwd 982/s -- -48% -100% cwd 1890/s 92% -- -100% getcwd 10000000000000000000/s 1018000000000000000% 529000000000000064% - Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- git-send-email.perl | 2 +- perl/Git.pm | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 57911386835..0efe85c0b02 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -2020,7 +2020,7 @@ sub validate_patch { require Cwd; my $target = Cwd::abs_path($fn); # The hook needs a correct cwd and GIT_DIR. - my $cwd_save = Cwd::cwd(); + my $cwd_save = Cwd::getcwd(); chdir($repo->wc_path() or $repo->repo_path()) or die("chdir: $!"); local $ENV{"GIT_DIR"} = $repo->repo_path(); diff --git a/perl/Git.pm b/perl/Git.pm index 5562c0cede2..090a7df63fc 100644 --- a/perl/Git.pm +++ b/perl/Git.pm @@ -405,7 +405,7 @@ sub command_bidi_pipe { if ($self) { shift; require Cwd; - $cwd_save = Cwd::cwd(); + $cwd_save = Cwd::getcwd(); _setup_git_cmd_env($self); } require IPC::Open2; -- 2.32.0.rc1.458.gd885d4f985c
Jeff King wrote: > On Thu, May 20, 2021 at 10:18:57AM +0200, Ævar Arnfjörð Bjarmason wrote: > > > A re-roll of [1], the work I based v1 on top of has landed in master, > > so this has been rebased on master. > > > > The changes here are minor, just a typo fix / commit message > > clarification, moving "require" closer to where it's used, and finally > > a new 10/10 patch to s/cwd/getcwd/g. > > I like all of this, except for the change in the interface of > Git::config_regexp(). You mention that it's new-ish, and probably not in > wide use. And I agree that's probably true. But it feels like violating > a principle of not breaking APIs, and we should stick to that principle > and not bend it for "well, it's not that old an API". APIs are broken all the time. That's why they have versions. Anyway, I find it odd that the git project cares about an API promise of an outdated language steadily in decline (it's peak was in 2004 [1]), and yet it does not provide an API of its own, like virtually all important software. Shouldn't we care more about a C API than a Perl API? (a non-existent API is orders of magnitude worse than an API that broke once) [1] https://www.tiobe.com/tiobe-index/perl/ -- Felipe Contreras
Ævar Arnfjörð Bjarmason wrote: > Returning a flattened list is idiomatic in Perl, it means that a caller > can do any of: > > # I only care about the last value for a key, or only about > # existence checks > my %hash = func(); I was staying on the sideline because I don't know what's idiomatic in Perl, but Perl and Ruby share a lot in common (one could say Perl is the grandfather of Ruby), and I do know very well what's idiomatic in Ruby. In perl you can do $ENV{'USER'}, and: while (my ($k, $v) = each %ENV) { print "$k = $v\n"; } Obviously it's idiomatic to use hashes this way [1]. It was a waste for Git::config_regexp to not do the sensible thing here. You can do exactly the same in Ruby: ENV['USER'] ENV.each { |k, v| print "#{k} = #{v}\n" } And the way I would parse these configurations in Ruby is something like: c = `git config -l -z`.split("\0").map { |e| e.split("\n") }.to_h c['sendemail.smtpserver'] And this just gave me an idea... [1] https://perldoc.perl.org/functions/each -- Felipe Contreras
Ævar Arnfjörð Bjarmason wrote:
> Optimize the time git-send-email takes to do even the simplest of
> things (such as serving up "-h") from around ~150ms to ~80ms-~90ms by
> lazily loading the modules it requires.
>
> Before this change Devel::TraceUse would report 99/97 used modules
> under NO_GETTEXT=[|Y], respectively. Now it's 52/37. It now takes ~15s
> to run t9001-send-email.sh, down from ~20s.
>
> Changing File::Spec::Functions::{catdir,catfile} to invoking class
> methods on File::Spec itself is idiomatic. See [1] for a more
> elaborate explanation, the resulting code behaves the same way, just
> without the now-pointless function wrapper.
I would reference `man File::Spec` rather than an email.
And while this change makes sense, I think it should be split in two.
Instead of doing:
-use Term::ANSIColor;
-print color("reset"), "\n";
+require Term::ANSIColor;
+print Term::ANSIColor::color("reset"), "\n";
We could do this in one patch:
-print color("reset"), "\n";
+print Term::ANSIColor::color("reset"), "\n";
That is just no-op noise that we can mostly ignore in the review.
Then the actual change to require Term::ANSIColor selectively would be
much simpler to see.
Cheers.
--
Felipe Contreras
Ævar Arnfjörð Bjarmason wrote:
> Change calls like "__ 'foo'" to "__('foo')" so the Perl compiler
> doesn't have to guess that "__" is a function. This makes the code
> more readable.
Agreed. Ruby has something similar:
foo arg1, arg2
foo(arg1, arg2)
As a C programmer I find the former a quirk of the language, and much
harder to read.
If it's a function call, present it as such.
--
Felipe Contreras
Ævar Arnfjörð Bjarmason wrote:
> Hopefully the final iteration. Updates a commit message to explain why
> I moved away from File::Spec::Functions, rebases on master, and
> explains and deals with the "undef in config" issue Jeff King noted.
The series looks good to me, except the one comment about splitting one
patch, but not a deal-breaker.
For what it's worth:
Reviewed-by: Felipe Contreras <felipe.contreras@gmail.com>
--
Felipe Contreras
On Fri, May 28 2021, Felipe Contreras wrote:
> Ævar Arnfjörð Bjarmason wrote:
>> Optimize the time git-send-email takes to do even the simplest of
>> things (such as serving up "-h") from around ~150ms to ~80ms-~90ms by
>> lazily loading the modules it requires.
>>
>> Before this change Devel::TraceUse would report 99/97 used modules
>> under NO_GETTEXT=[|Y], respectively. Now it's 52/37. It now takes ~15s
>> to run t9001-send-email.sh, down from ~20s.
>>
>> Changing File::Spec::Functions::{catdir,catfile} to invoking class
>> methods on File::Spec itself is idiomatic. See [1] for a more
>> elaborate explanation, the resulting code behaves the same way, just
>> without the now-pointless function wrapper.
>
> I would reference `man File::Spec` rather than an email.
>
> And while this change makes sense, I think it should be split in two.
>
> Instead of doing:
>
> -use Term::ANSIColor;
> -print color("reset"), "\n";
> +require Term::ANSIColor;
> +print Term::ANSIColor::color("reset"), "\n";
>
> We could do this in one patch:
>
> -print color("reset"), "\n";
> +print Term::ANSIColor::color("reset"), "\n";
>
> That is just no-op noise that we can mostly ignore in the review.
>
> Then the actual change to require Term::ANSIColor selectively would be
> much simpler to see.
You mean do the change from imported functions in one commit, and then
sprinkle the "require" in another one?
I think it's clearer this way, you can't really assert that it worked as
intended (i.e. you have no more imports) without the s/use/require/g, so
it makes sense to do both as one atomic change.
On Fri, May 28 2021, Felipe Contreras wrote:
> Ævar Arnfjörð Bjarmason wrote:
>> Change calls like "__ 'foo'" to "__('foo')" so the Perl compiler
>> doesn't have to guess that "__" is a function. This makes the code
>> more readable.
>
> Agreed. Ruby has something similar:
>
> foo arg1, arg2
> foo(arg1, arg2)
>
> As a C programmer I find the former a quirk of the language, and much
> harder to read.
>
> If it's a function call, present it as such.
I'm not too familiar with Ruby, but the important part here in Perl is
that it's not a "foo()" v.s. "foo" shortcut for a function invocation,
it's unambiguous whether that is a function and what its arguments are
until at that point in the program. Perl has dynamic grammar, you can't
parse it with a regular grammar without executing it.
On Fri, May 28 2021, Felipe Contreras wrote: > Ævar Arnfjörð Bjarmason wrote: >> Returning a flattened list is idiomatic in Perl, it means that a caller >> can do any of: >> >> # I only care about the last value for a key, or only about >> # existence checks >> my %hash = func(); > > I was staying on the sideline because I don't know what's idiomatic in > Perl, but Perl and Ruby share a lot in common (one could say Perl is the > grandfather of Ruby), and I do know very well what's idiomatic in Ruby. > > In perl you can do $ENV{'USER'}, and: > > while (my ($k, $v) = each %ENV) { > print "$k = $v\n"; > } > > Obviously it's idiomatic to use hashes this way [1]. For what it's worth idiomatic/good idea and "has an example in the perl documentation" unfortunately aren't always aligned. A lot of experienced Perl programmers avoid each() like the plague: http://blogs.perl.org/users/rurban/2014/04/do-not-use-each.html > It was a waste for Git::config_regexp to not do the sensible thing here. FWIW we're commenting on a v2 of a series that's at v5 now, and doesn't use config_regexp() at all, the relevant code is inlined in git-send-email.perl now. > You can do exactly the same in Ruby: ENV['USER'] > > ENV.each { |k, v| print "#{k} = #{v}\n" } > > And the way I would parse these configurations in Ruby is something like: > c = `git config -l -z`.split("\0").map { |e| e.split("\n") }.to_h > c['sendemail.smtpserver'] > > And this just gave me an idea... I'd probably do it that way in Ruby, but not in Perl. Things that superficially look the same in two languages can have completely different behaviors, a "hash" isn't a single type of data structure in these programming languages. In particular Ruby doesn't have hshes in the Perl sense of the word, it has an ordered key-value pair structure (IIRC under the hood they're hashes + a double linked list). Thus you can use it for things like parsing a key=>value text file where the key is unique and the order is important. In Perl hashes are only meant for key-value lookup, they are not ordered, and are actually actively randomly ordered for security reasons. In any modern version inserting a new key will have an avalance effect of completely changing the order. It's not even stable across invocations: $ perl -wE 'my %h; for ("a".."z") { $h{$_} = $_; say keys %h }' a ab bca dcba daebc cbaedf aecbfdg dgfcbaeh [...] The other important distinction (but I'm not so sure about Ruby here) is that Perl doesn't have any way to pass a hash or any other structure to another function, everything is flattened and pushed onto the stack. To pass a "hash" you're not passing the hash, but a "flattened" pointer to it on the stack. Thus passing and making use of these flattened values is idiomatic in Perl in a way that doesn't exist in a lot of other languages. In some other languages a function has to choose whether it's returning an array or a hash, in Perl you can just push the "flattened" items that make up the array on the stack, and have the caller decide if they're pushing those stack items into an array, or to a hash if they expect it to be meaningful as key-value pairs. In the context of Git's config format doing that is the perfect fit for config values, our config values *are* ordered, but they are also sort-of hashes, but whether it's "all values" or "last value wins" (or anything else, that's just the common ones) depends on the key/user. So by having a list of key-value pairs on the stack you can choose to put it into an array if you don't want to lose information, or put it into a hash if all you care about is "last key wins", or "I'm going to check for key existence". I think that in many other languages that wouldn't make any sense, and you'd always return a structure like: [ key => [zero or more values], [...] ] Or whatever, the caller can also unambiguously interpret those, but unlike Perl you'd need to write something to explicitly iterate the returned value (or a helper) to get it into a hash or a "flattened" array. In Perl it's trivial due to the "everything is on the stack" semantics. Anyway, all that being said the part we're talking about as a trivial part of this larger series. I'd much prefer to have it just land as "good enough" at this point. It works, we can always tweak it further later if there's a need to do that.
Ævar Arnfjörð Bjarmason wrote: > > On Fri, May 28 2021, Felipe Contreras wrote: > > > Ævar Arnfjörð Bjarmason wrote: > >> Optimize the time git-send-email takes to do even the simplest of > >> things (such as serving up "-h") from around ~150ms to ~80ms-~90ms by > >> lazily loading the modules it requires. > >> > >> Before this change Devel::TraceUse would report 99/97 used modules > >> under NO_GETTEXT=[|Y], respectively. Now it's 52/37. It now takes ~15s > >> to run t9001-send-email.sh, down from ~20s. > >> > >> Changing File::Spec::Functions::{catdir,catfile} to invoking class > >> methods on File::Spec itself is idiomatic. See [1] for a more > >> elaborate explanation, the resulting code behaves the same way, just > >> without the now-pointless function wrapper. > > > > I would reference `man File::Spec` rather than an email. > > > > And while this change makes sense, I think it should be split in two. > > > > Instead of doing: > > > > -use Term::ANSIColor; > > -print color("reset"), "\n"; > > +require Term::ANSIColor; > > +print Term::ANSIColor::color("reset"), "\n"; > > > > We could do this in one patch: > > > > -print color("reset"), "\n"; > > +print Term::ANSIColor::color("reset"), "\n"; > > > > That is just no-op noise that we can mostly ignore in the review. > > > > Then the actual change to require Term::ANSIColor selectively would be > > much simpler to see. > > You mean do the change from imported functions in one commit, and then > sprinkle the "require" in another one? Yes. > I think it's clearer this way, you can't really assert that it worked as > intended (i.e. you have no more imports) without the s/use/require/g, so > it makes sense to do both as one atomic change. It is clear, but there's some noise. And I don't see how you can assert there are no more imports from this patch, especially when the next patch gets rid of more imports. Moreover, it's already the case that the code uses namespaces in some cases: Net::Domain::domainname is one you did not have to convert. You would be doing the same for all other instances. Cheers. -- Felipe Contreras
Ævar Arnfjörð Bjarmason wrote:
>
> On Fri, May 28 2021, Felipe Contreras wrote:
>
> > Ævar Arnfjörð Bjarmason wrote:
> >> Change calls like "__ 'foo'" to "__('foo')" so the Perl compiler
> >> doesn't have to guess that "__" is a function. This makes the code
> >> more readable.
> >
> > Agreed. Ruby has something similar:
> >
> > foo arg1, arg2
> > foo(arg1, arg2)
> >
> > As a C programmer I find the former a quirk of the language, and much
> > harder to read.
> >
> > If it's a function call, present it as such.
>
> I'm not too familiar with Ruby, but the important part here in Perl is
> that it's not a "foo()" v.s. "foo" shortcut for a function invocation,
> it's unambiguous whether that is a function and what its arguments are
> until at that point in the program. Perl has dynamic grammar, you can't
> parse it with a regular grammar without executing it.
Yes, that is the important part, but not the only part.
I'm saying in addition it's easier to read for most git.git developers.
--
Felipe Contreras
Ævar Arnfjörð Bjarmason wrote: > > On Fri, May 28 2021, Felipe Contreras wrote: > > > Ævar Arnfjörð Bjarmason wrote: > >> Returning a flattened list is idiomatic in Perl, it means that a caller > >> can do any of: > >> > >> # I only care about the last value for a key, or only about > >> # existence checks > >> my %hash = func(); > > > > I was staying on the sideline because I don't know what's idiomatic in > > Perl, but Perl and Ruby share a lot in common (one could say Perl is the > > grandfather of Ruby), and I do know very well what's idiomatic in Ruby. > > > > In perl you can do $ENV{'USER'}, and: > > > > while (my ($k, $v) = each %ENV) { > > print "$k = $v\n"; > > } > > > > Obviously it's idiomatic to use hashes this way [1]. > > For what it's worth idiomatic/good idea and "has an example in the perl > documentation" unfortunately aren't always aligned. A lot of experienced > Perl programmers avoid each() like the plague: > http://blogs.perl.org/users/rurban/2014/04/do-not-use-each.html Perl is an old language, and each() was introduced in 2010, it's expected that some old-timers would not adapt to the new idioms. BTW, Ruby borrowed a lot from Perl, but I'm pretty sure Perl borrowed each() from Ruby. Untilmately it doesn't matter what you use to traverse %ENV, my point is that it's a hash. > > It was a waste for Git::config_regexp to not do the sensible thing here. > > FWIW we're commenting on a v2 of a series that's at v5 now, and doesn't > use config_regexp() at all, the relevant code is inlined in > git-send-email.perl now. I know, I've been following the threads. I'm trying to say it's a shame Git::config_regexp does not do the sensible thing. > > You can do exactly the same in Ruby: ENV['USER'] > > > > ENV.each { |k, v| print "#{k} = #{v}\n" } > > > > And the way I would parse these configurations in Ruby is something like: > > > c = `git config -l -z`.split("\0").map { |e| e.split("\n") }.to_h > > c['sendemail.smtpserver'] > > > > And this just gave me an idea... > > I'd probably do it that way in Ruby, but not in Perl. > > Things that superficially look the same in two languages can have > completely different behaviors, a "hash" isn't a single type of data > structure in these programming languages. > > In particular Ruby doesn't have hshes in the Perl sense of the word, it > has an ordered key-value pair structure (IIRC under the hood they're > hashes + a double linked list). > > Thus you can use it for things like parsing a key=>value text file where > the key is unique and the order is important. > > In Perl hashes are only meant for key-value lookup, they are not > ordered, and are actually actively randomly ordered for security > reasons. In any modern version inserting a new key will have an avalance > effect of completely changing the order. It's not even stable across > invocations: > > $ perl -wE 'my %h; for ("a".."z") { $h{$_} = $_; say keys %h }' > a > ab > bca > dcba > daebc > cbaedf > aecbfdg > dgfcbaeh > [...] This used to be the case in Ruby too. The order of hashes was not guaranteed. The situation is more complicated because not only do you have different versions, but you have different implementations. AFAIK the Ruby language specification doesn't say anything about ordering, although basically all implementations do order. > The other important distinction (but I'm not so sure about Ruby here) is > that Perl doesn't have any way to pass a hash or any other structure to > another function, everything is flattened and pushed onto the stack. > > To pass a "hash" you're not passing the hash, but a "flattened" pointer > to it on the stack. > > Thus passing and making use of these flattened values is idiomatic in > Perl in a way that doesn't exist in a lot of other languages. In some > other languages a function has to choose whether it's returning an array > or a hash, in Perl you can just push the "flattened" items that make up > the array on the stack, and have the caller decide if they're pushing > those stack items into an array, or to a hash if they expect it to be > meaningful as key-value pairs. Yeah, that's something that wasn't borrowed. In Ruby everything is an object. > In the context of Git's config format doing that is the perfect fit for > config values, our config values *are* ordered, but they are also > sort-of hashes, but whether it's "all values" or "last value wins" (or > anything else, that's just the common ones) depends on the key/user. > > So by having a list of key-value pairs on the stack you can choose to > put it into an array if you don't want to lose information, or put it > into a hash if all you care about is "last key wins", or "I'm going to > check for key existence". > > I think that in many other languages that wouldn't make any sense, and > you'd always return a structure like: > > [ > key => [zero or more values], > [...] > ] > > Or whatever, the caller can also unambiguously interpret those, but > unlike Perl you'd need to write something to explicitly iterate the > returned value (or a helper) to get it into a hash or a "flattened" > array. In Perl it's trivial due to the "everything is on the stack" > semantics. Indeed, but my point is that it's a hash for all intents and purposes: my %hash = func(); And it makes sense for it to be a hash, just like %ENV. And although the internals are different something very close would happen in Ruby: hash = func().to_h > Anyway, all that being said the part we're talking about as a trivial > part of this larger series. I'd much prefer to have it just land as > "good enough" at this point. It works, we can always tweak it further > later if there's a need to do that. Indeed, as I said, the entire patch series looks good to me. Cheers. -- Felipe Contreras
On Sat, May 29 2021, Felipe Contreras wrote:
> Ævar Arnfjörð Bjarmason wrote:
>>
>> On Fri, May 28 2021, Felipe Contreras wrote:
>>
>> > Ævar Arnfjörð Bjarmason wrote:
>> >> Returning a flattened list is idiomatic in Perl, it means that a caller
>> >> can do any of:
>> >>
>> >> # I only care about the last value for a key, or only about
>> >> # existence checks
>> >> my %hash = func();
>> >
>> > I was staying on the sideline because I don't know what's idiomatic in
>> > Perl, but Perl and Ruby share a lot in common (one could say Perl is the
>> > grandfather of Ruby), and I do know very well what's idiomatic in Ruby.
>> >
>> > In perl you can do $ENV{'USER'}, and:
>> >
>> > while (my ($k, $v) = each %ENV) {
>> > print "$k = $v\n";
>> > }
>> >
>> > Obviously it's idiomatic to use hashes this way [1].
>>
>> For what it's worth idiomatic/good idea and "has an example in the perl
>> documentation" unfortunately aren't always aligned. A lot of experienced
>> Perl programmers avoid each() like the plague:
>> http://blogs.perl.org/users/rurban/2014/04/do-not-use-each.html
>
> Perl is an old language, and each() was introduced in 2010, it's
> expected that some old-timers would not adapt to the new idioms.
each() has been in Perl since 1987 with perl v1.0, you must be confusing
it with something else.
In any case, the recommendation against it has nothing to do with its
age, it's that similar to strtok() it has global state. So e.g. you
can't safely iterate over a hash you're modifying with it, or if you
iterate over a top-level structure and pass that variable down to
another function, and it also uses each()...
Ævar Arnfjörð Bjarmason wrote: > > On Sat, May 29 2021, Felipe Contreras wrote: > > > Ævar Arnfjörð Bjarmason wrote: > >> > >> On Fri, May 28 2021, Felipe Contreras wrote: > >> > >> > Ævar Arnfjörð Bjarmason wrote: > >> >> Returning a flattened list is idiomatic in Perl, it means that a caller > >> >> can do any of: > >> >> > >> >> # I only care about the last value for a key, or only about > >> >> # existence checks > >> >> my %hash = func(); > >> > > >> > I was staying on the sideline because I don't know what's idiomatic in > >> > Perl, but Perl and Ruby share a lot in common (one could say Perl is the > >> > grandfather of Ruby), and I do know very well what's idiomatic in Ruby. > >> > > >> > In perl you can do $ENV{'USER'}, and: > >> > > >> > while (my ($k, $v) = each %ENV) { > >> > print "$k = $v\n"; > >> > } > >> > > >> > Obviously it's idiomatic to use hashes this way [1]. > >> > >> For what it's worth idiomatic/good idea and "has an example in the perl > >> documentation" unfortunately aren't always aligned. A lot of experienced > >> Perl programmers avoid each() like the plague: > >> http://blogs.perl.org/users/rurban/2014/04/do-not-use-each.html > > > > Perl is an old language, and each() was introduced in 2010, it's > > expected that some old-timers would not adapt to the new idioms. > > each() has been in Perl since 1987 with perl v1.0, you must be confusing > it with something else. I see. I read the each() documentation [1] too hastily: When called on a hash in list context, returns a 2-element list consisting of the key and value for the next element of a hash. In Perl 5.12 and later only... > In any case, the recommendation against it has nothing to do with its > age, it's that similar to strtok() it has global state. Yes, that's what I understood from the blog post you shared, but at least personally I never assume I can modify a hash like that. I see why some people need to be careful with it, but "avoid it like the plague" seems way too defensive programming to me. [1] https://perldoc.perl.org/functions/each -- Felipe Contreras
On Fri, May 28, 2021 at 11:23:39AM +0200, Ævar Arnfjörð Bjarmason wrote: > Hopefully the final iteration. Updates a commit message to explain why > I moved away from File::Spec::Functions, rebases on master, and > explains and deals with the "undef in config" issue Jeff King noted. Thanks. The solution was less invasive than I feared. I guess here: > @@ git-send-email.perl: sub read_config { > - my @values = Git::config(@repo, $key); > - next unless @values; > + my @values = @{$known_keys->{$key}}; > ++ @values = grep { defined } @values; > next if $configured->{$setting}++; > @$target = @values; > } > else { > - my $v = Git::config(@repo, $key); > -- next unless defined $v; > + my $v = $known_keys->{$key}->[0]; > + next unless defined $v; > next if $configured->{$setting}++; > $$target = $v; > - } we'd ignore such undef values, whereas presumably before they'd have triggered an error via Git::config(). I don't think it matters all that much either way, though. -Peff
On Mon, May 31 2021, Jeff King wrote:
> On Fri, May 28, 2021 at 11:23:39AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> Hopefully the final iteration. Updates a commit message to explain why
>> I moved away from File::Spec::Functions, rebases on master, and
>> explains and deals with the "undef in config" issue Jeff King noted.
>
> Thanks. The solution was less invasive than I feared.
>
> I guess here:
>
>> @@ git-send-email.perl: sub read_config {
>> - my @values = Git::config(@repo, $key);
>> - next unless @values;
>> + my @values = @{$known_keys->{$key}};
>> ++ @values = grep { defined } @values;
>> next if $configured->{$setting}++;
>> @$target = @values;
>> }
>> else {
>> - my $v = Git::config(@repo, $key);
>> -- next unless defined $v;
>> + my $v = $known_keys->{$key}->[0];
>> + next unless defined $v;
>> next if $configured->{$setting}++;
>> $$target = $v;
>> - }
>
> we'd ignore such undef values, whereas presumably before they'd have
> triggered an error via Git::config(). I don't think it matters all that
> much either way, though.
They didn't error before, they were treated the same as the empty
string.
This is because Git.pm uses "git config --get" to retrieve them, but we
now use "--list --null".
This ultimately comes down to a limitation of git-config's one-shot CLI
interface. You get the empty string and zero exit code for both of:
git -c foo.bar= config --get foo.bar
git -c foo.bar config --get foo.bar
Ignoring them here for list values is technically a behavior change if
we were treating this as a black box, but in reality we know that these
could only conceivable be useful for the bool values, for list values
like "list of e-mail addresses" it makes no sense to have that in a the
middle of a list, and we were implicitly ignoring them anyway before
when we processed the empty string.
On Mon, May 31, 2021 at 11:53:04AM +0200, Ævar Arnfjörð Bjarmason wrote: > > I guess here: > > > >> @@ git-send-email.perl: sub read_config { > >> - my @values = Git::config(@repo, $key); > >> - next unless @values; > >> + my @values = @{$known_keys->{$key}}; > >> ++ @values = grep { defined } @values; > >> next if $configured->{$setting}++; > >> @$target = @values; > >> } > >> else { > >> - my $v = Git::config(@repo, $key); > >> -- next unless defined $v; > >> + my $v = $known_keys->{$key}->[0]; > >> + next unless defined $v; > >> next if $configured->{$setting}++; > >> $$target = $v; > >> - } > > > > we'd ignore such undef values, whereas presumably before they'd have > > triggered an error via Git::config(). I don't think it matters all that > > much either way, though. > > They didn't error before, they were treated the same as the empty > string. > > This is because Git.pm uses "git config --get" to retrieve them, but we > now use "--list --null". > > This ultimately comes down to a limitation of git-config's one-shot CLI > interface. You get the empty string and zero exit code for both of: > > git -c foo.bar= config --get foo.bar > git -c foo.bar config --get foo.bar Ah, OK. Thanks for clarifying. > Ignoring them here for list values is technically a behavior change if > we were treating this as a black box, but in reality we know that these > could only conceivable be useful for the bool values, for list values > like "list of e-mail addresses" it makes no sense to have that in a the > middle of a list, and we were implicitly ignoring them anyway before > when we processed the empty string. Right, it was the list one I was most puzzled by. But if we were already conflating implicit-bool and the empty string, I think it's OK. I agree that all of this is something that _shouldn't_ happen in the first place, for well-formed config. -Peff
On Fri, May 28, 2021 at 11:23:41AM +0200, Ævar Arnfjörð Bjarmason wrote:
> +test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer' '
> + test_when_finished "test_unconfig sendemail.xmailer" &&
> + cat >>.git/config <<-\EOF &&
> + [sendemail]
> + xmailer
> + EOF
> + test_config sendemail.xmailer true &&
> + do_xmailer_test 1 "" &&
> + do_xmailer_test 0 "--no-xmailer" &&
> + do_xmailer_test 1 "--xmailer"
> +'
Is this test_config call in the middle of the snippet meant to be there?
It seems like it would wreck the test, since it will rewrite the file to
"xmailer = true". I'm guessing it's just leftover cruft from adapting
the earlier xmailer=true test?
(I wasn't looking carefully again at this series, but just happened to
be resolving a local conflict involving this hunk and noticed it).
-Peff