From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Georgios Kontaxis via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org,
"brian m. carlson" <sandals@crustytoothpaste.net>,
Georgios Kontaxis <geko1702+commits@99rst.org>
Subject: Re: [PATCH v6] gitweb: redacted e-mail addresses feature.
Date: Fri, 09 Apr 2021 00:43:19 +0200
Message-ID: <87eefkieig.fsf@evledraar.gmail.com>
In-Reply-To: <pull.910.v6.git.1616973963862.gitgitgadget@gmail.com>
On Mon, Mar 29 2021, Georgios Kontaxis via GitGitGadget wrote:
> [...]
> +email-privacy::
> + Redact e-mail addresses from the generated HTML, etc. content.
> + This obscures e-mail addresses retrieved from the author/committer
> + and comment sections of the Git log.
> + It is meant to hinder web crawlers that harvest and abuse addresses.
> + Such crawlers may not respect robots.txt.
> + Note that users and user tools also see the addresses as redacted.
> + If Gitweb is not the final step in a workflow then subsequent steps
> + may misbehave because of the redacted information they receive.
> + Disabled by default.
> +
> highlight::
> Server-side syntax highlight support in "blob" view. It requires
> `$highlight_bin` program to be available (see the description of
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 0959a782eccb..01c6faf88006 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -569,6 +569,15 @@ sub evaluate_uri {
> 'sub' => \&feature_extra_branch_refs,
> 'override' => 0,
> 'default' => []},
> +
> + # Redact e-mail addresses.
> +
> + # To enable system wide have in $GITWEB_CONFIG
> + # $feature{'email-privacy'}{'default'} = [1];
> + 'email-privacy' => {
> + 'sub' => sub { feature_bool('email-privacy', @_) },
> + 'override' => 1,
> + 'default' => [0]},
> );
>
> sub gitweb_get_feature {
> @@ -3449,6 +3458,13 @@ sub parse_date {
> return %date;
> }
>
> +sub hide_mailaddrs_if_private {
> + my $line = shift;
> + return $line unless gitweb_check_feature('email-privacy');
> + $line =~ s/<[^@>]+@[^>]+>/<redacted>/ig;
The /i here is redundant, since there is nothing on the LHS of the s///
that could case-fold. It doesn't harm anything either; just a small
note since it's new in v6...
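To illustrate: the pattern is made up only of the literals "<", "@" and
">" plus negated character classes, none of which contain a letter, so
case-insensitive matching cannot change what it matches. A quick sketch
to check that, using made-up sample strings that are not from the patch
or its tests:

```perl
use strict;
use warnings;

# Hypothetical sample inputs, chosen only for illustration.
my @samples = (
    'A U Thor <Author@Example.COM>',
    'Signed-off-by: C O Mitter <committer@example.com>',
    'no address here, just <angle brackets>',
    'two: <a@b.example> and <C@D.EXAMPLE>',
);

# Count inputs where the /i and non-/i substitutions disagree.
my $differences = 0;
for my $s (@samples) {
    (my $with_i    = $s) =~ s/<[^@>]+@[^>]+>/<redacted>/ig;
    (my $without_i = $s) =~ s/<[^@>]+@[^>]+>/<redacted>/g;
    $differences++ if $with_i ne $without_i;
    print "$without_i\n";
}
print "differences: $differences\n";
```

Both variants should produce identical output for every line, which is
why dropping the /i would be purely cosmetic.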
> + return $line;
> +}
> +
> sub parse_tag {
> my $tag_id = shift;
> my %tag;
> @@ -3465,7 +3481,7 @@ sub parse_tag {
> } elsif ($line =~ m/^tag (.+)$/) {
> $tag{'name'} = $1;
> } elsif ($line =~ m/^tagger (.*) ([0-9]+) (.*)$/) {
> - $tag{'author'} = $1;
> + $tag{'author'} = hide_mailaddrs_if_private($1);
> $tag{'author_epoch'} = $2;
> $tag{'author_tz'} = $3;
> if ($tag{'author'} =~ m/^([^<]+) <([^>]*)>/) {
> @@ -3513,7 +3529,7 @@ sub parse_commit_text {
> } elsif ((!defined $withparents) && ($line =~ m/^parent ($oid_regex)$/)) {
> push @parents, $1;
> } elsif ($line =~ m/^author (.*) ([0-9]+) (.*)$/) {
> - $co{'author'} = to_utf8($1);
> + $co{'author'} = hide_mailaddrs_if_private(to_utf8($1));
> $co{'author_epoch'} = $2;
> $co{'author_tz'} = $3;
> if ($co{'author'} =~ m/^([^<]+) <([^>]*)>/) {
> @@ -3523,7 +3539,7 @@ sub parse_commit_text {
> $co{'author_name'} = $co{'author'};
> }
> } elsif ($line =~ m/^committer (.*) ([0-9]+) (.*)$/) {
> - $co{'committer'} = to_utf8($1);
> + $co{'committer'} = hide_mailaddrs_if_private(to_utf8($1));
> $co{'committer_epoch'} = $2;
> $co{'committer_tz'} = $3;
> if ($co{'committer'} =~ m/^([^<]+) <([^>]*)>/) {
> @@ -3568,9 +3584,10 @@ sub parse_commit_text {
> if (! defined $co{'title'} || $co{'title'} eq "") {
> $co{'title'} = $co{'title_short'} = '(no commit message)';
> }
> - # remove added spaces
> + # remove added spaces, redact e-mail addresses if applicable.
> foreach my $line (@commit_lines) {
> $line =~ s/^ //;
> + $line = hide_mailaddrs_if_private($line);
> }
> $co{'comment'} = \@commit_lines;
>
> @@ -7489,7 +7506,8 @@ sub git_log_generic {
> -accesskey => "n", -title => "Alt-n"}, "next");
> }
> my $patch_max = gitweb_get_feature('patches');
> - if ($patch_max && !defined $file_name) {
> + if ($patch_max && !defined $file_name &&
> + !gitweb_check_feature('email-privacy')) {
> if ($patch_max < 0 || @commitlist <= $patch_max) {
> $paging_nav .= " ⋅ " .
> $cgi->a({-href => href(action=>"patches", -replay=>1)},
> @@ -7550,7 +7568,8 @@ sub git_commit {
> } @$parents ) .
> ')';
> }
> - if (gitweb_check_feature('patches') && @$parents <= 1) {
> + if (gitweb_check_feature('patches') && @$parents <= 1 &&
> + !gitweb_check_feature('email-privacy')) {
> $formats_nav .= " | " .
> $cgi->a({-href => href(action=>"patch", -replay=>1)},
> "patch");
> @@ -7863,7 +7882,8 @@ sub git_commitdiff {
> $formats_nav =
> $cgi->a({-href => href(action=>"commitdiff_plain", -replay=>1)},
> "raw");
> - if ($patch_max && @{$co{'parents'}} <= 1) {
> + if ($patch_max && @{$co{'parents'}} <= 1 &&
> + !gitweb_check_feature('email-privacy')) {
> $formats_nav .= " | " .
> $cgi->a({-href => href(action=>"patch", -replay=>1)},
> "patch");
I haven't run this, and I hadn't kept up for a few rounds. I'm happy to
see the pos/while etc. looping gone; this LGTM.
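For anyone who wants to try the substitution outside of gitweb, here is
a minimal standalone sketch. It assumes the feature lookup can be
replaced by a plain variable ($email_privacy is a stand-in for
gitweb_check_feature('email-privacy'), not something in the patch),
drops the /i as noted above, and uses a made-up author string:

```perl
use strict;
use warnings;

# Stand-in for gitweb_check_feature('email-privacy'); in gitweb the
# value comes from the %feature configuration instead.
my $email_privacy = 1;

sub hide_mailaddrs_if_private {
    my $line = shift;
    return $line unless $email_privacy;
    # Replace every <local@domain> with a fixed placeholder.
    $line =~ s/<[^@>]+@[^>]+>/<redacted>/g;
    return $line;
}

# Made-up "author" value of the kind parse_commit_text passes in.
my $author = hide_mailaddrs_if_private('A U Thor <author@example.com>');

# The same name/e-mail split the callers apply after redaction.
my ($name, $email) = $author =~ m/^([^<]+) <([^>]*)>/;

print "author: $author\n";
print "name:   $name\n";
print "email:  $email\n";
```

Since the placeholder keeps the angle brackets, the existing name and
e-mail split still matches after redaction, and the e-mail field simply
ends up holding the placeholder text.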