All of lore.kernel.org
 help / color / mirror / Atom feed
From: Justin Stitt <justinstitt@google.com>
To: Joe Perches <joe@perches.com>
Cc: linux-kernel@vger.kernel.org, Kees Cook <keescook@chromium.org>,
	Nick Desaulniers <ndesaulniers@google.com>
Subject: Re: [PATCH] get_maintainer/MAINTAINERS: confine K content matching to patches
Date: Thu, 5 Oct 2023 11:06:35 -0700	[thread overview]
Message-ID: <CAFhGd8oTHFDQ05M++E3ggAvs0567w5fSxovumX+vs8YXT8VXTA@mail.gmail.com> (raw)
In-Reply-To: <3dca40b677dd2fef979a5a581a2db91df2c21801.camel@perches.com>

On Wed, Oct 4, 2023 at 7:40 PM Joe Perches <joe@perches.com> wrote:
>
> On Wed, 2023-10-04 at 21:21 +0000, Justin Stitt wrote:
> > The current behavior of K: is a tad bit noisy. It matches against the
> > entire contents of files instead of just against the contents of a
> > patch.
> >
> > This means that a patch with a single character change (fixing a typo or
> > whitespace or something) would still to/cc maintainers and lists if the
> > affected file matched against the regex pattern given in K:. For
> > example, if a file has the word "clang" in it then every single patch
> > touching that file will to/cc Nick, Nathan and some lists.
> >
> > Let's change this behavior to only content match against patches
> > (subjects, message, diff) as this is what most people expect the
> > behavior already is. Most users of "K:" would prefer patch-only content
> > matching. If this is not the case let's add a new matching type as
> > proposed in [1].
>
> I'm glad to know  you are coming around to my suggestion.
:)

>
> I believe the file-based keyword matching should _not_ be
> removed and the option should be added for it like I suggested.

Having a command line flag allowing get_maintainer.pl
users to decide the behavior of K: is weird to me. If I'm a maintainer setting
my K: in MAINTAINERS I want some sort of consistent behavior. Some
patches will start hitting mailing list that DO have keywords in the patch
and others, confusingly, not. I understand setting the default state of
that flag is probably good enough as it is what 99.9% of users will use.

What do you think about this issue in particular, though?

To note, we get some speed-up here as pattern matching a patch that
touches lots of files would result in searching all of them in their
entirety. Just removing this behavior _might_ have a measurable
speed-up for patch series touching dozens of files.

>
> I also think it might be better to mark the "maintained" output
> differently as something like "keyword matched" instead.
>
>
> Something like:
> ---
>  scripts/get_maintainer.pl | 38 ++++++++++++++++++++------------------
>  1 file changed, 20 insertions(+), 18 deletions(-)
>
> diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
> index ab123b498fd9..befae75e61ab 100755
> --- a/scripts/get_maintainer.pl
> +++ b/scripts/get_maintainer.pl
> @@ -57,6 +57,7 @@ my $subsystem = 0;
>  my $status = 0;
>  my $letters = "";
>  my $keywords = 1;
> +my $keywords_in_file = 0;
>  my $sections = 0;
>  my $email_file_emails = 0;
>  my $from_filename = 0;
> @@ -272,6 +273,7 @@ if (!GetOptions(
>                 'letters=s' => \$letters,
>                 'pattern-depth=i' => \$pattern_depth,
>                 'k|keywords!' => \$keywords,
> +               'kf|keywords-in-file!' => \$keywords_in_file,
>                 'sections!' => \$sections,
>                 'fe|file-emails!' => \$email_file_emails,
>                 'f|file' => \$from_filename,
> @@ -318,6 +320,7 @@ if ($sections || $letters ne "") {
>      $subsystem = 0;
>      $web = 0;
>      $keywords = 0;
> +    $keywords_in_file = 0;
>      $interactive = 0;
>  } else {
>      my $selections = $email + $scm + $status + $subsystem + $web;
> @@ -548,16 +551,14 @@ foreach my $file (@ARGV) {
>         $file =~ s/^\Q${cur_path}\E//;  #strip any absolute path
>         $file =~ s/^\Q${lk_path}\E//;   #or the path to the lk tree
>         push(@files, $file);
> -       if ($file ne "MAINTAINERS" && -f $file && $keywords) {
> +       if ($file ne "MAINTAINERS" && -f $file && $keywords && $keywords_in_file) {
>             open(my $f, '<', $file)
>                 or die "$P: Can't open $file: $!\n";
>             my $text = do { local($/) ; <$f> };
>             close($f);
> -           if ($keywords) {
> -               foreach my $line (keys %keyword_hash) {
> -                   if ($text =~ m/$keyword_hash{$line}/x) {
> -                       push(@keyword_tvi, $line);
> -                   }
> +           foreach my $line (keys %keyword_hash) {
> +               if ($text =~ m/$keyword_hash{$line}/x) {
> +                   push(@keyword_tvi, $line);
>                 }
>             }
>         }
> @@ -919,7 +920,7 @@ sub get_maintainers {
>         }
>
>         foreach my $line (sort {$hash{$b} <=> $hash{$a}} keys %hash) {
> -           add_categories($line);
> +           add_categories($line, "");
>             if ($sections) {
>                 my $i;
>                 my $start = find_starting_index($line);
> @@ -947,7 +948,7 @@ sub get_maintainers {
>      if ($keywords) {
>         @keyword_tvi = sort_and_uniq(@keyword_tvi);
>         foreach my $line (@keyword_tvi) {
> -           add_categories($line);
> +           add_categories($line, ":Keyword");
>         }
>      }
>
> @@ -1076,6 +1077,7 @@ Output type options:
>  Other options:
>    --pattern-depth => Number of pattern directory traversals (default: 0 (all))
>    --keywords => scan patch for keywords (default: $keywords)
> +  --keywords-in-file => scan file for keywords (default: $keywords_in_file)
>    --sections => print all of the subsystem sections with pattern matches
>    --letters => print all matching 'letter' types from all matching sections
>    --mailmap => use .mailmap file (default: $email_use_mailmap)
> @@ -1086,7 +1088,7 @@ Other options:
>
>  Default options:
>    [--email --tree --nogit --git-fallback --m --r --n --l --multiline
> -   --pattern-depth=0 --remove-duplicates --rolestats]
> +   --pattern-depth=0 --remove-duplicates --rolestats --keywords]
>
>  Notes:
>    Using "-f directory" may give unexpected results:
> @@ -1312,7 +1314,7 @@ sub get_list_role {
>  }
>
>  sub add_categories {
> -    my ($index) = @_;
> +    my ($index, $suffix) = @_;
>
>      my $i;
>      my $start = find_starting_index($index);
> @@ -1342,7 +1344,7 @@ sub add_categories {
>                         if (!$hash_list_to{lc($list_address)}) {
>                             $hash_list_to{lc($list_address)} = 1;
>                             push(@list_to, [$list_address,
> -                                           "subscriber list${list_role}"]);
> +                                           "subscriber list${list_role}" . $suffix]);
>                         }
>                     }
>                 } else {
> @@ -1352,12 +1354,12 @@ sub add_categories {
>                                 if ($email_moderated_list) {
>                                     $hash_list_to{lc($list_address)} = 1;
>                                     push(@list_to, [$list_address,
> -                                                   "moderated list${list_role}"]);
> +                                                   "moderated list${list_role}" . $suffix]);
>                                 }
>                             } else {
>                                 $hash_list_to{lc($list_address)} = 1;
>                                 push(@list_to, [$list_address,
> -                                               "open list${list_role}"]);
> +                                               "open list${list_role}" . $suffix]);
>                             }
>                         }
>                     }
> @@ -1365,19 +1367,19 @@ sub add_categories {
>             } elsif ($ptype eq "M") {
>                 if ($email_maintainer) {
>                     my $role = get_maintainer_role($i);
> -                   push_email_addresses($pvalue, $role);
> +                   push_email_addresses($pvalue, $role . $suffix);
>                 }
>             } elsif ($ptype eq "R") {
>                 if ($email_reviewer) {
>                     my $subsystem = get_subsystem_name($i);
> -                   push_email_addresses($pvalue, "reviewer:$subsystem");
> +                   push_email_addresses($pvalue, "reviewer:$subsystem" . $suffix);
>                 }
>             } elsif ($ptype eq "T") {
> -               push(@scm, $pvalue);
> +               push(@scm, $pvalue . $suffix);
>             } elsif ($ptype eq "W") {
> -               push(@web, $pvalue);
> +               push(@web, $pvalue . $suffix);
>             } elsif ($ptype eq "S") {
> -               push(@status, $pvalue);
> +               push(@status, $pvalue . $suffix);
>             }
>         }
>      }
>

  reply	other threads:[~2023-10-05 18:07 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-04 21:21 [PATCH] get_maintainer/MAINTAINERS: confine K content matching to patches Justin Stitt
2023-10-05  2:40 ` Joe Perches
2023-10-05 18:06   ` Justin Stitt [this message]
2023-10-05 18:15     ` Joe Perches
2023-10-05 18:30       ` Justin Stitt
2023-10-05 18:42         ` Joe Perches
2023-10-05 19:52           ` Justin Stitt
2023-10-05 20:05             ` Joe Perches
2023-10-05 20:10               ` Justin Stitt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFhGd8oTHFDQ05M++E3ggAvs0567w5fSxovumX+vs8YXT8VXTA@mail.gmail.com \
    --to=justinstitt@google.com \
    --cc=joe@perches.com \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ndesaulniers@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.