From: Aditya Srivastava <yashsri421@gmail.com>
To: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>,
	linux-kernel-mentees@lists.linuxfoundation.org,
	"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [RFC] scripts: kernel-doc: reduce repeated regex expressions into variables
Date: Fri, 23 Apr 2021 17:50:32 +0530
Message-ID: <1101eeb0-306f-dbea-8819-8bddd80d361c@gmail.com>
In-Reply-To: <CAKXUXMx9q57cWXkcezKKo-uuh21Sd-Si9M9KydzFEMQ0ELYEng@mail.gmail.com>

On 23/4/21 1:03 am, Lukas Bulwahn wrote:
> Aditya Srivastava <yashsri421@gmail.com> wrote on Thu, 22 Apr 2021, 21:18:
> 
>> Several regular expressions are used repeatedly in the kernel-doc
>> script.
>>
>> Factor them out into variables that can be reused wherever they occur.
>>
>> A quick manual check found that this change neither added nor removed
>> any errors or warnings.
>>
>> Suggested-by: Jonathan Corbet <corbet@lwn.net>
>> Signed-off-by: Aditya Srivastava <yashsri421@gmail.com>
>> ---
>>  scripts/kernel-doc | 89 ++++++++++++++++++++++++++--------------------
>>  1 file changed, 50 insertions(+), 39 deletions(-)
>>
>> diff --git a/scripts/kernel-doc b/scripts/kernel-doc
>> index 2a85d34fdcd0..579c9fdd275f 100755
>> --- a/scripts/kernel-doc
>> +++ b/scripts/kernel-doc
>> @@ -406,6 +406,7 @@ my $doc_inline_sect = '\s*\*\s*(@\s*[\w][\w\.]*\s*):(.*)';
>>  my $doc_inline_end = '^\s*\*/\s*$';
>>  my $doc_inline_oneline = '^\s*/\*\*\s*(@[\w\s]+):\s*(.*)\s*\*/\s*$';
>>  my $export_symbol = '^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*;';
>> +my $pointer_function = qr{([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)};
>>
>>  my %parameterdescs;
>>  my %parameterdesc_start_lines;
>> @@ -694,7 +695,7 @@ sub output_function_man(%) {
>>             $post = ");";
>>         }
>>         $type = $args{'parametertypes'}{$parameter};
>> -       if ($type =~ m/([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)/) {
>> +       if ($type =~ m/$pointer_function/) {
>>             # pointer-to-function
>>             print ".BI \"" . $parenth . $1 . "\" " . " \") (" . $2 . ")" . $post . "\"\n";
>>         } else {
>> @@ -974,7 +975,7 @@ sub output_function_rst(%) {
>>         $count++;
>>         $type = $args{'parametertypes'}{$parameter};
>>
>> -       if ($type =~ m/([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)/) {
>> +       if ($type =~ m/$pointer_function/) {
>>             # pointer-to-function
>>             print $1 . $parameter . ") (" . $2 . ")";
>>         } else {
>> @@ -1210,8 +1211,14 @@ sub dump_struct($$) {
>>      my $decl_type;
>>      my $members;
>>      my $type = qr{struct|union};
>> +    my $packed = qr{__packed};
>> +    my $aligned = qr{__aligned};
>> +    my $cacheline_aligned_in_smp = qr{____cacheline_aligned_in_smp};
>> +    my $cacheline_aligned = qr{____cacheline_aligned};
>> +    my $attribute = qr{__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)}i;
>>      # For capturing struct/union definition body, i.e. "{members*}qualifiers*"
>> -    my $definition_body = qr{\{(.*)\}(?:\s*(?:__packed|__aligned|____cacheline_aligned_in_smp|____cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)))*};
>> +    my $definition_body = qr{\{(.*)\}(?:\s*(?:$packed|$aligned|$cacheline_aligned_in_smp|$cacheline_aligned|$attribute))*};
>> +    my $struct_members = qr{($type)([^\{\};]+)\{([^\{\}]*)\}([^\{\}\;]*)\;};
>>
>>      if ($x =~ /($type)\s+(\w+)\s*$definition_body/) {
>>         $decl_type = $1;
>> @@ -1235,27 +1242,27 @@ sub dump_struct($$) {
>>         # strip comments:
>>         $members =~ s/\/\*.*?\*\///gos;
>>         # strip attributes
>> -       $members =~ s/\s*__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)/ /gi;
>> -       $members =~ s/\s*__aligned\s*\([^;]*\)/ /gos;
>> -       $members =~ s/\s*__packed\s*/ /gos;
>> +       $members =~ s/\s*$attribute/ /gi;
>> +       $members =~ s/\s*$aligned\s*\([^;]*\)/ /gos;
>> +       $members =~ s/\s*$packed\s*/ /gos;
>>         $members =~ s/\s*CRYPTO_MINALIGN_ATTR/ /gos;
>> -       $members =~ s/\s*____cacheline_aligned_in_smp/ /gos;
>> -       $members =~ s/\s*____cacheline_aligned/ /gos;
>> +       $members =~ s/\s*$cacheline_aligned_in_smp/ /gos;
>> +       $members =~ s/\s*$cacheline_aligned/ /gos;
>>
>> +       my $args = qr{([^,)]+)};
>>         # replace DECLARE_BITMAP
>>         $members =~ s/__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)/DECLARE_BITMAP($1, __ETHTOOL_LINK_MODE_MASK_NBITS)/gos;
>> -       $members =~ s/DECLARE_BITMAP\s*\(([^,)]+),\s*([^,)]+)\)/unsigned long $1\[BITS_TO_LONGS($2)\]/gos;
>> +       $members =~ s/DECLARE_BITMAP\s*\($args,\s*$args\)/unsigned long $1\[BITS_TO_LONGS($2)\]/gos;
>>         # replace DECLARE_HASHTABLE
>> -       $members =~ s/DECLARE_HASHTABLE\s*\(([^,)]+),\s*([^,)]+)\)/unsigned long $1\[1 << (($2) - 1)\]/gos;
>> +       $members =~ s/DECLARE_HASHTABLE\s*\($args,\s*$args\)/unsigned long $1\[1 << (($2) - 1)\]/gos;
>>         # replace DECLARE_KFIFO
>> -       $members =~ s/DECLARE_KFIFO\s*\(([^,)]+),\s*([^,)]+),\s*([^,)]+)\)/$2 \*$1/gos;
>> +       $members =~ s/DECLARE_KFIFO\s*\($args,\s*$args,\s*$args\)/$2 \*$1/gos;
>>         # replace DECLARE_KFIFO_PTR
>> -       $members =~ s/DECLARE_KFIFO_PTR\s*\(([^,)]+),\s*([^,)]+)\)/$2 \*$1/gos;
>> -
>> +       $members =~ s/DECLARE_KFIFO_PTR\s*\($args,\s*$args\)/$2 \*$1/gos;
>>         my $declaration = $members;
>>
>>         # Split nested struct/union elements as newer ones
>> -       while ($members =~ m/(struct|union)([^\{\};]+)\{([^\{\}]*)\}([^\{\}\;]*)\;/) {
>> +       while ($members =~ m/$struct_members/) {
>>                 my $newmember;
>>                 my $maintype = $1;
>>                 my $ids = $4;
>> @@ -1315,7 +1322,7 @@ sub dump_struct($$) {
>>                                 }
>>                         }
>>                 }
>> -               $members =~ s/(struct|union)([^\{\};]+)\{([^\{\}]*)\}([^\{\}\;]*)\;/$newmember/;
>> +               $members =~ s/$struct_members/$newmember/;
>>         }
>>
>>         # Ignore other nested elements, like enums
>> @@ -1555,8 +1562,9 @@ sub create_parameterlist($$$$) {
>>      my $param;
>>
>>      # temporarily replace commas inside function pointer definition
>> -    while ($args =~ /(\([^\),]+),/) {
>> -       $args =~ s/(\([^\),]+),/$1#/g;
>> +    my $arg_expr = qr{\([^\),]+};
>> +    while ($args =~ /$arg_expr,/) {
>> +       $args =~ s/($arg_expr),/$1#/g;
>>      }
>>
>>      foreach my $arg (split($splitter, $args)) {
>> @@ -1808,8 +1816,11 @@ sub dump_function($$) {
>>      # - parport_register_device (function pointer parameters)
>>      # - atomic_set (macro)
>>      # - pci_match_device, __copy_to_user (long return type)
>> +    my $name = qr{[a-zA-Z0-9_~:]+};
>> +    my $prototype_end1 = qr{\(([^\(]*)\)};
>> +    my $prototype_end2 = qr{\(([^\{]*)\)};
>>
> 
> Why do you need end1 and end2 here?
> 

Thanks for pointing this out, Lukas. I am looking into whether these two
expressions can be combined, and will test the result against the files.
Please let me know if any further improvements are possible :)
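
For illustration only, here is a minimal sketch of how the two patterns
from the hunk above behave differently. The three qr{} definitions are
copied from the patch; the parse_prototype helper, the way the patterns
are combined with $name, and the sample prototypes are hypothetical and
are not taken from kernel-doc itself.

```perl
use strict;
use warnings;

# Patterns as defined in the patch hunk for dump_function.
my $name           = qr{[a-zA-Z0-9_~:]+};
my $prototype_end1 = qr{\(([^\(]*)\)};   # argument list must not contain '('
my $prototype_end2 = qr{\(([^\{]*)\)};   # argument list must not contain '{'

# Hypothetical helper (not part of kernel-doc): match
# "<return type> <name>(<args>)" using the given tail pattern.
# Returns (return type, name, args), or an empty list on no match.
sub parse_prototype {
    my ($prototype, $prototype_end) = @_;
    # The capture group inside the interpolated qr{} object is numbered
    # after the groups that precede it, so it becomes $3 here.
    if ($prototype =~ m/^(\w+)\s+($name)\s*$prototype_end/) {
        return ($1, $2, $3);
    }
    return;
}

# Assumed sample inputs: a plain prototype and one that takes a
# function pointer parameter (nested parentheses).
my @samples = (
    'int foo(int a, char *b)',
    'int bar(int (*cb)(void *), void *data)',
);

for my $prototype (@samples) {
    for my $pair (['end1', $prototype_end1], ['end2', $prototype_end2]) {
        my ($label, $pattern) = @$pair;
        my @parts = parse_prototype($prototype, $pattern);
        printf "%-4s %-40s => %s\n", $label, $prototype,
            @parts ? "args: $parts[2]" : 'no match';
    }
}
```

With these sample inputs both patterns capture the same argument list
for foo, but only the second pattern matches bar, because the first one
stops at the nested '(' of the function pointer parameter. Under these
assumptions the two patterns are not interchangeable, so any combined
expression would need testing against prototypes of both kinds.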

Thanks
Aditya
