From: Aditya Srivastava <yashsri421@gmail.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: corbet@lwn.net, lukas.bulwahn@gmail.com,
	linux-kernel-mentees@lists.linuxfoundation.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC] scripts: kernel-doc: reduce repeated regex expressions into variables
Date: Sat, 24 Apr 2021 17:27:34 +0530
Message-ID: <6f76ddcb-7076-4c91-9c4c-995002c4cb91@gmail.com>
In-Reply-To: <20210423132117.GB235567@casper.infradead.org>

On 23/4/21 6:51 pm, Matthew Wilcox wrote:
> On Fri, Apr 23, 2021 at 12:48:39AM +0530, Aditya Srivastava wrote:
>> +my $pointer_function = qr{([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)};
> 
> Is that a pointer-to-function?  Or as people who write C usually call it,
> a function pointer?  Wouldn't it be better to call it $function_pointer?
> 
Will do it.

>> @@ -1210,8 +1211,14 @@ sub dump_struct($$) {
>>      my $decl_type;
>>      my $members;
>>      my $type = qr{struct|union};
>> +    my $packed = qr{__packed};
>> +    my $aligned = qr{__aligned};
>> +    my $cacheline_aligned_in_smp = qr{____cacheline_aligned_in_smp};
>> +    my $cacheline_aligned = qr{____cacheline_aligned};
> 
> I don't think those four definitions actually simplify anything.
> 
>> +    my $attribute = qr{__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)}i;
> 
> ... whereas this one definitely does.
> 
>> -	$members =~ s/\s*__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)/ /gi;
>> -	$members =~ s/\s*__aligned\s*\([^;]*\)/ /gos;
>> -	$members =~ s/\s*__packed\s*/ /gos;
>> +	$members =~ s/\s*$attribute/ /gi;
>> +	$members =~ s/\s*$aligned\s*\([^;]*\)/ /gos;
> 
> Maybe put the \s*\([^;]*\) into $aligned?  Then it becomes a useful
> abstraction.

Actually, I made these variables because they are repeated both here and at:
-    my $definition_body =
qr{\{(.*)\}(?:\s*(?:__packed|__aligned|____cacheline_aligned_in_smp|____cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)))*};
+    my $definition_body =
qr{\{(.*)\}(?:\s*(?:$packed|$aligned|$cacheline_aligned_in_smp|$cacheline_aligned|$attribute))*};

So defining them in one place might help.

What do you think?
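
To make the trade-off concrete, here is a minimal sketch of how the two
suggestions could be combined. It is an illustration, not the code from any
version of the patch: it assumes $aligned absorbs its argument list (as
proposed above), and the helper strip_attributes() and the sample declaration
are hypothetical.

```perl
use strict;
use warnings;

# Assumption: $aligned carries its own argument list, so the substitution
# and $definition_body can both reuse it unchanged.
my $packed                   = qr{__packed};
my $aligned                  = qr{__aligned\s*\([^;]*\)};
my $cacheline_aligned_in_smp = qr{____cacheline_aligned_in_smp};
my $cacheline_aligned        = qr{____cacheline_aligned};
my $attribute = qr{__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)}i;

# The /i flag is stored inside the qr// object, so it still applies
# when $attribute is interpolated into a larger pattern.
my $definition_body =
    qr{\{(.*)\}(?:\s*(?:$packed|$aligned|$cacheline_aligned_in_smp|$cacheline_aligned|$attribute))*};

# Hypothetical helper: strip annotations from a member list,
# mirroring the substitutions quoted from dump_struct.
sub strip_attributes {
    my ($members) = @_;
    $members =~ s/\s*$attribute/ /g;
    $members =~ s/\s*$aligned/ /g;
    $members =~ s/\s*$packed\s*/ /g;
    return $members;
}

my $decl = 'struct foo { int a __aligned(8); char b; } __packed';
if ($decl =~ /struct\s+(\w+)\s*$definition_body/) {
    my ($name, $members) = ($1, $2);
    print "$name:", strip_attributes($members), "\n";
}
```

With this shape each pattern is written once, and a change to, say, the
character class in $attribute reaches both users automatically.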

> 
>> -    } elsif ($prototype =~ m/^()([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
>> -	$prototype =~ m/^(\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
>> -	$prototype =~ m/^(\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
>> -	$prototype =~ m/^(\w+\s+\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
>> -	$prototype =~ m/^(\w+\s+\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
>> -	$prototype =~ m/^(\w+\s+\w+\s+\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
>> -	$prototype =~ m/^(\w+\s+\w+\s+\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
>> -	$prototype =~ m/^()([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
>> -	$prototype =~ m/^(\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
>> -	$prototype =~ m/^(\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
>> -	$prototype =~ m/^(\w+\s+\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
>> -	$prototype =~ m/^(\w+\s+\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
>> -	$prototype =~ m/^(\w+\s+\w+\s+\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
>> -	$prototype =~ m/^(\w+\s+\w+\s+\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
>> -	$prototype =~ m/^(\w+\s+\w+\s+\w+\s+\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
>> -	$prototype =~ m/^(\w+\s+\w+\s+\w+\s+\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
>> -	$prototype =~ m/^(\w+\s+\w+\s*\*+\s*\w+\s*\*+\s*)\s*([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/)  {
>> +    } elsif ($prototype =~ m/^()($name)\s*$prototype_end1/ ||
>> +	$prototype =~ m/^(\w+)\s+($name)\s*$prototype_end1/ ||
>> +	$prototype =~ m/^(\w+\s*\*+)\s*($name)\s*$prototype_end1/ ||
>> +	$prototype =~ m/^(\w+\s+\w+)\s+($name)\s*$prototype_end1/ ||
>> +	$prototype =~ m/^(\w+\s+\w+\s*\*+)\s*($name)\s*$prototype_end1/ ||
>> +	$prototype =~ m/^(\w+\s+\w+\s+\w+)\s+($name)\s*$prototype_end1/ ||
>> +	$prototype =~ m/^(\w+\s+\w+\s+\w+\s*\*+)\s*($name)\s*$prototype_end1/ ||
>> +	$prototype =~ m/^()($name)\s*$prototype_end2/ ||
>> +	$prototype =~ m/^(\w+)\s+($name)\s*$prototype_end2/ ||
>> +	$prototype =~ m/^(\w+\s*\*+)\s*($name)\s*$prototype_end2/ ||
>> +	$prototype =~ m/^(\w+\s+\w+)\s+($name)\s*$prototype_end2/ ||
>> +	$prototype =~ m/^(\w+\s+\w+\s*\*+)\s*($name)\s*$prototype_end2/ ||
>> +	$prototype =~ m/^(\w+\s+\w+\s+\w+)\s+($name)\s*$prototype_end2/ ||
>> +	$prototype =~ m/^(\w+\s+\w+\s+\w+\s*\*+)\s*($name)\s*$prototype_end2/ ||
>> +	$prototype =~ m/^(\w+\s+\w+\s+\w+\s+\w+)\s+($name)\s*$prototype_end2/ ||
>> +	$prototype =~ m/^(\w+\s+\w+\s+\w+\s+\w+\s*\*+)\s*($name)\s*$prototype_end2/ ||
>> +	$prototype =~ m/^(\w+\s+\w+\s*\*+\s*\w+\s*\*+\s*)\s*($name)\s*$prototype_end2/)  {
> 
> This is probably the best patch I've seen so far this year.
> 
> Now, can we go further?  For example:
> 	$prototype_end = $prototype_end1|$prototype_end2
> That would let us cut the number of lines here in half.
> 
> Can we create a definition for a variable number of \w and \s and '*'
> in the return type?  In fact, can we define a regex that matches a type?
> So this would become:
> 
>> +    } elsif ($prototype =~ m/^($type)\s*($name)\s*$prototype_end/) {
> 

I have been able to reduce these expressions further. Will send a
v2 shortly.
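
As a rough sketch of the kind of reduction being discussed (not the actual
v2), the two prototype endings can share one capture group, and a single
return-type pattern can cover a variable number of words and '*'. The
definition of $type below and the helper parse_prototype() are assumptions
made for illustration only.

```perl
use strict;
use warnings;

my $name = qr{[a-zA-Z0-9_~:]+};

# One capture group for the argument list; the two alternatives mirror
# $prototype_end1 ([^\(]*) and $prototype_end2 ([^\{]*) from the patch.
my $prototype_end = qr{\(([^\(]*|[^\{]*)\)};

# Hypothetical return-type pattern: zero or more words, each followed by
# either whitespace or a run of '*'. Requiring a separator after every
# word keeps the type from swallowing part of the function name.
my $type = qr{(?:\w+(?:\s*\*+\s*|\s+))*};

# Hypothetical helper: split a prototype into (return type, name, args).
# Returns an empty list when the prototype does not match.
sub parse_prototype {
    my ($prototype) = @_;
    return unless $prototype =~ m/^($type)($name)\s*$prototype_end/;
    my ($return_type, $function, $args) = ($1, $2, $3);
    $return_type =~ s/\s+$//;    # drop the separator kept by $type
    return ($return_type, $function, $args);
}

for my $p ('int foo(void)',
           'static inline struct page *alloc(gfp_t gfp)',
           'void walk(int (*cb)(void))') {
    my ($rtype, $function, $args) = parse_prototype($p);
    print "type='$rtype' name='$function' args='$args'\n";
}
```

Because the empty return type is also accepted, this single match covers the
`^()($name)` cases from the original list as well.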

Thanks
Aditya
