linux-kernel-mentees.lists.linuxfoundation.org archive mirror
* [RFC] scripts: kernel-doc: improve parsing for kernel-doc comments syntax
@ 2021-04-14 19:25 Aditya Srivastava
  2021-04-15 11:17 ` Aditya Srivastava
  2021-04-15 21:29 ` Jonathan Corbet
  0 siblings, 2 replies; 3+ messages in thread
From: Aditya Srivastava @ 2021-04-14 19:25 UTC
  To: corbet; +Cc: linux-doc, linux-kernel-mentees, linux-kernel, yashsri421

Currently kernel-doc does not identify some cases of probable kernel-doc
comments, e.g. a pointer used as the declaration type for an identifier,
space-separated identifiers, etc.

Some examples of these cases are:
i)" *  journal_t * jbd2_journal_init_dev() - creates and initialises a journal structure"
in fs/jbd2/journal.c

ii) "*      dget, dget_dlock -      get a reference to a dentry" in
include/linux/dcache.h

iii) "  * DEFINE_SEQLOCK(sl) - Define a statically allocated seqlock_t"
in include/linux/seqlock.h

Also improve identification of non-kernel-doc comments, e.g.:

i) " *	The following functions allow us to read data using a swap map"
in kernel/power/swap.c follows kernel-doc-like syntax, but its
content does not adhere to the expected format.

Improve parsing by adding support for these probable attempts to write
kernel-doc comments.

Suggested-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/lkml/87mtujktl2.fsf@meer.lwn.net
Signed-off-by: Aditya Srivastava <yashsri421@gmail.com>
---
 scripts/kernel-doc | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 888913528185..37665aa41e6b 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -2110,17 +2110,25 @@ sub process_name($$) {
     } elsif (/$doc_decl/o) {
 	$identifier = $1;
 	my $is_kernel_comment = 0;
-	if (/^\s*\*\s*([\w\s]+?)(\(\))?\s*([-:].*)?$/) {
+	my $decl_start = qr{\s*\*};
+	my $fn_type = qr{\w+\s*\*\s*}; # i.e. pointer declaration type, foo * bar() - desc
+	my $parenthesis = qr{\(\w*\)};
+	my $decl_end = qr{[-:].*};
+	if (/^$decl_start\s*([\w\s]+?)$parenthesis?\s*$decl_end?$/) {
 	    $identifier = $1;
-	    $decl_type = 'function';
-	    $identifier =~ s/^define\s+//;
-	    $is_kernel_comment = 1;
 	}
 	if ($identifier =~ m/^(struct|union|enum|typedef)\b\s*(\S*)/) {
 	    $decl_type = $1;
 	    $identifier = $2;
 	    $is_kernel_comment = 1;
 	}
+	elsif (/^$decl_start\s*$fn_type?(\w+)\s*$parenthesis?\s*$decl_end?$/ ||	# i.e. foo()
+	    /^$decl_start\s*$fn_type?(\w+.*)$parenthesis?\s*$decl_end$/) {	# i.e. static void foo() - description; or misspelt identifier
+	    $identifier = $1;
+	    $decl_type = 'function';
+	    $identifier =~ s/^define\s+//;
+	    $is_kernel_comment = 1;
+	}
 	$identifier =~ s/\s+$//;
 
 	$state = STATE_BODY;
-- 
2.17.1

_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees
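
The effect of the new function-matching branch can be tried outside
kernel-doc with a small standalone script. The sketch below copies the four
qr{} building blocks and the two alternative patterns from the patch; the
helper name function_identifier() is hypothetical and is not part of
scripts/kernel-doc. It only exercises the function branch (not the
struct/union/enum/typedef branch or the surrounding state machine), so it
approximates what the patched script does rather than reproducing it.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Building blocks copied from the patch.
my $decl_start  = qr{\s*\*};
my $fn_type     = qr{\w+\s*\*\s*};  # pointer return type, e.g. "foo * "
my $parenthesis = qr{\(\w*\)};
my $decl_end    = qr{[-:].*};

# Hypothetical helper (not in kernel-doc): return the identifier that the
# function branch would capture for one comment line, or undef when
# neither pattern matches.
sub function_identifier {
    my ($line) = @_;
    if ($line =~ /^$decl_start\s*$fn_type?(\w+)\s*$parenthesis?\s*$decl_end?$/ ||
        $line =~ /^$decl_start\s*$fn_type?(\w+.*)$parenthesis?\s*$decl_end$/) {
        my $identifier = $1;
        $identifier =~ s/^define\s+//;
        $identifier =~ s/\s+$//;
        return $identifier;
    }
    return undef;
}

# Sample lines quoted in the commit message.
my @samples = (
    " *  journal_t * jbd2_journal_init_dev() - creates and initialises a journal structure",
    "*      dget, dget_dlock -      get a reference to a dentry",
    "  * DEFINE_SEQLOCK(sl) - Define a statically allocated seqlock_t",
    " *\tThe following functions allow us to read data using a swap map",
);

for my $line (@samples) {
    my $id = function_identifier($line);
    print defined $id ? "identifier '$id'" : "no match", " <= $line\n";
}
```

Run on the commit message's samples, the first three lines yield the
identifiers jbd2_journal_init_dev, "dget, dget_dlock" and DEFINE_SEQLOCK,
while the swap.c line matches neither pattern because it has no "-" or ":"
separator after its first word.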

* Re: [RFC] scripts: kernel-doc: improve parsing for kernel-doc comments syntax
  2021-04-14 19:25 [RFC] scripts: kernel-doc: improve parsing for kernel-doc comments syntax Aditya Srivastava
@ 2021-04-15 11:17 ` Aditya Srivastava
  2021-04-15 21:29 ` Jonathan Corbet
  1 sibling, 0 replies; 3+ messages in thread
From: Aditya Srivastava @ 2021-04-15 11:17 UTC
  To: corbet; +Cc: linux-kernel-mentees, linux-kernel, linux-doc

On 15/4/21 12:55 am, Aditya Srivastava wrote:
> Currently kernel-doc does not identify some cases of probable kernel-doc
> comments, e.g. a pointer used as the declaration type for an identifier,
> space-separated identifiers, etc.
> 
> Some examples of these cases are:
> i)" *  journal_t * jbd2_journal_init_dev() - creates and initialises a journal structure"
> in fs/jbd2/journal.c
> 
> ii) "*      dget, dget_dlock -      get a reference to a dentry" in
> include/linux/dcache.h
> 
> iii) "  * DEFINE_SEQLOCK(sl) - Define a statically allocated seqlock_t"
> in include/linux/seqlock.h
> 
> Also improve identification of non-kernel-doc comments, e.g.:
> 
> i) " *	The following functions allow us to read data using a swap map"
> in kernel/power/swap.c follows kernel-doc-like syntax, but its
> content does not adhere to the expected format.
> 
> Improve parsing by adding support for these probable attempts to write
> kernel-doc comments.
> 
> Suggested-by: Jonathan Corbet <corbet@lwn.net>
> Link: https://lore.kernel.org/lkml/87mtujktl2.fsf@meer.lwn.net
> Signed-off-by: Aditya Srivastava <yashsri421@gmail.com>
> ---
>  scripts/kernel-doc | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/scripts/kernel-doc b/scripts/kernel-doc
> index 888913528185..37665aa41e6b 100755
> --- a/scripts/kernel-doc
> +++ b/scripts/kernel-doc
> @@ -2110,17 +2110,25 @@ sub process_name($$) {
>      } elsif (/$doc_decl/o) {
>  	$identifier = $1;
>  	my $is_kernel_comment = 0;
> -	if (/^\s*\*\s*([\w\s]+?)(\(\))?\s*([-:].*)?$/) {
> +	my $decl_start = qr{\s*\*};
> +	my $fn_type = qr{\w+\s*\*\s*}; # i.e. pointer declaration type, foo * bar() - desc
> +	my $parenthesis = qr{\(\w*\)};
> +	my $decl_end = qr{[-:].*};
> +	if (/^$decl_start\s*([\w\s]+?)$parenthesis?\s*$decl_end?$/) {
>  	    $identifier = $1;
> -	    $decl_type = 'function';
> -	    $identifier =~ s/^define\s+//;
> -	    $is_kernel_comment = 1;
>  	}
>  	if ($identifier =~ m/^(struct|union|enum|typedef)\b\s*(\S*)/) {
>  	    $decl_type = $1;
>  	    $identifier = $2;
>  	    $is_kernel_comment = 1;
>  	}
> +	elsif (/^$decl_start\s*$fn_type?(\w+)\s*$parenthesis?\s*$decl_end?$/ ||	# i.e. foo()
> +	    /^$decl_start\s*$fn_type?(\w+.*)$parenthesis?\s*$decl_end$/) {	# i.e. static void foo() - description; or misspelt identifier
> +	    $identifier = $1;
> +	    $decl_type = 'function';
> +	    $identifier =~ s/^define\s+//;
> +	    $is_kernel_comment = 1;
> +	}
>  	$identifier =~ s/\s+$//;
>  
>  	$state = STATE_BODY;
> 

Hi,
I have generated a diff of the kernel-doc warnings for all files in the
kernel tree, before and after this patch.
It can be found at:
https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/kernel-doc/kernel_doc_comment_syntax_improvement_diff.txt

Thanks
Aditya

* Re: [RFC] scripts: kernel-doc: improve parsing for kernel-doc comments syntax
  2021-04-14 19:25 [RFC] scripts: kernel-doc: improve parsing for kernel-doc comments syntax Aditya Srivastava
  2021-04-15 11:17 ` Aditya Srivastava
@ 2021-04-15 21:29 ` Jonathan Corbet
  1 sibling, 0 replies; 3+ messages in thread
From: Jonathan Corbet @ 2021-04-15 21:29 UTC
  To: Aditya Srivastava
  Cc: linux-doc, linux-kernel-mentees, linux-kernel, yashsri421

Aditya Srivastava <yashsri421@gmail.com> writes:

> Currently kernel-doc does not identify some cases of probable kernel-doc
> comments, e.g. a pointer used as the declaration type for an identifier,
> space-separated identifiers, etc.
>
> Some examples of these cases are:
> i)" *  journal_t * jbd2_journal_init_dev() - creates and initialises a journal structure"
> in fs/jbd2/journal.c
>
> ii) "*      dget, dget_dlock -      get a reference to a dentry" in
> include/linux/dcache.h
>
> iii) "  * DEFINE_SEQLOCK(sl) - Define a statically allocated seqlock_t"
> in include/linux/seqlock.h
>
> Also improve identification of non-kernel-doc comments, e.g.:
>
> i) " *	The following functions allow us to read data using a swap map"
> in kernel/power/swap.c follows kernel-doc-like syntax, but its
> content does not adhere to the expected format.
>
> Improve parsing by adding support for these probable attempts to write
> kernel-doc comments.
>
> Suggested-by: Jonathan Corbet <corbet@lwn.net>
> Link: https://lore.kernel.org/lkml/87mtujktl2.fsf@meer.lwn.net
> Signed-off-by: Aditya Srivastava <yashsri421@gmail.com>
> ---
>  scripts/kernel-doc | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)

OK, I've applied this, but I have a couple of comments...

> diff --git a/scripts/kernel-doc b/scripts/kernel-doc
> index 888913528185..37665aa41e6b 100755
> --- a/scripts/kernel-doc
> +++ b/scripts/kernel-doc
> @@ -2110,17 +2110,25 @@ sub process_name($$) {
>      } elsif (/$doc_decl/o) {
>  	$identifier = $1;
>  	my $is_kernel_comment = 0;
> -	if (/^\s*\*\s*([\w\s]+?)(\(\))?\s*([-:].*)?$/) {
> +	my $decl_start = qr{\s*\*};

I appreciate the attempt to make the regexes a bit more comprehensible,
but we can do better yet, methinks.  This $decl_start is very much like
$doc_com defined globally.

It would really help a lot if we could at least take the incredible mass
of regexes in this program and boil them down to a smaller, unique set
that is used throughout.  kernel-doc might still make brains explode,
but perhaps the blast radius would be a bit smaller.

> +	my $fn_type = qr{\w+\s*\*\s*}; # i.e. pointer declaration type, foo * bar() - desc

Some of the lines in this change go waaaaay beyond the 80-character
limit; please try not to do that.  I fixed up the offending comments
this time around.

Thanks,

jon
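
One way to read the $doc_com remark above is that the local $decl_start
followed by \s* could be replaced by a single shared prefix regex. The
sketch below is only an illustration of that idea, not what was merged: it
assumes the global $doc_com in scripts/kernel-doc is equivalent to
\s*\*\s* (an assumption, check the script before relying on it), and the
helper identifier_with_prefix() is hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Assumption: mirrors the global $doc_com defined in scripts/kernel-doc.
my $doc_com     = qr{\s*\*\s*};

# Regexes introduced by the patch.
my $decl_start  = qr{\s*\*};
my $fn_type     = qr{\w+\s*\*\s*};
my $parenthesis = qr{\(\w*\)};
my $decl_end    = qr{[-:].*};

# Hypothetical helper: capture the identifier using the given line prefix,
# following the first function pattern from the patch.
sub identifier_with_prefix {
    my ($prefix, $line) = @_;
    return $line =~ /^$prefix$fn_type?(\w+)\s*$parenthesis?\s*$decl_end?$/
        ? $1 : undef;
}

# Illustrative comment lines (made up for this sketch).
for my $line (" * foo() - do a thing",
              " *  bar_t * baz() - make a baz",
              " * qux") {
    my $old = identifier_with_prefix(qr{$decl_start\s*}, $line);
    my $new = identifier_with_prefix($doc_com, $line);
    print "$line => patch: ", ($old // 'undef'),
          ", shared prefix: ", ($new // 'undef'), "\n";
}
```

Because $decl_start followed by \s* spells out the same pattern as the
assumed $doc_com, both prefixes capture the same identifier for every line,
which is why a single shared definition would be enough here.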
