* [PATCH 0/9] get_abi.pl: Check for missing symbols at the ABI specs
@ 2021-09-08 14:58 Mauro Carvalho Chehab
From: Mauro Carvalho Chehab @ 2021-09-08 14:58 UTC
  To: Linux Doc Mailing List, Greg KH
  Cc: Mauro Carvalho Chehab, linux-kernel, Jonathan Corbet,
	Anton Vorontsov, Colin Cross, John Fastabend, KP Singh,
	Kees Cook, Martin KaFai Lau, Song Liu, Tony Luck, Yonghong Song,
	bpf, netdev

Hi Greg,

Some time ago, Jonathan Cameron and I discussed providing a way to
check whether the ABI documentation is incomplete.

While it would be doable to validate the ABI by searching for __ATTR
and similar macros across the drivers, this would probably be very
complex and would take a while to parse.

So, I ended up implementing a new feature in scripts/get_abi.pl
that checks the sysfs contents of a running system: it reads
everything under /sys and the entire ABI from Documentation/ABI.
It then warns about symbols that weren't found, optionally showing
possible candidates that might be misdefined.
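
As a rough illustration of the check described above (a minimal sketch,
not the code from the patch): given the paths found under /sys and the
What: entries already converted to regexes, report every path that
matches none of them. The helper name undocumented() and the sample
paths are made up for this example; the patch itself walks /sys with
File::Find.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the paths that match none of the
# documented patterns. Each pattern is anchored at both ends, so
# it has to match the whole path.
sub undocumented {
	my ($paths, $patterns) = @_;
	my @regexes = map { qr/^(?:$_)$/ } @$patterns;

	return grep {
		my $path = $_;
		!grep { $path =~ $_ } @regexes;
	} @$paths;
}

# Sample data (assumed for illustration, not read from a real system)
my @documented = ('/sys/class/net/.*/mtu');
my @found = ('/sys/class/net/eth0/mtu', '/sys/class/net/eth0/example_attr');

print "$_ not found.\n" foreach undocumented(\@found, \@documented);
```

With the sample data, only the example_attr path is reported, since the
mtu path is covered by the documented pattern.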

I opted to split it into 3 patches:

The first patch adds the basic logic. It runs really quickly (up to 2
seconds), but it doesn't use sysfs softlinks.

Patch 2 adds support for also parsing softlinks. It slows down the
logic, which now takes ~40 seconds to run on my desktop (and ~23
seconds on a HiKey970 ARM board). There is room for performance
improvements by using a more sophisticated algorithm, at the
expense of making the code harder to understand. I ended up opting
for a simple implementation for now, as ~40 seconds sounds
acceptable to me.

Patch 3 adds an optional parameter to allow filtering the results
using a regex given by the user.
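
A minimal sketch of such a filter, assuming the regex selects which
results to keep (the helper name filter_results() and the sample lines
are hypothetical; this is not the code from patch 3):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: keep only the result lines matching the
# user-supplied regex. An invalid regex is reported instead of
# being silently ignored.
sub filter_results {
	my ($results, $filter) = @_;

	my $re = eval { qr/$filter/ };
	die "Invalid regex '$filter': $@" if (!defined($re));

	return grep { $_ =~ $re } @$results;
}

# Sample result lines (assumed for illustration)
my @results = (
	"/sys/class/net/eth0/example_attr not found.",
	"/sys/block/sda/example_attr not found.",
);

print "$_\n" foreach filter_results(\@results, '^/sys/class/net/');
```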

One of the problems with the current ABI definitions is that several
symbols define wildcards in non-standard ways. The most commonly
used wildcards are:

	<foo>
	{foo}
	[foo]
	X
	Y
	Z
	/.../

The script converts the above wildcards into (somewhat relaxed)
regexes.
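
A simplified sketch of such a conversion, loosely following the
substitutions in patch 1. The helper name what_to_regex() and the
sample What: path are hypothetical, and replacing every X, Y or Z is an
assumption made for brevity. Other regex metacharacters (such as ".")
are left unescaped here, which is one reason the result is only a
relaxed match:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: convert a What: path into a relaxed regex.
sub what_to_regex {
	my ($what) = @_;

	$what =~ s/<[^>]+>/.*/g;      # <foo>
	$what =~ s/\{[^}]+\}/.*/g;    # {foo}
	$what =~ s/\[[^\]]+\]/.*/g;   # [foo]
	$what =~ s,/\.\.\./,/.*/,g;   # /.../
	$what =~ s/[XYZ]/.*/g;        # X, Y, Z (assumption: any occurrence)

	return qr/^$what$/;
}

# Made-up What: entry and sysfs paths, for illustration only
my $re = what_to_regex('/sys/class/example/<dev>/portX/state');

foreach my $path ('/sys/class/example/sda/port3/state',
		  '/sys/class/example/sda/port3/speed') {
	printf "%s: %s\n", $path, ($path =~ $re) ? "documented" : "not found";
}
```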

There's one place using "(some description)". This one is harder to
parse, as parentheses are used by the parsing regexes. As this happens
in only one file, patch 4 addresses that case.

Patches 5 to 9 fix some other ABI problems I identified.

In the long term, perhaps it would be better to just use regexes in the
What: fields, as this would avoid extra heuristics in get_abi.pl, but
this is out of scope for this series, and would mean a large number of
changes.

-

For reference, I sent an early implementation of this change as an RFC:
	https://lore.kernel.org/lkml/cover.1624014140.git.mchehab+huawei@kernel.org/

Mauro Carvalho Chehab (9):
  scripts: get_abi.pl: Check for missing symbols at the ABI specs
  scripts: get_abi.pl: detect softlinks
  scripts: get_abi.pl: add an option to filter undefined results
  ABI: sysfs-bus-usb: better document variable argument
  ABI: sysfs-module: better document module name parameter
  ABI: sysfs-tty: better document module name parameter
  ABI: sysfs-kernel-slab: use a wildcard for the cache name
  ABI: security: fix location for evm and ima_policy
  ABI: sysfs-module: document initstate

 Documentation/ABI/stable/sysfs-module       |  10 +-
 Documentation/ABI/testing/evm               |   4 +-
 Documentation/ABI/testing/ima_policy        |   2 +-
 Documentation/ABI/testing/sysfs-bus-usb     |  16 +-
 Documentation/ABI/testing/sysfs-kernel-slab |  94 ++++-----
 Documentation/ABI/testing/sysfs-module      |   7 +
 Documentation/ABI/testing/sysfs-tty         |  32 +--
 scripts/get_abi.pl                          | 218 +++++++++++++++++++-
 8 files changed, 303 insertions(+), 80 deletions(-)

-- 
2.31.1




* [PATCH 1/9] scripts: get_abi.pl: Check for missing symbols at the ABI specs
@ 2021-09-08 14:58 ` Mauro Carvalho Chehab
From: Mauro Carvalho Chehab @ 2021-09-08 14:58 UTC
  To: Linux Doc Mailing List, Greg KH
  Cc: Mauro Carvalho Chehab, Jonathan Corbet, Alexei Starovoitov,
	Andrii Nakryiko, Anton Vorontsov, Colin Cross, Daniel Borkmann,
	John Fastabend, KP Singh, Kees Cook, Martin KaFai Lau, Song Liu,
	Tony Luck, Yonghong Song, bpf, linux-kernel, netdev

Check for the symbols that exist under /sys but aren't
defined in Documentation/ABI.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/get_abi.pl | 88 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 86 insertions(+), 2 deletions(-)

diff --git a/scripts/get_abi.pl b/scripts/get_abi.pl
index d7aa82094296..31b2fdf1f318 100755
--- a/scripts/get_abi.pl
+++ b/scripts/get_abi.pl
@@ -13,7 +13,9 @@ my $help = 0;
 my $man = 0;
 my $debug = 0;
 my $enable_lineno = 0;
+my $show_warnings = 1;
 my $prefix="Documentation/ABI";
+my $sysfs_prefix="/sys";
 
 #
 # If true, assumes that the description is formatted with ReST
@@ -36,7 +38,7 @@ pod2usage(2) if (scalar @ARGV < 1 || @ARGV > 2);
 
 my ($cmd, $arg) = @ARGV;
 
-pod2usage(2) if ($cmd ne "search" && $cmd ne "rest" && $cmd ne "validate");
+pod2usage(2) if ($cmd ne "search" && $cmd ne "rest" && $cmd ne "validate" && $cmd ne "undefined");
 pod2usage(2) if ($cmd eq "search" && !$arg);
 
 require Data::Dumper if ($debug);
@@ -50,6 +52,8 @@ my %symbols;
 sub parse_error($$$$) {
 	my ($file, $ln, $msg, $data) = @_;
 
+	return if (!$show_warnings);
+
 	$data =~ s/\s+$/\n/;
 
 	print STDERR "Warning: file $file#$ln:\n\t$msg";
@@ -521,11 +525,86 @@ sub search_symbols {
 	}
 }
 
+# Exclude /sys/kernel/debug and /sys/kernel/tracing from the search path
+sub skip_debugfs {
+	if (($File::Find::dir =~ m,^/sys/kernel,)) {
+		return grep {!/(debug|tracing)/ } @_;
+	}
+
+	if (($File::Find::dir =~ m,^/sys/fs,)) {
+		return grep {!/(pstore|bpf|fuse)/ } @_;
+	}
+
+	return @_
+}
+
+my %leaf;
+
+my $escape_symbols = qr { ([\x01-\x08\x0e-\x1f\x21-\x29\x2b-\x2d\x3a-\x40\x7b-\xff]) }x;
+sub parse_existing_sysfs {
+	my $file = $File::Find::name;
+
+	my $mode = (stat($file))[2];
+	return if ($mode & S_IFDIR);
+
+	my $leave = $file;
+	$leave =~ s,.*/,,;
+
+	if (defined($leaf{$leave})) {
+		# FIXME: need to check if the path makes sense
+		my $what = $leaf{$leave};
+
+		$what =~ s/,/ /g;
+
+		$what =~ s/\<[^\>]+\>/.*/g;
+		$what =~ s/\{[^\}]+\}/.*/g;
+		$what =~ s/\[[^\]]+\]/.*/g;
+		$what =~ s,/\.\.\./,/.*/,g;
+		$what =~ s,/\*/,/.*/,g;
+
+		$what =~ s/\s+/ /g;
+
+		# Escape all other symbols
+		$what =~ s/$escape_symbols/\\$1/g;
+
+		foreach my $i (split / /,$what) {
+			if ($file =~ m#^$i$#) {
+#				print "$file: $i: OK!\n";
+				return;
+			}
+		}
+
+		print "$file: $leave is defined at $what\n";
+
+		return;
+	}
+
+	print "$file not found.\n";
+}
+
+sub undefined_symbols {
+	foreach my $what (sort keys %data) {
+		my $leave = $what;
+		$leave =~ s,.*/,,;
+
+		if (defined($leaf{$leave})) {
+			$leaf{$leave} .= " " . $what;
+		} else {
+			$leaf{$leave} = $what;
+		}
+	}
+
+	find({wanted =>\&parse_existing_sysfs, preprocess =>\&skip_debugfs, no_chdir => 1}, $sysfs_prefix);
+}
+
 # Ensure that the prefix will always end with a slash
 # While this is not needed for find, it makes the patch nicer
 # with --enable-lineno
 $prefix =~ s,/?$,/,;
 
+if ($cmd eq "undefined" || $cmd eq "search") {
+	$show_warnings = 0;
+}
 #
 # Parses all ABI files located at $prefix dir
 #
@@ -536,7 +615,9 @@ print STDERR Data::Dumper->Dump([\%data], [qw(*data)]) if ($debug);
 #
 # Handles the command
 #
-if ($cmd eq "search") {
+if ($cmd eq "undefined") {
+	undefined_symbols;
+} elsif ($cmd eq "search") {
 	search_symbols;
 } else {
 	if ($cmd eq "rest") {
@@ -575,6 +656,9 @@ B<rest>                  - output the ABI in ReST markup language
 
 B<validate>              - validate the ABI contents
 
+B<undefined>             - existing symbols at the system that aren't
+                           defined at Documentation/ABI
+
 =back
 
 =head1 OPTIONS
-- 
2.31.1



* Re: [PATCH 0/9] get_abi.pl: Check for missing symbols at the ABI specs
@ 2021-09-09 13:51 ` Greg KH
From: Greg KH @ 2021-09-09 13:51 UTC
  To: Mauro Carvalho Chehab
  Cc: Linux Doc Mailing List, linux-kernel, Jonathan Corbet,
	Anton Vorontsov, Colin Cross, John Fastabend, KP Singh,
	Kees Cook, Martin KaFai Lau, Song Liu, Tony Luck, Yonghong Song,
	bpf, netdev

This is cool stuff, thanks for doing this!

I'll look at it more once 5.15-rc1 is out, thanks.

greg k-h


* Re: [PATCH 0/9] get_abi.pl: Check for missing symbols at the ABI specs
@ 2021-09-14 14:24   ` Mauro Carvalho Chehab
From: Mauro Carvalho Chehab @ 2021-09-14 14:24 UTC
  To: Greg KH
  Cc: Linux Doc Mailing List, linux-kernel, Jonathan Corbet,
	Anton Vorontsov, Colin Cross, John Fastabend, KP Singh,
	Kees Cook, Martin KaFai Lau, Song Liu, Tony Luck, Yonghong Song,
	bpf, netdev

On Thu, 9 Sep 2021 15:51:00 +0200
Greg KH <gregkh@linuxfoundation.org> wrote:

> This is cool stuff, thanks for doing this!
> 
> I'll look at it more once 5.15-rc1 is out, thanks.

FYI, there's a new version at:

	https://git.kernel.org/pub/scm/linux/kernel/git/mchehab/devel.git/log/?h=get_undefined

In order for get_abi.pl to convert What: into regexes, changes are
needed to existing ABI files. One alternative would be to convert
everything into regexes, but that would probably mean that most ABI
files would require work.

In order to avoid a huge number of patches/changes, I opted to touch only
the ones that don't follow the de facto wildcard standards already
found in most of the ABI files. So, I added support in get_abi.pl to
treat these patterns as wildcards:

	/.../
	*
	<foo>
	X
	Y
	Z
	[0-9] (and variants)

Files that use anything else as a wildcard need changes, in order
to avoid ambiguity when the script decides whether a character is a
wildcard or not.

One of the issues there is with "N". Several files use it as a wildcard,
but USB sysfs parameters have several ABI nodes with an uppercase "N"
letter (like bNumInterfaces and such). So, this one had to be converted
too (and accounts for the vast majority of the patches).
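
To illustrate the ambiguity, here is a minimal sketch (not the code in
get_abi.pl). The helper to_regex(), the full What: path and the bogus
node name are assumptions made for this example; only bNumInterfaces
comes from the real USB ABI:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn a What: path into a regex, treating
# "/.../" and each letter listed in $letters as a wildcard.
sub to_regex {
	my ($what, $letters) = @_;

	$what =~ s,/\.\.\./,/.*/,g;
	$what =~ s/[$letters]/.*/g;

	return qr/^$what$/;
}

# Assumed What: entry and a made-up node that should NOT be covered by it
my $what = '/sys/bus/usb/devices/.../bNumInterfaces';
my $bogus = '/sys/bus/usb/devices/1-1/bogus_umInterfaces';

foreach my $letters ('XYZ', 'NXYZ') {
	my $re = to_regex($what, $letters);
	printf "wildcards %-4s: bogus node %s\n", $letters,
	       ($bogus =~ $re) ? "matches" : "does not match";
}
```

Once "N" is treated as a wildcard, the literal "N" in bNumInterfaces
turns into ".*", so the entry also covers unrelated names and an
undocumented node could go unnoticed.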

Anyway, as the number of such patches is high, I'll submit the work
as three separate series:

	- What: changes needed for regex conversion;
	- get_abi.pl updates;
	- Some additions for missing symbols found on my
	  desktop.

Thanks,
Mauro

