All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] get_abi: improve message output and fix a regression
@ 2021-09-28 10:14 Mauro Carvalho Chehab
  2021-09-28 10:14 ` [PATCH 1/3] scripts: get_abi.pl: use STDERR for search-string and show-hints Mauro Carvalho Chehab
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Mauro Carvalho Chehab @ 2021-09-28 10:14 UTC (permalink / raw)
  To: Linux Doc Mailing List, Greg Kroah-Hartman
  Cc: Mauro Carvalho Chehab, linux-kernel, Jonathan Corbet

Hi Greg,

As promised on

	https://lore.kernel.org/lkml/20210928120304.62319fba@coco.lan/T/#u

I'm adding progress info when  get_abi.pl is checking for undefined ABI symbols
on patches 1 and 2.

That will help not only to identify what is causing delays on the script, but also
to notify the user that processing it could take some time on some systems.

If you run it on your big server with:

  scripts/get_abi.pl undefined 2>logs

The "logs" file will contain timestamps relative to the time the script started to
do the regex matches for sysfs files. It should be printing one line every
time the progress completes 1% or one second after the last progress output.

-

Patch 3 is just a small fix for an issue introduced by an earlier change at
the script (unrelated to the other two patches).

Mauro Carvalho Chehab (3):
  scripts:: use STDERR for search-string and show-hints
  scripts: get_abi.pl: show progress
  ABI: evm: place a second what at the next line

 Documentation/ABI/testing/evm |   3 +-
 scripts/get_abi.pl            | 165 ++++++++++++++++++++++------------
 2 files changed, 111 insertions(+), 57 deletions(-)

-- 
2.31.1



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] scripts: get_abi.pl: use STDERR for search-string and show-hints
  2021-09-28 10:14 [PATCH 0/3] get_abi: improve message output and fix a regression Mauro Carvalho Chehab
@ 2021-09-28 10:14 ` Mauro Carvalho Chehab
  2021-09-28 10:14 ` [PATCH 2/3] scripts: get_abi.pl: show progress Mauro Carvalho Chehab
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Mauro Carvalho Chehab @ 2021-09-28 10:14 UTC (permalink / raw)
  To: Linux Doc Mailing List, Greg Kroah-Hartman
  Cc: Mauro Carvalho Chehab, Jonathan Corbet, linux-kernel

On undefined checks, use STDOUT only for the not found entries.

All other data (search-string and show-hints) is printed at
STDERR.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---

To mailbombing on a large number of people, only mailing lists were C/C on the cover.
See [PATCH 0/3] at: https://lore.kernel.org/all/cover.1632823172.git.mchehab+huawei@kernel.org/

 scripts/get_abi.pl | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/scripts/get_abi.pl b/scripts/get_abi.pl
index 4978163f5b16..a6c91f822363 100755
--- a/scripts/get_abi.pl
+++ b/scripts/get_abi.pl
@@ -728,9 +728,9 @@ sub check_undefined_symbols {
 		for (my $i = 0; $i < @names; $i++) {
 			if ($found_string && $hint) {
 				if (!$i) {
-					print "--> $names[$i]\n";
+					print STDERR "--> $names[$i]\n";
 				} else {
-					print "    $names[$i]\n";
+					print STDERR "    $names[$i]\n";
 				}
 			}
 			foreach my $re (@expr) {
@@ -760,17 +760,17 @@ sub check_undefined_symbols {
 		}
 		next if ($exact);
 
+		print "$file not found.\n" if (!$search_string || $found_string);
+
 		if ($hint && (!$search_string || $found_string)) {
 			my $what = $leaf{$leave}->{what};
 			$what =~ s/\xac/\n\t/g;
 			if ($leave ne "others") {
-				print "    more likely regexes:\n\t$what\n";
+				print STDERR "    more likely regexes:\n\t$what\n";
 			} else {
-				print "    tested regexes:\n\t$what\n";
+				print STDERR "    tested regexes:\n\t$what\n";
 			}
-			next;
 		}
-		print "$file not found.\n" if (!$search_string || $found_string);
 	}
 }
 
@@ -852,7 +852,7 @@ sub undefined_symbols {
 				}
 			}
 			if ($search_string && $added) {
-				print "What: $what\n" if ($what =~ m#$search_string#);
+				print STDERR "What: $what\n" if ($what =~ m#$search_string#);
 			}
 
 		}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/3] scripts: get_abi.pl: show progress
  2021-09-28 10:14 [PATCH 0/3] get_abi: improve message output and fix a regression Mauro Carvalho Chehab
  2021-09-28 10:14 ` [PATCH 1/3] scripts: get_abi.pl: use STDERR for search-string and show-hints Mauro Carvalho Chehab
@ 2021-09-28 10:14 ` Mauro Carvalho Chehab
  2021-09-28 10:14 ` [PATCH 3/3] ABI: evm: place a second what at the next line Mauro Carvalho Chehab
  2021-09-28 11:04 ` [PATCH 0/3] get_abi: improve message output and fix a regression Greg Kroah-Hartman
  3 siblings, 0 replies; 12+ messages in thread
From: Mauro Carvalho Chehab @ 2021-09-28 10:14 UTC (permalink / raw)
  To: Linux Doc Mailing List, Greg Kroah-Hartman
  Cc: Mauro Carvalho Chehab, Jonathan Corbet, linux-kernel

As parsing the sysfs entries can take a long time, add
progress information.

The progress logic will update the stats on every second,
or on 1% steps of the progress.

When STDERR is a console, it will use a single line, using
a VT-100 command to erase the line before rewriting it.
Otherwise, it will put one message on a separate line.
That would help to identify what parts of sysfs checking
that it is taking more time to process.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---

To mailbombing on a large number of people, only mailing lists were C/C on the cover.
See [PATCH 0/3] at: https://lore.kernel.org/all/cover.1632823172.git.mchehab+huawei@kernel.org/

 scripts/get_abi.pl | 163 ++++++++++++++++++++++++++++++---------------
 1 file changed, 108 insertions(+), 55 deletions(-)

diff --git a/scripts/get_abi.pl b/scripts/get_abi.pl
index a6c91f822363..841d889747c0 100755
--- a/scripts/get_abi.pl
+++ b/scripts/get_abi.pl
@@ -9,6 +9,7 @@ use utf8;
 use Pod::Usage qw(pod2usage);
 use Getopt::Long;
 use File::Find;
+use IO::Handle;
 use Fcntl ':mode';
 use Cwd 'abs_path';
 use Data::Dumper;
@@ -702,87 +703,137 @@ sub get_leave($)
 	return $leave;
 }
 
-sub check_undefined_symbols {
-	foreach my $file_ref (sort @files) {
-		my @names = @{$$file_ref{"__name"}};
-		my $file = $names[0];
+my @not_found;
 
-		my $exact = 0;
-		my $found_string;
+sub check_file($$)
+{
+	my $file_ref = shift;
+	my $names_ref = shift;
+	my @names = @{$names_ref};
+	my $file = $names[0];
 
-		my $leave = get_leave($file);
-		if (!defined($leaf{$leave})) {
-			$leave = "others";
+	my $found_string;
+
+	my $leave = get_leave($file);
+	if (!defined($leaf{$leave})) {
+		$leave = "others";
+	}
+	my @expr = @{$leaf{$leave}->{expr}};
+	die ("\rmissing rules for $leave") if (!defined($leaf{$leave}));
+
+	my $path = $file;
+	$path =~ s,(.*/).*,$1,;
+
+	if ($search_string) {
+		return if (!($file =~ m#$search_string#));
+		$found_string = 1;
+	}
+
+	for (my $i = 0; $i < @names; $i++) {
+		if ($found_string && $hint) {
+			if (!$i) {
+				print STDERR "--> $names[$i]\n";
+			} else {
+				print STDERR "    $names[$i]\n";
+			}
+		}
+		foreach my $re (@expr) {
+			print STDERR "$names[$i] =~ /^$re\$/\n" if ($debug && $dbg_undefined);
+			if ($names[$i] =~ $re) {
+				return;
+			}
 		}
+	}
+
+	if ($leave ne "others") {
 		my @expr = @{$leaf{$leave}->{expr}};
-		die ("missing rules for $leave") if (!defined($leaf{$leave}));
-
-		my $path = $file;
-		$path =~ s,(.*/).*,$1,;
-
-		if ($search_string) {
-			next if (!($file =~ m#$search_string#));
-			$found_string = 1;
-		}
-
 		for (my $i = 0; $i < @names; $i++) {
-			if ($found_string && $hint) {
-				if (!$i) {
-					print STDERR "--> $names[$i]\n";
-				} else {
-					print STDERR "    $names[$i]\n";
-				}
-			}
 			foreach my $re (@expr) {
-				print "$names[$i] =~ /^$re\$/\n" if ($debug && $dbg_undefined);
+				print STDERR "$names[$i] =~ /^$re\$/\n" if ($debug && $dbg_undefined);
 				if ($names[$i] =~ $re) {
-					$exact = 1;
-					last;
+					return;
 				}
 			}
-			last if ($exact);
 		}
-		next if ($exact);
+	}
 
+	push @not_found, $file if (!$search_string || $found_string);
+
+	if ($hint && (!$search_string || $found_string)) {
+		my $what = $leaf{$leave}->{what};
+		$what =~ s/\xac/\n\t/g;
 		if ($leave ne "others") {
-			my @expr = @{$leaf{$leave}->{expr}};
-			for (my $i = 0; $i < @names; $i++) {
-				foreach my $re (@expr) {
-					print "$names[$i] =~ /^$re\$/\n" if ($debug && $dbg_undefined);
-					if ($names[$i] =~ $re) {
-						$exact = 1;
-						last;
-					}
-				}
-				last if ($exact);
-			}
-			last if ($exact);
+			print STDERR "\r    more likely regexes:\n\t$what\n";
+		} else {
+			print STDERR "\r    tested regexes:\n\t$what\n";
 		}
-		next if ($exact);
-
-		print "$file not found.\n" if (!$search_string || $found_string);
-
-		if ($hint && (!$search_string || $found_string)) {
-			my $what = $leaf{$leave}->{what};
-			$what =~ s/\xac/\n\t/g;
-			if ($leave ne "others") {
-				print STDERR "    more likely regexes:\n\t$what\n";
-			} else {
-				print STDERR "    tested regexes:\n\t$what\n";
-			}
+	}
+}
+
+sub check_undefined_symbols {
+	my $num_files = scalar @files;
+	my $next_i = 0;
+	my $start_time = times;
+
+	my $last_time = $start_time;
+
+	# When either debug or hint is enabled, there's no sense showing
+	# progress, as the progress will be overriden.
+	if ($hint || ($debug && $dbg_undefined)) {
+		$next_i = $num_files;
+	}
+
+	my $is_console;
+	$is_console = 1 if (-t STDERR);
+
+	for (my $i = 0; $i < $num_files; $i++) {
+		my $file_ref = $files[$i];
+		my @names = @{$$file_ref{"__name"}};
+
+		check_file($file_ref, \@names);
+
+		my $cur_time = times;
+
+		if ($i == $next_i || $cur_time > $last_time + 1) {
+			my $percent = $i * 100 / $num_files;
+
+			my $tm = $cur_time - $start_time;
+			my $time = sprintf "%d:%02d", int($tm), 60 * ($tm - int($tm));
+
+			printf STDERR "\33[2K\r", if ($is_console);
+			printf STDERR "%s: processing sysfs files... %i%%: $names[0]", $time, $percent;
+			printf STDERR "\n", if (!$is_console);
+			STDERR->flush();
+
+			$next_i = int (($percent + 1) * $num_files / 100);
+			$last_time = $cur_time;
 		}
 	}
+
+	my $cur_time = times;
+	my $tm = $cur_time - $start_time;
+	my $time = sprintf "%d:%02d", int($tm), 60 * ($tm - int($tm));
+
+	printf STDERR "\33[2K\r", if ($is_console);
+	printf STDERR "%s: processing sysfs files... done\n", $time;
+
+	foreach my $file (@not_found) {
+		print "$file not found.\n";
+	}
 }
 
 sub undefined_symbols {
+	print STDERR "Reading $sysfs_prefix directory contents...";
 	find({
 		wanted =>\&parse_existing_sysfs,
 		preprocess =>\&dont_parse_special_attributes,
 		no_chdir => 1
 	     }, $sysfs_prefix);
+	print STDERR "done.\n";
 
 	$leaf{"others"}->{what} = "";
 
+	print STDERR "Converting ABI What fields into regexes...";
 	foreach my $w (sort keys %data) {
 		foreach my $what (split /\xac/,$w) {
 			next if (!($what =~ m/^$sysfs_prefix/));
@@ -871,6 +922,8 @@ sub undefined_symbols {
 		my $abs_file = $aliases{$link};
 		graph_add_link($abs_file, $link);
 	}
+	print STDERR "done.\n";
+
 	check_undefined_symbols;
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3] ABI: evm: place a second what at the next line
  2021-09-28 10:14 [PATCH 0/3] get_abi: improve message output and fix a regression Mauro Carvalho Chehab
  2021-09-28 10:14 ` [PATCH 1/3] scripts: get_abi.pl: use STDERR for search-string and show-hints Mauro Carvalho Chehab
  2021-09-28 10:14 ` [PATCH 2/3] scripts: get_abi.pl: show progress Mauro Carvalho Chehab
@ 2021-09-28 10:14 ` Mauro Carvalho Chehab
  2021-09-28 11:04 ` [PATCH 0/3] get_abi: improve message output and fix a regression Greg Kroah-Hartman
  3 siblings, 0 replies; 12+ messages in thread
From: Mauro Carvalho Chehab @ 2021-09-28 10:14 UTC (permalink / raw)
  To: Linux Doc Mailing List, Greg Kroah-Hartman
  Cc: Mauro Carvalho Chehab, Jonathan Corbet, Fabrice Gasnier,
	Mimi Zohar, Roberto Sassu, linux-kernel

Originally, get_abi.pl was using spaces to separate What: parameters,
but there are several references that declare things like:

	/sys/class/powercap/.../<power zone>/enabled

So, the logic was changes in order to properly address it.
That broke the second What added by
Changeset 18e49b304633 ("ABI: security: fix location for evm and ima_policy").

As the only file that defines multiple What: at the same line is
this file, let's move the second What: to a separate line.

Fixes: 18e49b304633 ("ABI: security: fix location for evm and ima_policy")
Fixes: ab9c14805b37 ("scripts: get_abi.pl: Better handle multiple What parameters")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---

To mailbombing on a large number of people, only mailing lists were C/C on the cover.
See [PATCH 0/3] at: https://lore.kernel.org/all/cover.1632823172.git.mchehab+huawei@kernel.org/

 Documentation/ABI/testing/evm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/evm b/Documentation/ABI/testing/evm
index 4b76a19b7bb4..44750a933db4 100644
--- a/Documentation/ABI/testing/evm
+++ b/Documentation/ABI/testing/evm
@@ -1,4 +1,5 @@
-What:		/sys/kernel/security/evm /sys/kernel/security/*/evm
+What:		/sys/kernel/security/evm
+What:		/sys/kernel/security/*/evm
 Date:		March 2011
 Contact:	Mimi Zohar <zohar@us.ibm.com>
 Description:
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] get_abi: improve message output and fix a regression
  2021-09-28 10:14 [PATCH 0/3] get_abi: improve message output and fix a regression Mauro Carvalho Chehab
                   ` (2 preceding siblings ...)
  2021-09-28 10:14 ` [PATCH 3/3] ABI: evm: place a second what at the next line Mauro Carvalho Chehab
@ 2021-09-28 11:04 ` Greg Kroah-Hartman
  2021-09-28 12:27   ` Mauro Carvalho Chehab
  3 siblings, 1 reply; 12+ messages in thread
From: Greg Kroah-Hartman @ 2021-09-28 11:04 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Doc Mailing List, linux-kernel, Jonathan Corbet

On Tue, Sep 28, 2021 at 12:14:01PM +0200, Mauro Carvalho Chehab wrote:
> Hi Greg,
> 
> As promised on
> 
> 	https://lore.kernel.org/lkml/20210928120304.62319fba@coco.lan/T/#u
> 
> I'm adding progress info when  get_abi.pl is checking for undefined ABI symbols
> on patches 1 and 2.
> 
> That will help not only to identify what is causing delays on the script, but also
> to notify the user that processing it could take some time on some systems.
> 
> If you run it on your big server with:
> 
>   scripts/get_abi.pl undefined 2>logs
> 
> The "logs" file will contain timestamps relative to the time the script started to
> do the regex matches for sysfs files. It should be printing one line every
> time the progress completes 1% or one second after the last progress output.

Adding more debugging and tweaking the script a bit to show the file it
is about to check, not the one it finished checking, I got the following
debug output that seems to pinpoint the problem file.

The sysfs file that is causing problems is:
	/sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap

and here's some debugging output for the regex it needs to search for
this:

/sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/amd\-iommu/cap$)$/
/sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/intel\-iommu/cap$)$/
/sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/
/sys/devices/pci0000:40/0000:40:07.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/amd\-iommu/cap$)$/
/sys/devices/pci0000:40/0000:40:07.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/intel\-iommu/cap$)$/
/sys/devices/pci0000:40/0000:40:07.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:02/device:7a/physical_node/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/amd\-iommu/cap$)$/
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:02/device:7a/physical_node/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/intel\-iommu/cap$)$/
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:02/device:7a/physical_node/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/
/sys/devices/pci0000:40/0000:40:01.3/0000:4a:00.0/0000:4b:0a.0/0000:50:00.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/amd\-iommu/cap$)$/
/sys/devices/pci0000:40/0000:40:01.3/0000:4a:00.0/0000:4b:0a.0/0000:50:00.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/intel\-iommu/cap$)$/
/sys/devices/pci0000:40/0000:40:01.3/0000:4a:00.0/0000:4b:0a.0/0000:50:00.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/


And sometimes this thing finishes in 20 seconds, and others, many many
minutes.  It's not deterministic at all, which is odd.  Is the sysfs
tree being sorted so that this should always have the same search order?

Anyway, I've applied this series as well, this helps in finding the
problems :)

Note, I can provide an off-list tarball of /sys/ if that would help in
debugging anything on your end.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] get_abi: improve message output and fix a regression
  2021-09-28 11:04 ` [PATCH 0/3] get_abi: improve message output and fix a regression Greg Kroah-Hartman
@ 2021-09-28 12:27   ` Mauro Carvalho Chehab
  2021-09-28 13:43     ` Mauro Carvalho Chehab
  2021-09-28 17:18     ` Greg Kroah-Hartman
  0 siblings, 2 replies; 12+ messages in thread
From: Mauro Carvalho Chehab @ 2021-09-28 12:27 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Linux Doc Mailing List, linux-kernel, Jonathan Corbet

Em Tue, 28 Sep 2021 13:04:22 +0200
Greg Kroah-Hartman <gregkh@linuxfoundation.org> escreveu:

> On Tue, Sep 28, 2021 at 12:14:01PM +0200, Mauro Carvalho Chehab wrote:
> > Hi Greg,
> > 
> > As promised on
> > 
> > 	https://lore.kernel.org/lkml/20210928120304.62319fba@coco.lan/T/#u
> > 
> > I'm adding progress info when  get_abi.pl is checking for undefined ABI symbols
> > on patches 1 and 2.
> > 
> > That will help not only to identify what is causing delays on the script, but also
> > to notify the user that processing it could take some time on some systems.
> > 
> > If you run it on your big server with:
> > 
> >   scripts/get_abi.pl undefined 2>logs
> > 
> > The "logs" file will contain timestamps relative to the time the script started to
> > do the regex matches for sysfs files. It should be printing one line every
> > time the progress completes 1% or one second after the last progress output.  
> 
> Adding more debugging and tweaking the script a bit to show the file it
> is about to check, not the one it finished checking,

Feel free to modify the script and add such debug/tweaks if you find
it useful. 

> I got the following
> debug output that seems to pinpoint the problem file.
> 
> The sysfs file that is causing problems is:
> 	/sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap
> 
> and here's some debugging output for the regex it needs to search for
> this:
> 
> /sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/amd\-iommu/cap$)$/
> /sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/intel\-iommu/cap$)$/
> /sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/
> /sys/devices/pci0000:40/0000:40:07.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/amd\-iommu/cap$)$/
> /sys/devices/pci0000:40/0000:40:07.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/intel\-iommu/cap$)$/
> /sys/devices/pci0000:40/0000:40:07.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/
> /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:02/device:7a/physical_node/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/amd\-iommu/cap$)$/
> /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:02/device:7a/physical_node/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/intel\-iommu/cap$)$/
> /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:02/device:7a/physical_node/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/
> /sys/devices/pci0000:40/0000:40:01.3/0000:4a:00.0/0000:4b:0a.0/0000:50:00.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/amd\-iommu/cap$)$/
> /sys/devices/pci0000:40/0000:40:01.3/0000:4a:00.0/0000:4b:0a.0/0000:50:00.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/intel\-iommu/cap$)$/
> /sys/devices/pci0000:40/0000:40:01.3/0000:4a:00.0/0000:4b:0a.0/0000:50:00.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/

Hmm... interesting. Perhaps the problem is on regexes like this:

	/^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/

Which actually represents this What:
	/sys/devices/pciXXXX:XX/0000:XX:XX.X/dma/dma<n>chan<n>/quickdata/cap

The script could have done a better job escaping "." character on
it (but that is not too trivial) and grouping altogether ".*" 
repetitions, although, in this specific case, probably the best
regex would be, instead:

	/sys/devices/pci[\da-f:\.]+/dma/dma\d+chan\d+/quickdata/cap

One possible long-term solution would be to directly use regexes 
directly on "What" fields inside Documentation/ABI, but on some parts
this would require some changes, like, for instance:

	/sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/kone/roccatkone<minor>/weight

Ok, we could likely use capture groups like:

	(?<NAME>pattern)

but IMO that would make it a lot harder to be understood by humans.


> And sometimes this thing finishes in 20 seconds, and others, many many
> minutes.  It's not deterministic at all, which is odd.  Is the sysfs
> tree being sorted so that this should always have the same search order?

No, because it uses a lot of hashes in order to speed it up. Yet,
it shouldn't be hard - nor it would significantly affect the processing
time - to make it more deterministic. See the enclosed path.

> Anyway, I've applied this series as well, this helps in finding the
> problems :)

Thanks!

> Note, I can provide an off-list tarball of /sys/ if that would help in
> debugging anything on your end.

Yeah, that can help. Feel free to send it to me.

Btw, I just got an arm64 server with 128 CPUs for testing. I'm trying
to allocate also a big x86 server here, but I'm not sure if it is AMD or
Intel.

Thanks,
Mauro

[PATCH] scripts: get_abi.pl: make undefined search more deterministic

Sort keys on hashes during undefined search, in order to
make the script more deterministic.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>

diff --git a/scripts/get_abi.pl b/scripts/get_abi.pl
index 841d889747c0..d32dcd7cca5d 100755
--- a/scripts/get_abi.pl
+++ b/scripts/get_abi.pl
@@ -775,6 +775,8 @@ sub check_undefined_symbols {
 	my $next_i = 0;
 	my $start_time = times;
 
+	@files = sort @files;
+
 	my $last_time = $start_time;
 
 	# When either debug or hint is enabled, there's no sense showing
@@ -909,16 +911,16 @@ sub undefined_symbols {
 		}
 	}
 	# Compile regexes
-	foreach my $l (keys %leaf) {
+	foreach my $l (sort keys %leaf) {
 		my @expr;
-		foreach my $w(split /\xac/, $leaf{$l}->{what}) {
+		foreach my $w(sort split /\xac/, $leaf{$l}->{what}) {
 			push @expr, qr /^$w$/;
 		}
 		$leaf{$l}->{expr} = \@expr;
 	}
 
 	# Take links into account
-	foreach my $link (keys %aliases) {
+	foreach my $link (sort keys %aliases) {
 		my $abs_file = $aliases{$link};
 		graph_add_link($abs_file, $link);
 	}


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] get_abi: improve message output and fix a regression
  2021-09-28 12:27   ` Mauro Carvalho Chehab
@ 2021-09-28 13:43     ` Mauro Carvalho Chehab
  2021-09-28 17:19       ` Greg Kroah-Hartman
  2021-09-28 17:18     ` Greg Kroah-Hartman
  1 sibling, 1 reply; 12+ messages in thread
From: Mauro Carvalho Chehab @ 2021-09-28 13:43 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Linux Doc Mailing List, linux-kernel, Jonathan Corbet

Em Tue, 28 Sep 2021 14:27:39 +0200
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> escreveu:

> Em Tue, 28 Sep 2021 13:04:22 +0200
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> escreveu:
> 
> > On Tue, Sep 28, 2021 at 12:14:01PM +0200, Mauro Carvalho Chehab wrote:  
> > > Hi Greg,
> > > 
> > > As promised on
> > > 
> > > 	https://lore.kernel.org/lkml/20210928120304.62319fba@coco.lan/T/#u
> > > 
> > > I'm adding progress info when  get_abi.pl is checking for undefined ABI symbols
> > > on patches 1 and 2.
> > > 
> > > That will help not only to identify what is causing delays on the script, but also
> > > to notify the user that processing it could take some time on some systems.
> > > 
> > > If you run it on your big server with:
> > > 
> > >   scripts/get_abi.pl undefined 2>logs
> > > 
> > > The "logs" file will contain timestamps relative to the time the script started to
> > > do the regex matches for sysfs files. It should be printing one line every
> > > time the progress completes 1% or one second after the last progress output.    
> > 
> > Adding more debugging and tweaking the script a bit to show the file it
> > is about to check, not the one it finished checking,  
> 
> Feel free to modify the script and add such debug/tweaks if you find
> it useful. 
> 
> > I got the following
> > debug output that seems to pinpoint the problem file.
> > 
> > The sysfs file that is causing problems is:
> > 	/sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap
> > 
> 
> Btw, I just got an arm64 server with 128 CPUs for testing. I'm trying
> to allocate also a big x86 server here, but I'm not sure if it is AMD or
> Intel.

Some tests on a Gigabyte R182-Z91-00 server, equipped with AMD EPYC 7352 
24-Core Processors (total 96 threads):

	$ find /sys |wc -l
	233981

	$ time ./scripts/get_abi.pl undefined >undefined 2>logs

	real	0m38.917s
	user	0m34.554s
	sys	0m4.292s

PS.: this machine doesn't have anything at /sys/class/iommu.

On a Huawei TaiShan 200 (Model 2280) with 128 ARM cores:

	$ find /sys |wc -l
	99362
	$ time ./scripts/get_abi.pl undefined >undefined 2>logs

	real	0m29.311s
	user	0m26.173s
	sys	0m3.061s

Both machines are using Perl 5.26.

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] get_abi: improve message output and fix a regression
  2021-09-28 12:27   ` Mauro Carvalho Chehab
  2021-09-28 13:43     ` Mauro Carvalho Chehab
@ 2021-09-28 17:18     ` Greg Kroah-Hartman
  2021-09-28 21:51       ` [PATCH] scripts: get_abi.pl: make undefined search more deterministic Mauro Carvalho Chehab
  2021-09-28 21:54       ` [PATCH 0/3] get_abi: improve message output and fix a regression Mauro Carvalho Chehab
  1 sibling, 2 replies; 12+ messages in thread
From: Greg Kroah-Hartman @ 2021-09-28 17:18 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Doc Mailing List, linux-kernel, Jonathan Corbet

On Tue, Sep 28, 2021 at 02:27:39PM +0200, Mauro Carvalho Chehab wrote:
> Em Tue, 28 Sep 2021 13:04:22 +0200
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> escreveu:
> 
> > On Tue, Sep 28, 2021 at 12:14:01PM +0200, Mauro Carvalho Chehab wrote:
> > > Hi Greg,
> > > 
> > > As promised on
> > > 
> > > 	https://lore.kernel.org/lkml/20210928120304.62319fba@coco.lan/T/#u
> > > 
> > > I'm adding progress info when  get_abi.pl is checking for undefined ABI symbols
> > > on patches 1 and 2.
> > > 
> > > That will help not only to identify what is causing delays on the script, but also
> > > to notify the user that processing it could take some time on some systems.
> > > 
> > > If you run it on your big server with:
> > > 
> > >   scripts/get_abi.pl undefined 2>logs
> > > 
> > > The "logs" file will contain timestamps relative to the time the script started to
> > > do the regex matches for sysfs files. It should be printing one line every
> > > time the progress completes 1% or one second after the last progress output.  
> > 
> > Adding more debugging and tweaking the script a bit to show the file it
> > is about to check, not the one it finished checking,
> 
> Feel free to modify the script and add such debug/tweaks if you find
> it useful. 
> 
> > I got the following
> > debug output that seems to pinpoint the problem file.
> > 
> > The sysfs file that is causing problems is:
> > 	/sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap
> > 
> > and here's some debugging output for the regex it needs to search for
> > this:
> > 
> > /sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/amd\-iommu/cap$)$/
> > /sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/intel\-iommu/cap$)$/
> > /sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/
> > /sys/devices/pci0000:40/0000:40:07.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/amd\-iommu/cap$)$/
> > /sys/devices/pci0000:40/0000:40:07.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/intel\-iommu/cap$)$/
> > /sys/devices/pci0000:40/0000:40:07.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/
> > /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:02/device:7a/physical_node/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/amd\-iommu/cap$)$/
> > /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:02/device:7a/physical_node/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/intel\-iommu/cap$)$/
> > /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:02/device:7a/physical_node/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/
> > /sys/devices/pci0000:40/0000:40:01.3/0000:4a:00.0/0000:4b:0a.0/0000:50:00.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/amd\-iommu/cap$)$/
> > /sys/devices/pci0000:40/0000:40:01.3/0000:4a:00.0/0000:4b:0a.0/0000:50:00.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/intel\-iommu/cap$)$/
> > /sys/devices/pci0000:40/0000:40:01.3/0000:4a:00.0/0000:4b:0a.0/0000:50:00.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/
> 
> Hmm... interesting. Perhaps the problem is on regexes like this:
> 
> 	/^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/
> 
> Which actually represents this What:
> 	/sys/devices/pciXXXX:XX/0000:XX:XX.X/dma/dma<n>chan<n>/quickdata/cap
> 
> The script could have done a better job escaping "." character on
> it (but that is not too trivial) and grouping altogether ".*" 
> repetitions, although, in this specific case, probably the best
> regex would be, instead:
> 
> 	/sys/devices/pci[\da-f:\.]+/dma/dma\d+chan\d+/quickdata/cap
> 
> One possible long-term solution would be to directly use regexes 
> directly on "What" fields inside Documentation/ABI, but on some parts
> this would require some changes, like, for instance:
> 
> 	/sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/kone/roccatkone<minor>/weight
> 
> Ok, we could likely use capture groups like:
> 
> 	(?<NAME>pattern)
> 
> but IMO that would make it a lot harder to be understood by humans.
> 
> 
> > And sometimes this thing finishes in 20 seconds, and others, many many
> > minutes.  It's not deterministic at all, which is odd.  Is the sysfs
> > tree being sorted so that this should always have the same search order?
> 
> No, because it uses a lot of hashes in order to speed it up. Yet,
> it shouldn't be hard - nor it would significantly affect the processing
> time - to make it more deterministic. See the enclosed path.

That patch solved everything.

It now only takes 10 seconds.  Every time.  Without the patch, it feels
hung for some reason.

Care to turn that into a patch that I can take?

> > Anyway, I've applied this series as well, this helps in finding the
> > problems :)
> 
> Thanks!
> 
> > Note, I can provide an off-list tarball of /sys/ if that would help in
> > debugging anything on your end.
> 
> Yeah, that can help. Feel free to send it to me.
> 
> Btw, I just got an arm64 server with 128 CPUs for testing. I'm trying
> to allocate also a big x86 server here, but I'm not sure if it is AMD or
> Intel.

This is on perl 5.34 here.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] get_abi: improve message output and fix a regression
  2021-09-28 13:43     ` Mauro Carvalho Chehab
@ 2021-09-28 17:19       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 12+ messages in thread
From: Greg Kroah-Hartman @ 2021-09-28 17:19 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Doc Mailing List, linux-kernel, Jonathan Corbet

On Tue, Sep 28, 2021 at 03:43:53PM +0200, Mauro Carvalho Chehab wrote:
> Em Tue, 28 Sep 2021 14:27:39 +0200
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> escreveu:
> 
> > Em Tue, 28 Sep 2021 13:04:22 +0200
> > Greg Kroah-Hartman <gregkh@linuxfoundation.org> escreveu:
> > 
> > > On Tue, Sep 28, 2021 at 12:14:01PM +0200, Mauro Carvalho Chehab wrote:  
> > > > Hi Greg,
> > > > 
> > > > As promised on
> > > > 
> > > > 	https://lore.kernel.org/lkml/20210928120304.62319fba@coco.lan/T/#u
> > > > 
> > > > I'm adding progress info when  get_abi.pl is checking for undefined ABI symbols
> > > > on patches 1 and 2.
> > > > 
> > > > That will help not only to identify what is causing delays on the script, but also
> > > > to notify the user that processing it could take some time on some systems.
> > > > 
> > > > If you run it on your big server with:
> > > > 
> > > >   scripts/get_abi.pl undefined 2>logs
> > > > 
> > > > The "logs" file will contain timestamps relative to the time the script started to
> > > > do the regex matches for sysfs files. It should be printing one line every
> > > > time the progress completes 1% or one second after the last progress output.    
> > > 
> > > Adding more debugging and tweaking the script a bit to show the file it
> > > is about to check, not the one it finished checking,  
> > 
> > Feel free to modify the script and add such debug/tweaks if you find
> > it useful. 
> > 
> > > I got the following
> > > debug output that seems to pinpoint the problem file.
> > > 
> > > The sysfs file that is causing problems is:
> > > 	/sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap
> > > 
> > 
> > Btw, I just got an arm64 server with 128 CPUs for testing. I'm trying
> > to allocate also a big x86 server here, but I'm not sure if it is AMD or
> > Intel.
> 
> Some tests on a Gigabyte R182-Z91-00 server, equipped with AMD EPYC 7352 
> 24-Core Processors (total 96 threads):
> 
> 	$ find /sys |wc -l
> 	233981
> 
> 	$ time ./scripts/get_abi.pl undefined >undefined 2>logs
> 
> 	real	0m38.917s
> 	user	0m34.554s
> 	sys	0m4.292s
> 
> PS.: this machine doesn't have anything at /sys/class/iommu.
> 
> On a Huawei TaiShan 200 (Model 2280) with 128 ARM cores:
> 
> 	$ find /sys |wc -l
> 	99362
> 	$ time ./scripts/get_abi.pl undefined >undefined 2>logs
> 
> 	real	0m29.311s
> 	user	0m26.173s
> 	sys	0m3.061s
> 
> Both machines are using Perl 5.26.

Try it with your sorting patch, that fixed it for me and now it runs in
10 seconds.

crazy.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] scripts: get_abi.pl: make undefined search more deterministic
  2021-09-28 17:18     ` Greg Kroah-Hartman
@ 2021-09-28 21:51       ` Mauro Carvalho Chehab
  2021-09-29  7:13         ` Greg Kroah-Hartman
  2021-09-28 21:54       ` [PATCH 0/3] get_abi: improve message output and fix a regression Mauro Carvalho Chehab
  1 sibling, 1 reply; 12+ messages in thread
From: Mauro Carvalho Chehab @ 2021-09-28 21:51 UTC (permalink / raw)
  To: Linux Doc Mailing List, Greg Kroah-Hartman
  Cc: Mauro Carvalho Chehab, Jonathan Corbet, linux-kernel

Sort keys on hashes during undefined search, in order to
make the script more deterministic.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/get_abi.pl | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/scripts/get_abi.pl b/scripts/get_abi.pl
index 841d889747c0..d32dcd7cca5d 100755
--- a/scripts/get_abi.pl
+++ b/scripts/get_abi.pl
@@ -775,6 +775,8 @@ sub check_undefined_symbols {
 	my $next_i = 0;
 	my $start_time = times;
 
+	@files = sort @files;
+
 	my $last_time = $start_time;
 
 	# When either debug or hint is enabled, there's no sense showing
@@ -909,16 +911,16 @@ sub undefined_symbols {
 		}
 	}
 	# Compile regexes
-	foreach my $l (keys %leaf) {
+	foreach my $l (sort keys %leaf) {
 		my @expr;
-		foreach my $w(split /\xac/, $leaf{$l}->{what}) {
+		foreach my $w(sort split /\xac/, $leaf{$l}->{what}) {
 			push @expr, qr /^$w$/;
 		}
 		$leaf{$l}->{expr} = \@expr;
 	}
 
 	# Take links into account
-	foreach my $link (keys %aliases) {
+	foreach my $link (sort keys %aliases) {
 		my $abs_file = $aliases{$link};
 		graph_add_link($abs_file, $link);
 	}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] get_abi: improve message output and fix a regression
  2021-09-28 17:18     ` Greg Kroah-Hartman
  2021-09-28 21:51       ` [PATCH] scripts: get_abi.pl: make undefined search more deterministic Mauro Carvalho Chehab
@ 2021-09-28 21:54       ` Mauro Carvalho Chehab
  1 sibling, 0 replies; 12+ messages in thread
From: Mauro Carvalho Chehab @ 2021-09-28 21:54 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Linux Doc Mailing List, linux-kernel, Jonathan Corbet

Em Tue, 28 Sep 2021 19:18:31 +0200
Greg Kroah-Hartman <gregkh@linuxfoundation.org> escreveu:

> On Tue, Sep 28, 2021 at 02:27:39PM +0200, Mauro Carvalho Chehab wrote:
> > Em Tue, 28 Sep 2021 13:04:22 +0200
> > Greg Kroah-Hartman <gregkh@linuxfoundation.org> escreveu:
> >   
> > > On Tue, Sep 28, 2021 at 12:14:01PM +0200, Mauro Carvalho Chehab wrote:  
> > > > Hi Greg,
> > > > 
> > > > As promised on
> > > > 
> > > > 	https://lore.kernel.org/lkml/20210928120304.62319fba@coco.lan/T/#u
> > > > 
> > > > I'm adding progress info when  get_abi.pl is checking for undefined ABI symbols
> > > > on patches 1 and 2.
> > > > 
> > > > That will help not only to identify what is causing delays on the script, but also
> > > > to notify the user that processing it could take some time on some systems.
> > > > 
> > > > If you run it on your big server with:
> > > > 
> > > >   scripts/get_abi.pl undefined 2>logs
> > > > 
> > > > The "logs" file will contain timestamps relative to the time the script started to
> > > > do the regex matches for sysfs files. It should be printing one line every
> > > > time the progress completes 1% or one second after the last progress output.    
> > > 
> > > Adding more debugging and tweaking the script a bit to show the file it
> > > is about to check, not the one it finished checking,  
> > 
> > Feel free to modify the script and add such debug/tweaks if you find
> > it useful. 
> >   
> > > I got the following
> > > debug output that seems to pinpoint the problem file.
> > > 
> > > The sysfs file that is causing problems is:
> > > 	/sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap
> > > 
> > > and here's some debugging output for the regex it needs to search for
> > > this:
> > > 
> > > /sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/amd\-iommu/cap$)$/
> > > /sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/intel\-iommu/cap$)$/
> > > /sys/devices/pci0000:40/0000:40:00.2/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/
> > > /sys/devices/pci0000:40/0000:40:07.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/amd\-iommu/cap$)$/
> > > /sys/devices/pci0000:40/0000:40:07.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/intel\-iommu/cap$)$/
> > > /sys/devices/pci0000:40/0000:40:07.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/
> > > /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:02/device:7a/physical_node/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/amd\-iommu/cap$)$/
> > > /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:02/device:7a/physical_node/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/intel\-iommu/cap$)$/
> > > /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:02/device:7a/physical_node/iommu/ivhd1/amd-iommu/cap =~ /^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/
> > > /sys/devices/pci0000:40/0000:40:01.3/0000:4a:00.0/0000:4b:0a.0/0000:50:00.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/amd\-iommu/cap$)$/
> > > /sys/devices/pci0000:40/0000:40:01.3/0000:4a:00.0/0000:4b:0a.0/0000:50:00.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/class/iommu/.*/intel\-iommu/cap$)$/
> > > /sys/devices/pci0000:40/0000:40:01.3/0000:4a:00.0/0000:4b:0a.0/0000:50:00.0/iommu/amd-iommu/cap =~ /^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/  
> > 
> > Hmm... interesting. Perhaps the problem is on regexes like this:
> > 
> > 	/^(?^:^/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap$)$/
> > 
> > Which actually represents this What:
> > 	/sys/devices/pciXXXX:XX/0000:XX:XX.X/dma/dma<n>chan<n>/quickdata/cap
> > 
> > The script could have done a better job escaping "." character on
> > it (but that is not too trivial) and grouping altogether ".*" 
> > repetitions, although, in this specific case, probably the best
> > regex would be, instead:
> > 
> > 	/sys/devices/pci[\da-f:\.]+/dma/dma\d+chan\d+/quickdata/cap
> > 
> > One possible long-term solution would be to directly use regexes 
> > directly on "What" fields inside Documentation/ABI, but on some parts
> > this would require some changes, like, for instance:
> > 
> > 	/sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/kone/roccatkone<minor>/weight
> > 
> > Ok, we could likely use capture groups like:
> > 
> > 	(?<NAME>pattern)
> > 
> > but IMO that would make it a lot harder to be understood by humans.
> > 
> >   
> > > And sometimes this thing finishes in 20 seconds, and others, many many
> > > minutes.  It's not deterministic at all, which is odd.  Is the sysfs
> > > tree being sorted so that this should always have the same search order?  
> > 
> > No, because it uses a lot of hashes in order to speed it up. Yet,
> > it shouldn't be hard - nor it would significantly affect the processing
> > time - to make it more deterministic. See the enclosed path.  
> 
> That patch solved everything.
> 
> It now only takes 10 seconds.  Every time.  Without the patch, it feels
> hung for some reason.

The explanation is simple  - still weird :-) - basically, when an ABI
symbol is found, the regex test loop stops.

Sorting the regexes probably placed the slowest regex to happen
*after* the one that matched the ABI symbol.

> Care to turn that into a patch that I can take?

Sure. Just sent it.

> 
> > > Anyway, I've applied this series as well, this helps in finding the
> > > problems :)  
> > 
> > Thanks!
> >   
> > > Note, I can provide an off-list tarball of /sys/ if that would help in
> > > debugging anything on your end.  
> > 
> > Yeah, that can help. Feel free to send it to me.
> > 
> > Btw, I just got an arm64 server with 128 CPUs for testing. I'm trying
> > to allocate also a big x86 server here, but I'm not sure if it is AMD or
> > Intel.  
> 
> This is on perl 5.34 here.
> 
> thanks,
> 
> greg k-h



Thanks,
Mauro

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] scripts: get_abi.pl: make undefined search more deterministic
  2021-09-28 21:51       ` [PATCH] scripts: get_abi.pl: make undefined search more deterministic Mauro Carvalho Chehab
@ 2021-09-29  7:13         ` Greg Kroah-Hartman
  0 siblings, 0 replies; 12+ messages in thread
From: Greg Kroah-Hartman @ 2021-09-29  7:13 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Doc Mailing List, Jonathan Corbet, linux-kernel

On Tue, Sep 28, 2021 at 11:51:32PM +0200, Mauro Carvalho Chehab wrote:
> Sort keys on hashes during undefined search, in order to
> make the script more deterministic.
> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
>  scripts/get_abi.pl | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/scripts/get_abi.pl b/scripts/get_abi.pl
> index 841d889747c0..d32dcd7cca5d 100755
> --- a/scripts/get_abi.pl
> +++ b/scripts/get_abi.pl
> @@ -775,6 +775,8 @@ sub check_undefined_symbols {
>  	my $next_i = 0;
>  	my $start_time = times;
>  
> +	@files = sort @files;
> +
>  	my $last_time = $start_time;
>  
>  	# When either debug or hint is enabled, there's no sense showing
> @@ -909,16 +911,16 @@ sub undefined_symbols {
>  		}
>  	}
>  	# Compile regexes
> -	foreach my $l (keys %leaf) {
> +	foreach my $l (sort keys %leaf) {
>  		my @expr;
> -		foreach my $w(split /\xac/, $leaf{$l}->{what}) {
> +		foreach my $w(sort split /\xac/, $leaf{$l}->{what}) {
>  			push @expr, qr /^$w$/;
>  		}
>  		$leaf{$l}->{expr} = \@expr;
>  	}
>  
>  	# Take links into account
> -	foreach my $link (keys %aliases) {
> +	foreach my $link (sort keys %aliases) {
>  		my $abs_file = $aliases{$link};
>  		graph_add_link($abs_file, $link);
>  	}
> -- 
> 2.31.1
> 

Much better, now we are at a reproducable 10 seconds on my large box.
Thanks for all the work on this, now to go and start adding the missing
documentation :)

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-09-29  7:13 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-28 10:14 [PATCH 0/3] get_abi: improve message output and fix a regression Mauro Carvalho Chehab
2021-09-28 10:14 ` [PATCH 1/3] scripts: get_abi.pl: use STDERR for search-string and show-hints Mauro Carvalho Chehab
2021-09-28 10:14 ` [PATCH 2/3] scripts: get_abi.pl: show progress Mauro Carvalho Chehab
2021-09-28 10:14 ` [PATCH 3/3] ABI: evm: place a second what at the next line Mauro Carvalho Chehab
2021-09-28 11:04 ` [PATCH 0/3] get_abi: improve message output and fix a regression Greg Kroah-Hartman
2021-09-28 12:27   ` Mauro Carvalho Chehab
2021-09-28 13:43     ` Mauro Carvalho Chehab
2021-09-28 17:19       ` Greg Kroah-Hartman
2021-09-28 17:18     ` Greg Kroah-Hartman
2021-09-28 21:51       ` [PATCH] scripts: get_abi.pl: make undefined search more deterministic Mauro Carvalho Chehab
2021-09-29  7:13         ` Greg Kroah-Hartman
2021-09-28 21:54       ` [PATCH 0/3] get_abi: improve message output and fix a regression Mauro Carvalho Chehab

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.