linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] leaking_addresses: simplify and optimize
@ 2018-02-19  2:50 Tobin C. Harding
  2018-02-19  2:50 ` [PATCH 1/4] leaking_addresses: do not parse binary files Tobin C. Harding
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Tobin C. Harding @ 2018-02-19  2:50 UTC (permalink / raw)
  To: Kernel Hardening; +Cc: Tobin C. Harding, Tycho Andersen, LKML

leaking_addresses.pl is currently woefully slow.  This series fixes
that.  Also, currently configuring which files/directories to skip is
overly complicated.  We can simplify the code configuration quite easily
by combining the configuration arrays.

This series does not apply on top of the mainline but is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/tobin/leaks.git leaks-testing


thanks,
Tobin.

Tobin C. Harding (4):
  leaking_addresses: do not parse binary files
  leaking_addresses: simplify path skipping
  leaking_addresses: cache architecture name
  leaking_addresses: add scan_once array

 scripts/leaking_addresses.pl | 125 ++++++++++++++++++++++---------------------
 1 file changed, 64 insertions(+), 61 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/4] leaking_addresses: do not parse binary files
  2018-02-19  2:50 [PATCH 0/4] leaking_addresses: simplify and optimize Tobin C. Harding
@ 2018-02-19  2:50 ` Tobin C. Harding
  2018-02-19  2:50 ` [PATCH 2/4] leaking_addresses: simplify path skipping Tobin C. Harding
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Tobin C. Harding @ 2018-02-19  2:50 UTC (permalink / raw)
  To: Kernel Hardening; +Cc: Tobin C. Harding, Tycho Andersen, LKML

Currently script parses binary files.  Since we are scanning for
readable kernel addresses there is no need to parse binary files.  We
can use Perl to check if file is binary and skip parsing it if so.

Do not parse binary files.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
---
 scripts/leaking_addresses.pl | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/scripts/leaking_addresses.pl b/scripts/leaking_addresses.pl
index 05906f6cf6b9..3d5c3096aac8 100755
--- a/scripts/leaking_addresses.pl
+++ b/scripts/leaking_addresses.pl
@@ -462,6 +462,10 @@ sub parse_file
 		return;
 	}
 
+	if (! -T $file) {
+		return;
+	}
+
 	if (skip_parse($file)) {
 		dprint "skipping file: $file\n";
 		return;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/4] leaking_addresses: simplify path skipping
  2018-02-19  2:50 [PATCH 0/4] leaking_addresses: simplify and optimize Tobin C. Harding
  2018-02-19  2:50 ` [PATCH 1/4] leaking_addresses: do not parse binary files Tobin C. Harding
@ 2018-02-19  2:50 ` Tobin C. Harding
  2018-02-26  1:26   ` Tycho Andersen
  2018-02-19  2:50 ` [PATCH 3/4] leaking_addresses: cache architecture name Tobin C. Harding
  2018-02-19  2:50 ` [PATCH 4/4] leaking_addresses: add scan_once array Tobin C. Harding
  3 siblings, 1 reply; 9+ messages in thread
From: Tobin C. Harding @ 2018-02-19  2:50 UTC (permalink / raw)
  To: Kernel Hardening; +Cc: Tobin C. Harding, Tycho Andersen, LKML

Currently script has multiple configuration arrays.  This is confusing,
evident by the fact that a bunch of the entries are in the wrong place.
We can simplify the code by just having a single array for absolute
paths to skip and a single array for file names to skip wherever they
appear in the scanned directory tree.  There are also currently multiple
subroutines to handle the different arrays, we can reduce these to a
single subroutine also.

Simplify the path skipping code.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
---
 scripts/leaking_addresses.pl | 90 ++++++++++++++------------------------------
 1 file changed, 29 insertions(+), 61 deletions(-)

diff --git a/scripts/leaking_addresses.pl b/scripts/leaking_addresses.pl
index 3d5c3096aac8..e7bf15a45a69 100755
--- a/scripts/leaking_addresses.pl
+++ b/scripts/leaking_addresses.pl
@@ -48,41 +48,26 @@ my $kernel_config_file = "";	# Kernel configuration file.
 my $opt_32bit = 0;		# Scan 32-bit kernel.
 my $page_offset_32bit = 0;	# Page offset for 32-bit kernel.
 
-# Do not parse these files (absolute path).
-my @skip_parse_files_abs = ('/proc/kmsg',
-			    '/proc/kcore',
-			    '/proc/fs/ext4/sdb1/mb_groups',
-			    '/proc/1/fd/3',
-			    '/sys/firmware/devicetree',
-			    '/proc/device-tree',
-			    '/sys/kernel/debug/tracing/trace_pipe',
-			    '/sys/kernel/security/apparmor/revision');
-
-# Do not parse these files under any subdirectory.
-my @skip_parse_files_any = ('0',
-			    '1',
-			    '2',
-			    'pagemap',
-			    'events',
-			    'access',
-			    'registers',
-			    'snapshot_raw',
-			    'trace_pipe_raw',
-			    'ptmx',
-			    'trace_pipe');
-
-# Do not walk these directories (absolute path).
-my @skip_walk_dirs_abs = ();
-
-# Do not walk these directories under any subdirectory.
-my @skip_walk_dirs_any = ('self',
-			  'thread-self',
-			  'cwd',
-			  'fd',
-			  'usbmon',
-			  'stderr',
-			  'stdin',
-			  'stdout');
+# Skip these absolute paths.
+my @skip_abs = (
+	'/proc/kmsg',
+	'/sys/firmware/devicetree',
+	'/proc/device-tree',
+	'/sys/kernel/debug/tracing/trace_pipe',
+	'/sys/kernel/security/apparmor/revision');
+
+# Skip these under any subdirectory.
+my @skip_any = (
+	'pagemap',
+	'events',
+	'access',
+	'registers',
+	'snapshot_raw',
+	'trace_pipe_raw',
+	'ptmx',
+	'trace_pipe',
+	'fd',
+	'usbmon');
 
 sub help
 {
@@ -417,26 +402,20 @@ sub parse_dmesg
 # True if we should skip this path.
 sub skip
 {
-	my ($path, $paths_abs, $paths_any) = @_;
+	my ($path) = @_;
 
-	foreach (@$paths_abs) {
+	foreach (@skip_abs) {
 		return 1 if (/^$path$/);
 	}
 
 	my($filename, $dirs, $suffix) = fileparse($path);
-	foreach (@$paths_any) {
+	foreach (@skip_any) {
 		return 1 if (/^$filename$/);
 	}
 
 	return 0;
 }
 
-sub skip_parse
-{
-	my ($path) = @_;
-	return skip($path, \@skip_parse_files_abs, \@skip_parse_files_any);
-}
-
 sub timed_parse_file
 {
 	my ($file) = @_;
@@ -466,12 +445,6 @@ sub parse_file
 		return;
 	}
 
-	if (skip_parse($file)) {
-		dprint "skipping file: $file\n";
-		return;
-	}
-	dprint "parsing: $file\n";
-
 	open my $fh, "<", $file or return;
 	while ( <$fh> ) {
 		if (may_leak_address($_)) {
@@ -481,21 +454,12 @@ sub parse_file
 	close $fh;
 }
 
-
-# True if we should skip walking this directory.
-sub skip_walk
-{
-	my ($path) = @_;
-	return skip($path, \@skip_walk_dirs_abs, \@skip_walk_dirs_any)
-}
-
 # Recursively walk directory tree.
 sub walk
 {
 	my @dirs = @_;
 
 	while (my $pwd = shift @dirs) {
-		next if (skip_walk($pwd));
 		next if (!opendir(DIR, $pwd));
 		my @files = readdir(DIR);
 		closedir(DIR);
@@ -506,11 +470,15 @@ sub walk
 			my $path = "$pwd/$file";
 			next if (-l $path);
 
+			next if (skip($path));
+
 			if (-d $path) {
 				push @dirs, $path;
-			} else {
-				timed_parse_file($path);
+				next;
 			}
+
+			dprint "parsing: $file\n";
+			timed_parse_file($path);
 		}
 	}
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/4] leaking_addresses: cache architecture name
  2018-02-19  2:50 [PATCH 0/4] leaking_addresses: simplify and optimize Tobin C. Harding
  2018-02-19  2:50 ` [PATCH 1/4] leaking_addresses: do not parse binary files Tobin C. Harding
  2018-02-19  2:50 ` [PATCH 2/4] leaking_addresses: simplify path skipping Tobin C. Harding
@ 2018-02-19  2:50 ` Tobin C. Harding
  2018-02-19  2:50 ` [PATCH 4/4] leaking_addresses: add scan_once array Tobin C. Harding
  3 siblings, 0 replies; 9+ messages in thread
From: Tobin C. Harding @ 2018-02-19  2:50 UTC (permalink / raw)
  To: Kernel Hardening; +Cc: Tobin C. Harding, Tycho Andersen, LKML

Currently we are repeatedly calling `uname -m`.  This is causing the
script to take a long time to run (more than 10 seconds to parse
/proc/kallsyms).  We can use Perl state variables to cache the result of
the first call to `uname -m`.  With this change in place the script
scans the whole kernel in under a minute.

Cache machine architecture in state variable.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
---
 scripts/leaking_addresses.pl | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/scripts/leaking_addresses.pl b/scripts/leaking_addresses.pl
index e7bf15a45a69..f52e91ef7d5c 100755
--- a/scripts/leaking_addresses.pl
+++ b/scripts/leaking_addresses.pl
@@ -175,7 +175,7 @@ sub is_32bit
 
 sub is_ix86_32
 {
-       my $arch = `uname -m`;
+       state $arch = `uname -m`;
 
        chomp $arch;
        if ($arch =~ m/i[3456]86/) {
@@ -198,12 +198,14 @@ sub is_arch
 
 sub is_x86_64
 {
-	return is_arch('x86_64');
+	state $is = is_arch('x86_64');
+	return $is;
 }
 
 sub is_ppc64
 {
-	return is_arch('ppc64');
+	state $is = is_arch('ppc64');
+	return $is;
 }
 
 # Gets config option value from kernel config file.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/4] leaking_addresses: add scan_once array
  2018-02-19  2:50 [PATCH 0/4] leaking_addresses: simplify and optimize Tobin C. Harding
                   ` (2 preceding siblings ...)
  2018-02-19  2:50 ` [PATCH 3/4] leaking_addresses: cache architecture name Tobin C. Harding
@ 2018-02-19  2:50 ` Tobin C. Harding
  2018-02-26  1:09   ` Tycho Andersen
  3 siblings, 1 reply; 9+ messages in thread
From: Tobin C. Harding @ 2018-02-19  2:50 UTC (permalink / raw)
  To: Kernel Hardening; +Cc: Tobin C. Harding, Tycho Andersen, LKML

There are files under /proc that have the same format for each PID, e.g
'smaps'.  We need only scan these files a single time to verify that
they are not leaking addresses.  This reduces the work the script must
do.

Add once_only array.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
---
 scripts/leaking_addresses.pl | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/scripts/leaking_addresses.pl b/scripts/leaking_addresses.pl
index f52e91ef7d5c..ab4e70d9efde 100755
--- a/scripts/leaking_addresses.pl
+++ b/scripts/leaking_addresses.pl
@@ -69,6 +69,12 @@ my @skip_any = (
 	'fd',
 	'usbmon');
 
+# These files are the same format under each PID that they appear.
+# We need only pass them once.
+my @once_only = (
+	'smaps',
+	'mb_groups');
+
 sub help
 {
 	my ($exitcode) = @_;
@@ -401,6 +407,25 @@ sub parse_dmesg
 	close $cmd;
 }
 
+sub already_scanned
+{
+	my ($filename) = @_;
+	state %seen;
+
+	foreach (@once_only) {
+		if (/^$filename$/) {
+			if ($seen{$_} == 1) {
+				return 1;
+			}
+			$seen{$_} = 1;
+
+			return 0;
+		}
+	}
+
+	return 0;
+}
+
 # True if we should skip this path.
 sub skip
 {
@@ -415,6 +440,10 @@ sub skip
 		return 1 if (/^$filename$/);
 	}
 
+	if (already_scanned($filename)) {
+		return 1;
+	}
+
 	return 0;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 4/4] leaking_addresses: add scan_once array
  2018-02-19  2:50 ` [PATCH 4/4] leaking_addresses: add scan_once array Tobin C. Harding
@ 2018-02-26  1:09   ` Tycho Andersen
  2018-02-26  2:01     ` Tobin C. Harding
  0 siblings, 1 reply; 9+ messages in thread
From: Tycho Andersen @ 2018-02-26  1:09 UTC (permalink / raw)
  To: Tobin C. Harding; +Cc: Kernel Hardening, LKML

Hi Tobin,

On Mon, Feb 19, 2018 at 01:50:49PM +1100, Tobin C. Harding wrote:
> +sub already_scanned
> +{
> +	my ($filename) = @_;
> +	state %seen;
> +
> +	foreach (@once_only) {
> +		if (/^$filename$/) {
> +			if ($seen{$_} == 1) {

This should be something like,

if (($seen{$_} //= 0) == 1) {

otherwise I get a bunch of uninitialized warnings,

Use of uninitialized value in pattern match (m//) at /usr/share/perl/5.26/Math/BigInt.pm line 1199.
    Math::BigInt::bcmp(Math::BigInt=HASH(0x55dc2f7e4580), undef) called at /usr/share/perl/5.26/Math/BigInt.pm line 1257
    Math::BigInt::beq(Math::BigInt=HASH(0x55dc2f7e4580), undef) called at /usr/share/perl/5.26/Math/BigInt.pm line 105
    Math::BigInt::__ANON__(Math::BigInt=HASH(0x55dc2f7e4580), undef, 1) called at ./leaking_addresses.pl line 422
    main::already_scanned("smaps") called at ./leaking_addresses.pl line 448
    main::skip("/proc/1/smaps") called at ./leaking_addresses.pl line 509
    main::walk("/proc", "/sys") called at ./leaking_addresses.pl line 159

Tycho

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/4] leaking_addresses: simplify path skipping
  2018-02-19  2:50 ` [PATCH 2/4] leaking_addresses: simplify path skipping Tobin C. Harding
@ 2018-02-26  1:26   ` Tycho Andersen
  2018-02-26  2:00     ` Tobin C. Harding
  0 siblings, 1 reply; 9+ messages in thread
From: Tycho Andersen @ 2018-02-26  1:26 UTC (permalink / raw)
  To: Tobin C. Harding; +Cc: Kernel Hardening, LKML

Hi Tobin,

On Mon, Feb 19, 2018 at 01:50:47PM +1100, Tobin C. Harding wrote:
> -# Do not parse these files under any subdirectory.
> -my @skip_parse_files_any = ('0',
> -			    '1',
> -			    '2',
> -			    'pagemap',
> -			    'events',
> -			    'access',
> -			    'registers',
> -			    'snapshot_raw',
> -			    'trace_pipe_raw',
> -			    'ptmx',
> -			    'trace_pipe');

It might be worth adding 'syscall' here; the pointers listed are user
pointers, and negative syscall args will show up like kernel pointers,
e.g. I get this output, which is spurious:

/proc/31808/syscall: 0 0x3 0x55b107a38180 0x2000 0xffffffffffffffb0 0x55b107a302d0 0x55b107a38180 0x7fffa313b8e8 0x7ff098560d11

Cheers,

Tycho

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/4] leaking_addresses: simplify path skipping
  2018-02-26  1:26   ` Tycho Andersen
@ 2018-02-26  2:00     ` Tobin C. Harding
  0 siblings, 0 replies; 9+ messages in thread
From: Tobin C. Harding @ 2018-02-26  2:00 UTC (permalink / raw)
  To: Tycho Andersen; +Cc: Kernel Hardening, LKML

On Sun, Feb 25, 2018 at 06:26:31PM -0700, Tycho Andersen wrote:
> Hi Tobin,
> 
> On Mon, Feb 19, 2018 at 01:50:47PM +1100, Tobin C. Harding wrote:
> > -# Do not parse these files under any subdirectory.
> > -my @skip_parse_files_any = ('0',
> > -			    '1',
> > -			    '2',
> > -			    'pagemap',
> > -			    'events',
> > -			    'access',
> > -			    'registers',
> > -			    'snapshot_raw',
> > -			    'trace_pipe_raw',
> > -			    'ptmx',
> > -			    'trace_pipe');
> 
> It might be worth adding 'syscall' here; the pointers listed are user
> pointers, and negative syscall args will show up like kernel pointers,
> e.g. I get this output, which is spurious:
> 
> /proc/31808/syscall: 0 0x3 0x55b107a38180 0x2000 0xffffffffffffffb0 0x55b107a302d0 0x55b107a38180 0x7fffa313b8e8 0x7ff098560d11

Nice.  Will add.

thanks,
Tobin.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 4/4] leaking_addresses: add scan_once array
  2018-02-26  1:09   ` Tycho Andersen
@ 2018-02-26  2:01     ` Tobin C. Harding
  0 siblings, 0 replies; 9+ messages in thread
From: Tobin C. Harding @ 2018-02-26  2:01 UTC (permalink / raw)
  To: Tycho Andersen; +Cc: Kernel Hardening, LKML

On Sun, Feb 25, 2018 at 06:09:53PM -0700, Tycho Andersen wrote:
> Hi Tobin,
> 
> On Mon, Feb 19, 2018 at 01:50:49PM +1100, Tobin C. Harding wrote:
> > +sub already_scanned
> > +{
> > +	my ($filename) = @_;
> > +	state %seen;
> > +
> > +	foreach (@once_only) {
> > +		if (/^$filename$/) {
> > +			if ($seen{$_} == 1) {
> 
> This should be something like,
> 
> if (($seen{$_} //= 0) == 1) {
> 
> otherwise I get a bunch of uninitialized warnings,
> 
> Use of uninitialized value in pattern match (m//) at /usr/share/perl/5.26/Math/BigInt.pm line 1199.
>     Math::BigInt::bcmp(Math::BigInt=HASH(0x55dc2f7e4580), undef) called at /usr/share/perl/5.26/Math/BigInt.pm line 1257
>     Math::BigInt::beq(Math::BigInt=HASH(0x55dc2f7e4580), undef) called at /usr/share/perl/5.26/Math/BigInt.pm line 105
>     Math::BigInt::__ANON__(Math::BigInt=HASH(0x55dc2f7e4580), undef, 1) called at ./leaking_addresses.pl line 422
>     main::already_scanned("smaps") called at ./leaking_addresses.pl line 448
>     main::skip("/proc/1/smaps") called at ./leaking_addresses.pl line 509
>     main::walk("/proc", "/sys") called at ./leaking_addresses.pl line 159
> 
> Tycho

Cool, thanks.  Will fix and re-spin.


	Tobin

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-02-26  2:01 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-19  2:50 [PATCH 0/4] leaking_addresses: simplify and optimize Tobin C. Harding
2018-02-19  2:50 ` [PATCH 1/4] leaking_addresses: do not parse binary files Tobin C. Harding
2018-02-19  2:50 ` [PATCH 2/4] leaking_addresses: simplify path skipping Tobin C. Harding
2018-02-26  1:26   ` Tycho Andersen
2018-02-26  2:00     ` Tobin C. Harding
2018-02-19  2:50 ` [PATCH 3/4] leaking_addresses: cache architecture name Tobin C. Harding
2018-02-19  2:50 ` [PATCH 4/4] leaking_addresses: add scan_once array Tobin C. Harding
2018-02-26  1:09   ` Tycho Andersen
2018-02-26  2:01     ` Tobin C. Harding

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).