All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kees Cook <keescook@chromium.org>
To: John Wood <john.wood@gmx.com>
Cc: Jann Horn <jannh@google.com>,
	Randy Dunlap <rdunlap@infradead.org>,
	Jonathan Corbet <corbet@lwn.net>,
	James Morris <jmorris@namei.org>, Shuah Khan <shuah@kernel.org>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Andi Kleen <ak@linux.intel.com>,
	kernel test robot <oliver.sang@intel.com>,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	linux-kselftest@vger.kernel.org,
	kernel-hardening@lists.openwall.com
Subject: Re: [PATCH v6 7/8] Documentation: Add documentation for the Brute LSM
Date: Wed, 17 Mar 2021 21:10:05 -0700	[thread overview]
Message-ID: <202103172108.404F9B6ED2@keescook> (raw)
In-Reply-To: <20210307113031.11671-8-john.wood@gmx.com>

On Sun, Mar 07, 2021 at 12:30:30PM +0100, John Wood wrote:
> Add some info detailing what is the Brute LSM, its motivation, weak
> points of existing implementations, proposed solutions, enabling,
> disabling and self-tests.
> 
> Signed-off-by: John Wood <john.wood@gmx.com>
> ---
>  Documentation/admin-guide/LSM/Brute.rst | 278 ++++++++++++++++++++++++
>  Documentation/admin-guide/LSM/index.rst |   1 +
>  security/brute/Kconfig                  |   3 +-
>  3 files changed, 281 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/admin-guide/LSM/Brute.rst
> 
> diff --git a/Documentation/admin-guide/LSM/Brute.rst b/Documentation/admin-guide/LSM/Brute.rst
> new file mode 100644
> index 000000000000..ca80aef9aa67
> --- /dev/null
> +++ b/Documentation/admin-guide/LSM/Brute.rst
> @@ -0,0 +1,278 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +===========================================================
> +Brute: Fork brute force attack detection and mitigation LSM
> +===========================================================
> +
> +Attacks against vulnerable userspace applications with the purpose to break ASLR
> +or bypass canaries traditionally use some level of brute force with the help of
> +the fork system call. This is possible since when creating a new process using
> +fork its memory contents are the same as those of the parent process (the
> +process that called the fork system call). So, the attacker can test the memory
> +infinite times to find the correct memory values or the correct memory addresses
> +without worrying about crashing the application.
> +
> +Based on the above scenario it would be nice to have this detected and
> +mitigated, and this is the goal of this implementation. Specifically the
> +following attacks are expected to be detected:
> +
> +1.- Launching (fork()/exec()) a setuid/setgid process repeatedly until a
> +    desirable memory layout is got (e.g. Stack Clash).
> +2.- Connecting to an exec()ing network daemon (e.g. xinetd) repeatedly until a
> +    desirable memory layout is got (e.g. what CTFs do for simple network
> +    service).
> +3.- Launching processes without exec() (e.g. Android Zygote) and exposing state
> +    to attack a sibling.
> +4.- Connecting to a fork()ing network daemon (e.g. apache) repeatedly until the
> +    previously shared memory layout of all the other children is exposed (e.g.
> +    kind of related to HeartBleed).
> +
> +In each case, a privilege boundary has been crossed:
> +
> +Case 1: setuid/setgid process
> +Case 2: network to local
> +Case 3: privilege changes
> +Case 4: network to local
> +
> +So, what really needs to be detected are fork/exec brute force attacks that
> +cross any of the commented bounds.
> +
> +
> +Other implementations
> +=====================
> +
> +The public version of grsecurity, as a summary, is based on the idea of delaying
> +the fork system call if a child died due to some fatal signal (SIGSEGV, SIGBUS,
> +SIGKILL or SIGILL). This has some issues:
> +
> +Bad practices
> +-------------
> +
> +Adding delays to the kernel is, in general, a bad idea.
> +
> +Scenarios not detected (false negatives)
> +----------------------------------------
> +
> +This protection acts only when the fork system call is called after a child has
> +crashed. So, it would still be possible for an attacker to fork a big amount of
> +children (in the order of thousands), then probe all of them, and finally wait
> +the protection time before repeating the steps.
> +
> +Moreover, this method is based on the idea that the protection doesn't act if
> +the parent crashes. So, it would still be possible for an attacker to fork a
> +process and probe itself. Then, fork the child process and probe itself again.
> +This way, these steps can be repeated infinite times without any mitigation.
> +
> +Scenarios detected (false positives)
> +------------------------------------
> +
> +Scenarios where an application rarely fails for reasons unrelated to a real
> +attack.
> +
> +
> +This implementation
> +===================
> +
> +The main idea behind this implementation is to improve the existing ones
> +focusing on the weak points annotated before. Basically, the adopted solution is
> +to detect a fast crash rate instead of only one simple crash and to detect both
> +the crash of parent and child processes. Also, fine tune the detection focusing
> +on privilege boundary crossing. And finally, as a mitigation method, kill all
> +the offending tasks involved in the attack instead of using delays.
> +
> +To achieve this goal, and going into more details, this implementation is based
> +on the use of some statistical data shared across all the processes that can
> +have the same memory contents. Or in other words, a statistical data shared
> +between all the fork hierarchy processes after an execve system call.
> +
> +The purpose of these statistics is, basically, collect all the necessary info
> +to compute the application crash period in order to detect an attack. This crash
> +period is the time between the execve system call and the first fault or the
> +time between two consecutive faults, but this has a drawback. If an application
> +crashes twice in a short period of time for some reason unrelated to a real
> +attack, a false positive will be triggered. To avoid this scenario the
> +exponential moving average (EMA) is used. This way, the application crash period
> +will be a value that is not prone to change due to spurious data and follows the
> +real crash period.
> +
> +To detect a brute force attack it is necessary that the statistics shared by all
> +the fork hierarchy processes be updated in every fatal crash and the most
> +important data to update is the application crash period.
> +
> +These statistics are hold by the brute_stats struct.
> +
> +struct brute_cred {
> +	kuid_t uid;
> +	kgid_t gid;
> +	kuid_t suid;
> +	kgid_t sgid;
> +	kuid_t euid;
> +	kgid_t egid;
> +	kuid_t fsuid;
> +	kgid_t fsgid;
> +};
> +
> +struct brute_stats {
> +	spinlock_t lock;
> +	refcount_t refc;
> +	unsigned char faults;
> +	u64 jiffies;
> +	u64 period;
> +	struct brute_cred saved_cred;
> +	unsigned char network : 1;
> +	unsigned char bounds_crossed : 1;
> +};

Instead of open-coding this, just use the kerndoc references you've
already built in the .c files:

.. kernel-doc:: security/brute/brute.c

> +
> +This is a fixed sized struct, so the memory usage will be based on the current
> +number of processes exec()ing. The previous sentence is true since in every fork
> +system call the parent's statistics are shared with the child process and in
> +every execve system call a new brute_stats struct is allocated. So, only one
> +brute_stats struct is used for every fork hierarchy (hierarchy of processes from
> +the execve system call).
> +
> +There are two types of brute force attacks that need to be detected. The first
> +one is an attack that happens through the fork system call and the second one is
> +an attack that happens through the execve system call. The first type uses the
> +statistics shared by all the fork hierarchy processes, but the second type
> +cannot use this statistical data due to these statistics dissapear when the
> +involved tasks finished. In this last scenario the attack info should be tracked
> +by the statistics of a higher fork hierarchy (the hierarchy that contains the
> +process that forks before the execve system call).
> +
> +Moreover, these two attack types have two variants. A slow brute force attack
> +that is detected if a maximum number of faults per fork hierarchy is reached and
> +a fast brute force attack that is detected if the application crash period falls
> +below a certain threshold.
> +
> +Once an attack has been detected, this is mitigated killing all the offending
> +tasks involved. Or in other words, once an attack has been detected, this is
> +mitigated killing all the processes that share the same statistics (the stats
> +that show an slow or fast brute force attack).
> +
> +Fine tuning the attack detection
> +--------------------------------
> +
> +To avoid false positives during the attack detection it is necessary to narrow
> +the possible cases. To do so, and based on the threat scenarios that we want to
> +detect, this implementation also focuses on the crossing of privilege bounds.
> +
> +To be precise, only the following privilege bounds are taken into account:
> +
> +1.- setuid/setgid process
> +2.- network to local
> +3.- privilege changes
> +
> +Moreover, only the fatal signals delivered by the kernel are taken into account
> +avoiding the fatal signals sent by userspace applications (with the exception of
> +the SIGABRT user signal since this is used by glibc for stack canary, malloc,
> +etc. failures, which may indicate that a mitigation has been triggered).
> +
> +Exponential moving average (EMA)
> +--------------------------------
> +
> +This kind of average defines a weight (between 0 and 1) for the new value to add
> +and applies the remainder of the weight to the current average value. This way,
> +some spurious data will not excessively modify the average and only if the new
> +values are persistent, the moving average will tend towards them.
> +
> +Mathematically the application crash period's EMA can be expressed as follows:
> +
> +period_ema = period * weight + period_ema * (1 - weight)
> +
> +Related to the attack detection, the EMA must guarantee that not many crashes
> +are needed. To demonstrate this, the scenario where an application has been
> +running without any crashes for a month will be used.
> +
> +The period's EMA can be written now as:
> +
> +period_ema[i] = period[i] * weight + period_ema[i - 1] * (1 - weight)
> +
> +If the new crash periods have insignificant values related to the first crash
> +period (a month in this case), the formula can be rewritten as:
> +
> +period_ema[i] = period_ema[i - 1] * (1 - weight)
> +
> +And by extension:
> +
> +period_ema[i - 1] = period_ema[i - 2] * (1 - weight)
> +period_ema[i - 2] = period_ema[i - 3] * (1 - weight)
> +period_ema[i - 3] = period_ema[i - 4] * (1 - weight)
> +
> +So, if the substitution is made:
> +
> +period_ema[i] = period_ema[i - 1] * (1 - weight)
> +period_ema[i] = period_ema[i - 2] * pow((1 - weight) , 2)
> +period_ema[i] = period_ema[i - 3] * pow((1 - weight) , 3)
> +period_ema[i] = period_ema[i - 4] * pow((1 - weight) , 4)
> +
> +And in a more generic form:
> +
> +period_ema[i] = period_ema[i - n] * pow((1 - weight) , n)
> +
> +Where n represents the number of iterations to obtain an EMA value. Or in other
> +words, the number of crashes to detect an attack.
> +
> +So, if we isolate the number of crashes:
> +
> +period_ema[i] / period_ema[i - n] = pow((1 - weight), n)
> +log(period_ema[i] / period_ema[i - n]) = log(pow((1 - weight), n))
> +log(period_ema[i] / period_ema[i - n]) = n * log(1 - weight)
> +n = log(period_ema[i] / period_ema[i - n]) / log(1 - weight)
> +
> +Then, in the commented scenario (an application has been running without any
> +crashes for a month), the approximate number of crashes to detect an attack
> +(using the implementation values for the weight and the crash period threshold)
> +is:
> +
> +weight = 7 / 10
> +crash_period_threshold = 30 seconds
> +
> +n = log(crash_period_threshold / seconds_per_month) / log(1 - weight)
> +n = log(30 / (30 * 24 * 3600)) / log(1 - 0.7)
> +n = 9.44
> +
> +So, with 10 crashes for this scenario an attack will be detected. If these steps
> +are repeated for different scenarios and the results are collected:
> +
> +1 month without any crashes ----> 9.44 crashes to detect an attack
> +1 year without any crashes -----> 11.50 crashes to detect an attack
> +10 years without any crashes ---> 13.42 crashes to detect an attack
> +
> +However, this computation has a drawback. The first data added to the EMA not
> +obtains a real average showing a trend. So the solution is simple, the EMA needs
> +a minimum number of data to be able to be interpreted. This way, the case where
> +a few first faults are fast enough followed by no crashes is avoided.
> +
> +Per system enabling/disabling
> +-----------------------------
> +
> +This feature can be enabled at build time using the CONFIG_SECURITY_FORK_BRUTE
> +option or using the visual config application under the following menu:
> +
> +Security options  --->  Fork brute force attack detection and mitigation
> +
> +Also, at boot time, this feature can be disable too, by changing the "lsm=" boot
> +parameter.
> +
> +Kernel selftests
> +----------------
> +
> +To validate all the expectations about this implementation, there is a set of
> +selftests. This tests cover fork/exec brute force attacks crossing the following
> +privilege boundaries:
> +
> +1.- setuid process
> +2.- privilege changes
> +3.- network to local
> +
> +Also, there are some tests to check that fork/exec brute force attacks without
> +crossing any privilege boundariy already commented doesn't trigger the detection
> +and mitigation stage.
> +
> +To build the tests:
> +make -C tools/testing/selftests/ TARGETS=brute
> +
> +To run the tests:
> +make -C tools/testing/selftests TARGETS=brute run_tests
> +
> +To package the tests:
> +make -C tools/testing/selftests TARGETS=brute gen_tar
> diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
> index a6ba95fbaa9f..1f68982bb330 100644
> --- a/Documentation/admin-guide/LSM/index.rst
> +++ b/Documentation/admin-guide/LSM/index.rst
> @@ -41,6 +41,7 @@ subdirectories.
>     :maxdepth: 1
> 
>     apparmor
> +   Brute
>     LoadPin
>     SELinux
>     Smack
> diff --git a/security/brute/Kconfig b/security/brute/Kconfig
> index 1bd2df1e2dec..334d7e88d27f 100644
> --- a/security/brute/Kconfig
> +++ b/security/brute/Kconfig
> @@ -7,6 +7,7 @@ config SECURITY_FORK_BRUTE
>  	  vulnerable userspace processes. The detection method is based on
>  	  the application crash period and as a mitigation procedure all the
>  	  offending tasks are killed. Like capabilities, this security module
> -	  stacks with other LSMs.
> +	  stacks with other LSMs. Further information can be found in
> +	  Documentation/admin-guide/LSM/Brute.rst.
> 
>  	  If you are unsure how to answer this question, answer N.
> --
> 2.25.1
> 

-- 
Kees Cook

  reply	other threads:[~2021-03-18  4:10 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-07 11:30 [PATCH v6 0/8] Fork brute force attack mitigation John Wood
2021-03-07 11:30 ` [PATCH v6 1/8] security: Add LSM hook at the point where a task gets a fatal signal John Wood
2021-03-18  1:22   ` Kees Cook
2021-03-07 11:30 ` [PATCH v6 2/8] security/brute: Define a LSM and manage statistical data John Wood
2021-03-18  2:00   ` Kees Cook
2021-03-20 15:01     ` John Wood
2021-03-21 17:37       ` Kees Cook
2021-03-07 11:30 ` [PATCH v6 3/8] securtiy/brute: Detect a brute force attack John Wood
2021-03-18  2:57   ` Kees Cook
2021-03-20 15:34     ` John Wood
2021-03-21 18:28       ` Kees Cook
2021-03-21 15:01     ` John Wood
2021-03-21 18:45       ` Kees Cook
2021-03-22 18:32         ` John Wood
2021-03-07 11:30 ` [PATCH v6 4/8] security/brute: Fine tuning the attack detection John Wood
2021-03-18  4:00   ` Kees Cook
2021-03-20 15:46     ` John Wood
2021-03-21 18:01       ` Kees Cook
2021-03-07 11:30 ` [PATCH v6 5/8] security/brute: Mitigate a brute force attack John Wood
2021-03-18  4:04   ` Kees Cook
2021-03-20 15:48     ` John Wood
2021-03-21 18:06       ` Kees Cook
2021-03-07 11:30 ` [PATCH v6 6/8] selftests/brute: Add tests for the Brute LSM John Wood
2021-03-18  4:08   ` Kees Cook
2021-03-20 15:49     ` John Wood
2021-03-07 11:30 ` [PATCH v6 7/8] Documentation: Add documentation " John Wood
2021-03-18  4:10   ` Kees Cook [this message]
2021-03-20 15:50     ` John Wood
2021-03-21 18:50   ` Jonathan Corbet
2021-03-26 15:41     ` John Wood
2021-03-07 11:30 ` [PATCH v6 8/8] MAINTAINERS: Add a new entry " John Wood

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=202103172108.404F9B6ED2@keescook \
    --to=keescook@chromium.org \
    --cc=ak@linux.intel.com \
    --cc=corbet@lwn.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=jannh@google.com \
    --cc=jmorris@namei.org \
    --cc=john.wood@gmx.com \
    --cc=kernel-hardening@lists.openwall.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=oliver.sang@intel.com \
    --cc=rdunlap@infradead.org \
    --cc=serge@hallyn.com \
    --cc=shuah@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.