linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/2 v5] SELinux: Reduce overhead of mls_level_isvalid() function call
@ 2013-06-10 17:55 Waiman Long
  2013-06-11 11:49 ` Stephen Smalley
  0 siblings, 1 reply; 6+ messages in thread
From: Waiman Long @ 2013-06-10 17:55 UTC
  To: Stephen Smalley, James Morris, Eric Paris
  Cc: Waiman Long, linux-security-module, linux-kernel,
	Chandramouleeswaran, Aswin, Norton, Scott J

v4->v5:
  - Fix scripts/checkpatch.pl warning.

v3->v4:
  - Merge the 2 separate while loops in ebitmap_contains() into
    a single one.

v2->v3:
  - Remove unused local variables i, node from mls_level_isvalid().

v1->v2:
 - Move the new ebitmap comparison logic from mls_level_isvalid()
   into the ebitmap_contains() helper function.
 - Rerun perf and performance tests on the latest v3.10-rc4 kernel.

While running the high_systime workload of the AIM7 benchmark on
a 2-socket 12-core Westmere x86-64 machine running the 3.10-rc4 kernel
(with HT on), it was found that a sizable amount of time was
spent in the SELinux code. Below is the perf-report output from "perf
record -a -s" for a test run at 1500 users:

  5.04%            ls  [kernel.kallsyms]     [k] ebitmap_get_bit
  1.96%            ls  [kernel.kallsyms]     [k] mls_level_isvalid
  1.95%            ls  [kernel.kallsyms]     [k] find_next_bit

ebitmap_get_bit() was the hottest function in the perf-report
output.  Both ebitmap_get_bit() and find_next_bit() were, in fact,
called by mls_level_isvalid(). As a result, the mls_level_isvalid()
call consumed 8.95% (5.04% + 1.96% + 1.95%) of the total CPU time of
all 24 virtual CPUs, which is quite a lot. The majority of the
mls_level_isvalid() invocations come from the socket creation
system call.

The mls_level_isvalid() function checks that all the bits set in one
ebitmap structure are also set in another one, and that the highest
set bit is no bigger than the one specified by the given policydb
data structure. It does so in a bit-by-bit manner, so if the ebitmap
structure has many bits set, the loop iterates many times.

The current code can be rewritten to use an algorithm similar to that
of the ebitmap_contains() function, with an additional check for the
highest set bit. The ebitmap_contains() function was extended with an
optional check for the highest set bit, and the mls_level_isvalid()
function was modified to call ebitmap_contains().
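
To make the difference concrete, here is a minimal user-space sketch
(not the kernel code) that contrasts the two approaches on flat arrays
of words. The names contains_bitwise(), contains_wordwise(),
highest_bit() and WORD_BITS are made up for illustration, highest_bit()
is a portable stand-in for the kernel's __fls(), and, unlike the patched
ebitmap_contains(), the sketch always enforces the max_bit limit rather
than treating 0 as "no limit":

```c
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Bit-by-bit check: visits every bit position of sub. (The kernel macro
 * skips ahead to set bits, but still does one lookup per set bit.)
 */
static int contains_bitwise(const unsigned long *super,
			    const unsigned long *sub,
			    size_t nwords, size_t max_bit)
{
	size_t bit;

	for (bit = 0; bit < nwords * WORD_BITS; bit++) {
		unsigned long mask = 1UL << (bit % WORD_BITS);

		if (!(sub[bit / WORD_BITS] & mask))
			continue;	/* bit not set in sub */
		if (bit > max_bit)
			return 0;	/* set bit beyond the limit */
		if (!(super[bit / WORD_BITS] & mask))
			return 0;	/* set in sub but not in super */
	}
	return 1;
}

/* Index of the highest set bit of a non-zero word. */
static unsigned int highest_bit(unsigned long w)
{
	unsigned int pos = 0;

	while (w >>= 1)
		pos++;
	return pos;
}

/* Word-wise check: one highest-bit test, then one mask test per word. */
static int contains_wordwise(const unsigned long *super,
			     const unsigned long *sub,
			     size_t nwords, size_t max_bit)
{
	size_t i = nwords;

	while (i > 0 && !sub[i - 1])
		i--;		/* skip trailing all-zero words */
	if (i == 0)
		return 1;	/* empty set is contained in anything */
	if ((i - 1) * WORD_BITS + highest_bit(sub[i - 1]) > max_bit)
		return 0;
	while (i > 0) {
		i--;
		if ((super[i] & sub[i]) != sub[i])
			return 0;
	}
	return 1;
}
```

Both functions return the same answer for the same inputs; the
word-wise version needs at most nwords mask comparisons plus one
highest-bit computation instead of one test per bit.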

With that change, the perf trace showed that the CPU time used dropped
to just 0.08% (ebitmap_contains + mls_level_isvalid) of the total,
which is about 100X less than before.

  0.07%            ls  [kernel.kallsyms]     [k] ebitmap_contains
  0.05%            ls  [kernel.kallsyms]     [k] ebitmap_get_bit
  0.01%            ls  [kernel.kallsyms]     [k] mls_level_isvalid
  0.01%            ls  [kernel.kallsyms]     [k] find_next_bit

The remaining ebitmap_get_bit() and find_next_bit() function calls
are made by other kernel routines, as the new mls_level_isvalid()
function no longer calls them.

This patch also improves the high_systime AIM7 benchmark result,
though the improvement is not as impressive as is suggested by the
reduction in CPU time spent in the ebitmap functions. The table below
shows the performance change on the 2-socket x86-64 system (with HT
on) mentioned above.

+--------------+---------------+----------------+-----------------+
|   Workload   | mean % change | mean % change  | mean % change   |
|              | 10-100 users  | 200-1000 users | 1100-2000 users |
+--------------+---------------+----------------+-----------------+
| high_systime |     +0.1%     |     +0.9%      |     +2.6%       |
+--------------+---------------+----------------+-----------------+

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 security/selinux/ss/ebitmap.c   |   20 ++++++++++++++++++--
 security/selinux/ss/ebitmap.h   |    2 +-
 security/selinux/ss/mls.c       |   22 +++++++---------------
 security/selinux/ss/mls_types.h |    2 +-
 4 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/security/selinux/ss/ebitmap.c b/security/selinux/ss/ebitmap.c
index 30f119b..820313a 100644
--- a/security/selinux/ss/ebitmap.c
+++ b/security/selinux/ss/ebitmap.c
@@ -213,7 +213,12 @@ netlbl_import_failure:
 }
 #endif /* CONFIG_NETLABEL */
 
-int ebitmap_contains(struct ebitmap *e1, struct ebitmap *e2)
+/*
+ * Check to see if all the bits set in e2 are also set in e1. Optionally,
+ * if last_e2bit is non-zero, the highest set bit in e2 cannot exceed
+ * last_e2bit.
+ */
+int ebitmap_contains(struct ebitmap *e1, struct ebitmap *e2, u32 last_e2bit)
 {
 	struct ebitmap_node *n1, *n2;
 	int i;
@@ -223,14 +228,25 @@ int ebitmap_contains(struct ebitmap *e1, struct ebitmap *e2)
 
 	n1 = e1->node;
 	n2 = e2->node;
+
 	while (n1 && n2 && (n1->startbit <= n2->startbit)) {
 		if (n1->startbit < n2->startbit) {
 			n1 = n1->next;
 			continue;
 		}
-		for (i = 0; i < EBITMAP_UNIT_NUMS; i++) {
+		for (i = EBITMAP_UNIT_NUMS - 1; (i >= 0) && !n2->maps[i]; )
+			i--;	/* Skip trailing NULL map entries */
+		if (last_e2bit && (i >= 0)) {
+			u32 lastsetbit = n2->startbit + i * EBITMAP_UNIT_SIZE +
+					 __fls(n2->maps[i]);
+			if (lastsetbit > last_e2bit)
+				return 0;
+		}
+
+		while (i >= 0) {
 			if ((n1->maps[i] & n2->maps[i]) != n2->maps[i])
 				return 0;
+			i--;
 		}
 
 		n1 = n1->next;
diff --git a/security/selinux/ss/ebitmap.h b/security/selinux/ss/ebitmap.h
index 922f8af..e7eb3a9 100644
--- a/security/selinux/ss/ebitmap.h
+++ b/security/selinux/ss/ebitmap.h
@@ -117,7 +117,7 @@ static inline void ebitmap_node_clr_bit(struct ebitmap_node *n,
 
 int ebitmap_cmp(struct ebitmap *e1, struct ebitmap *e2);
 int ebitmap_cpy(struct ebitmap *dst, struct ebitmap *src);
-int ebitmap_contains(struct ebitmap *e1, struct ebitmap *e2);
+int ebitmap_contains(struct ebitmap *e1, struct ebitmap *e2, u32 last_e2bit);
 int ebitmap_get_bit(struct ebitmap *e, unsigned long bit);
 int ebitmap_set_bit(struct ebitmap *e, unsigned long bit, int value);
 void ebitmap_destroy(struct ebitmap *e);
diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c
index 40de8d3..c85bc1e 100644
--- a/security/selinux/ss/mls.c
+++ b/security/selinux/ss/mls.c
@@ -160,8 +160,6 @@ void mls_sid_to_context(struct context *context,
 int mls_level_isvalid(struct policydb *p, struct mls_level *l)
 {
 	struct level_datum *levdatum;
-	struct ebitmap_node *node;
-	int i;
 
 	if (!l->sens || l->sens > p->p_levels.nprim)
 		return 0;
@@ -170,19 +168,13 @@ int mls_level_isvalid(struct policydb *p, struct mls_level *l)
 	if (!levdatum)
 		return 0;
 
-	ebitmap_for_each_positive_bit(&l->cat, node, i) {
-		if (i > p->p_cats.nprim)
-			return 0;
-		if (!ebitmap_get_bit(&levdatum->level->cat, i)) {
-			/*
-			 * Category may not be associated with
-			 * sensitivity.
-			 */
-			return 0;
-		}
-	}
-
-	return 1;
+	/*
+	 * Return 1 iff all the bits set in l->cat are also set in
+	 * levdatum->level->cat and no bit in l->cat is larger than
+	 * p->p_cats.nprim.
+	 */
+	return ebitmap_contains(&levdatum->level->cat, &l->cat,
+				p->p_cats.nprim);
 }
 
 int mls_range_isvalid(struct policydb *p, struct mls_range *r)
diff --git a/security/selinux/ss/mls_types.h b/security/selinux/ss/mls_types.h
index 03bed52..e936487 100644
--- a/security/selinux/ss/mls_types.h
+++ b/security/selinux/ss/mls_types.h
@@ -35,7 +35,7 @@ static inline int mls_level_eq(struct mls_level *l1, struct mls_level *l2)
 static inline int mls_level_dom(struct mls_level *l1, struct mls_level *l2)
 {
 	return ((l1->sens >= l2->sens) &&
-		ebitmap_contains(&l1->cat, &l2->cat));
+		ebitmap_contains(&l1->cat, &l2->cat, 0));
 }
 
 #define mls_level_incomp(l1, l2) \
-- 
1.7.1
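
For readers who want to experiment outside the kernel, the patched loop
can be modeled in user space roughly as follows. This is a sketch under
assumptions, not the kernel source: struct node, UNIT_NUMS (fixed at 4
here), UNIT_SIZE, top_bit() and sparse_contains() are hypothetical
stand-ins for struct ebitmap_node, EBITMAP_UNIT_NUMS, EBITMAP_UNIT_SIZE,
__fls() and ebitmap_contains(), and the handling after the loop
(returning 0 when e2 nodes are left over) is assumed, since that part of
the function lies outside the diff context:

```c
#include <limits.h>
#include <stddef.h>

#define UNIT_NUMS 4	/* assumed value, for illustration only */
#define UNIT_SIZE ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Simplified model of one node in a sorted, sparse bitmap list. */
struct node {
	struct node *next;
	unsigned long maps[UNIT_NUMS];
	unsigned int startbit;	/* index of the first bit in maps[0] */
};

/* Index of the highest set bit of a non-zero word. */
static unsigned int top_bit(unsigned long w)
{
	unsigned int pos = 0;

	while (w >>= 1)
		pos++;
	return pos;
}

/*
 * Return 1 if every bit set in list n2 is also set in list n1 and,
 * when last_e2bit is non-zero, no set bit in n2 exceeds last_e2bit.
 */
static int sparse_contains(const struct node *n1, const struct node *n2,
			   unsigned int last_e2bit)
{
	int i;

	while (n1 && n2 && n1->startbit <= n2->startbit) {
		if (n1->startbit < n2->startbit) {
			n1 = n1->next;	/* n1 node has no n2 counterpart */
			continue;
		}
		/* Find the highest non-zero word of this n2 node. */
		for (i = UNIT_NUMS - 1; i >= 0 && !n2->maps[i]; )
			i--;
		if (last_e2bit && i >= 0) {
			unsigned int last = n2->startbit +
					    (unsigned int)i * UNIT_SIZE +
					    top_bit(n2->maps[i]);
			if (last > last_e2bit)
				return 0;
		}
		/* Compare the remaining words one mask at a time. */
		while (i >= 0) {
			if ((n1->maps[i] & n2->maps[i]) != n2->maps[i])
				return 0;
			i--;
		}
		n1 = n1->next;
		n2 = n2->next;
	}
	/* Assumed tail: a leftover n2 node has no matching n1 node. */
	return n2 ? 0 : 1;
}
```

Passing 0 as the last argument skips the highest-bit check, as the
mls_level_dom() caller does; passing a limit such as p->p_cats.nprim,
as mls_level_isvalid() does, enables it.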



* Re: [PATCH 1/2 v5] SELinux: Reduce overhead of mls_level_isvalid() function call
  2013-06-10 17:55 [PATCH 1/2 v5] SELinux: Reduce overhead of mls_level_isvalid() function call Waiman Long
@ 2013-06-11 11:49 ` Stephen Smalley
  2013-07-05 17:10   ` Waiman Long
  0 siblings, 1 reply; 6+ messages in thread
From: Stephen Smalley @ 2013-06-11 11:49 UTC
  To: Waiman Long
  Cc: James Morris, Eric Paris, linux-security-module, linux-kernel,
	Chandramouleeswaran, Aswin, Norton, Scott J

On 06/10/2013 01:55 PM, Waiman Long wrote:
> Signed-off-by: Waiman Long <Waiman.Long@hp.com>

Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>




* Re: [PATCH 1/2 v5] SELinux: Reduce overhead of mls_level_isvalid() function call
  2013-06-11 11:49 ` Stephen Smalley
@ 2013-07-05 17:10   ` Waiman Long
  2013-07-08 14:09     ` Stephen Smalley
  2013-07-08 16:30     ` Paul Moore
  0 siblings, 2 replies; 6+ messages in thread
From: Waiman Long @ 2013-07-05 17:10 UTC
  To: Stephen Smalley
  Cc: James Morris, Eric Paris, linux-security-module, linux-kernel,
	Chandramouleeswaran, Aswin, Norton, Scott J

On 06/11/2013 07:49 AM, Stephen Smalley wrote:
> On 06/10/2013 01:55 PM, Waiman Long wrote:
>> Signed-off-by: Waiman Long <Waiman.Long@hp.com>
>
> Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
>

Thanks for the Ack. Will that patch go into v3.11?

Regards,
Longman


* Re: [PATCH 1/2 v5] SELinux: Reduce overhead of mls_level_isvalid() function call
  2013-07-05 17:10   ` Waiman Long
@ 2013-07-08 14:09     ` Stephen Smalley
  2013-07-08 16:30     ` Paul Moore
  1 sibling, 0 replies; 6+ messages in thread
From: Stephen Smalley @ 2013-07-08 14:09 UTC
  To: Waiman Long
  Cc: James Morris, Eric Paris, linux-security-module, linux-kernel,
	Chandramouleeswaran, Aswin, Norton, Scott J

On 07/05/2013 01:10 PM, Waiman Long wrote:
> On 06/11/2013 07:49 AM, Stephen Smalley wrote:
>> On 06/10/2013 01:55 PM, Waiman Long wrote:
>>> Signed-off-by: Waiman Long <Waiman.Long@hp.com>
>>
>> Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
>>
>
> Thanks for the Ack. Will that patch go into v3.11?

I hope so, but that's up to Eric Paris, who maintains the kernel tree 
for SELinux changes these days.




* Re: [PATCH 1/2 v5] SELinux: Reduce overhead of mls_level_isvalid() function call
  2013-07-05 17:10   ` Waiman Long
  2013-07-08 14:09     ` Stephen Smalley
@ 2013-07-08 16:30     ` Paul Moore
  2013-07-08 20:05       ` Waiman Long
  1 sibling, 1 reply; 6+ messages in thread
From: Paul Moore @ 2013-07-08 16:30 UTC
  To: Waiman Long, Eric Paris
  Cc: Stephen Smalley, James Morris, linux-security-module,
	linux-kernel, Chandramouleeswaran, Aswin, Norton, Scott J,
	selinux

On Friday, July 05, 2013 01:10:32 PM Waiman Long wrote:
> On 06/11/2013 07:49 AM, Stephen Smalley wrote:
> > On 06/10/2013 01:55 PM, Waiman Long wrote:

...

> >> Signed-off-by: Waiman Long <Waiman.Long@hp.com>
> > 
> > Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
> 
> Thanks for the Ack. Will that patch go into v3.11?

[NOTE: I added the SELinux list to the CC line; for future reference, be sure
to send your SELinux patches there.]

Your patch looked reasonable to me and Stephen ACK'd it, so I went ahead and
pulled the 1/2 patch into my lblnet-next tree.  It is probably an abuse of the
system, but as you noted in the description, it does have an impact on
socket creation so it isn't completely unrelated ;)

If you don't want me to include your patch let me know and I'll drop it.

Now, being in my lblnet-next tree means pretty much nothing in terms of
actually getting upstream, but it will at least get the patch into tomorrow's
spin of the linux-next tree.  I think that is a good thing as it allows you to
say "my patch has been in linux-next for the past X weeks!" whenever Eric gets
around to merging patches again.

Here are the details for the lblnet-next tree:

 * git://git.infradead.org/users/pcmoore/lblnet-2.6_next
 * http://git.infradead.org/users/pcmoore/lblnet-2.6_next

Also, a snapshot of what currently resides there:

Paul Moore (9):
      selinux: fix problems in netnode when BUG() is compiled out
      lsm: split the xfrm_state_alloc_security() hook implementation
      selinux: cleanup and consolidate the XFRM alloc/clone/delete/free code
      selinux: cleanup selinux_xfrm_policy_lookup() ... 
      selinux: cleanup selinux_xfrm_sock_rcv_skb() ... 
      selinux: cleanup some comment and whitespace issues in the XFRM code
      selinux: cleanup selinux_xfrm_decode_session()
      selinux: cleanup the XFRM header
      selinux: remove the BUG_ON() from selinux_skb_xfrm_sid()

Waiman Long (1):
      SELinux: Reduce overhead of mls_level_isvalid() function call

 include/linux/security.h        |   26 ++
 security/capability.c           |   15 +
 security/security.c             |   13 -
 security/selinux/hooks.c        |   11 +
 security/selinux/include/xfrm.h |   45 ++--
 security/selinux/netnode.c      |    2
 security/selinux/ss/ebitmap.c   |   20 ++
 security/selinux/ss/ebitmap.h   |    2
 security/selinux/ss/mls.c       |   22 +-
 security/selinux/ss/mls_types.h |    2
 security/selinux/xfrm.c         |  453 ++++++++++++++++---------------------
 11 files changed, 291 insertions(+), 320 deletions(-)

-- 
paul moore
www.paul-moore.com



* Re: [PATCH 1/2 v5] SELinux: Reduce overhead of mls_level_isvalid() function call
  2013-07-08 16:30     ` Paul Moore
@ 2013-07-08 20:05       ` Waiman Long
  0 siblings, 0 replies; 6+ messages in thread
From: Waiman Long @ 2013-07-08 20:05 UTC
  To: Paul Moore
  Cc: Eric Paris, Stephen Smalley, James Morris, linux-security-module,
	linux-kernel, Chandramouleeswaran, Aswin, Norton, Scott J,
	selinux

On 07/08/2013 12:30 PM, Paul Moore wrote:
> On Friday, July 05, 2013 01:10:32 PM Waiman Long wrote:
>> On 06/11/2013 07:49 AM, Stephen Smalley wrote:
>>> On 06/10/2013 01:55 PM, Waiman Long wrote:
> ...
>
>>>> Signed-off-by: Waiman Long <Waiman.Long@hp.com>
>>> Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
>> Thanks for the Ack. Will that patch go into v3.11?
> [NOTE: I added the SELinux list to the CC line; for future reference, be sure
> to send your SELinux patches there.]
>
> Your patch looked reasonable to me and Stephen ACK'd it, so I went ahead and
> pulled the 1/2 patch into my lblnet-next tree.  It is probably an abuse of the
> system, but as you noted in the description, it does have an impact on
> socket creation so it isn't completely unrelated ;)
>
> If you don't want me to include your patch let me know and I'll drop it.

Sure. I would like to have my patch included. Thanks for letting me know.

Regards,
Longman


