* Resource usage of CIL compared to HLL
@ 2020-08-17 13:46 bauen1
  2020-08-17 17:49 ` James Carter
  0 siblings, 1 reply; 4+ messages in thread
From: bauen1 @ 2020-08-17 13:46 UTC
  To: selinux

Hi,

I usually test all my patches against refpolicy and my own CIL policy (https://gitlab.com/bauen1/bauen1-policy/) on small VMs with roughly 1 vCPU, 512 MB of memory and a few GB of disk space (comparable to the cheapest VPS plan you can get and still run reasonable stuff on).
Recently I've started hitting the memory limit while building my CIL policy using semodule / secilc.

I've found that secilc can easily hit ~400 MB of memory usage while building dssp3, or ~260 MB while building my policy.
semodule invokes the same functions as secilc to build the policy but requires somewhere between 100 MB and 200 MB for whatever it is doing.
Running semodule against a normal refpolicy installation only requires ~160 MB of memory in total.
This means that installing refpolicy on my VMs is not an issue, but even my CIL policy, which is far from complete, will easily OOM the machine.
While adding additional memory isn't really an issue, I'm a bit annoyed that building an incomplete CIL policy requires ~2.8 times the memory that a complete refpolicy requires.
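
For anyone who wants to collect comparable numbers, here is a minimal sketch of one way to record the peak memory of a build command. It is not how the figures above were obtained; the helper name `peak_rss_kib` is made up for illustration, and it assumes Linux, where `ru_maxrss` is reported in KiB.

```python
import resource
import subprocess
import sys


def peak_rss_kib(argv):
    """Run argv to completion and return the largest peak resident set
    size seen among this process's waited-for children, in KiB
    (assumption: Linux units; macOS reports bytes instead)."""
    subprocess.run(argv, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


# Harmless demo command. To measure a real build, pass something like
# ["secilc", "policy.cil"], where "policy.cil" is a placeholder path.
demo_kib = peak_rss_kib([sys.executable, "-c", "pass"])
print(f"peak RSS: {demo_kib / 1024:.1f} MiB")
```

Because `RUSAGE_CHILDREN` reports a running maximum over all children, each command should be measured from a fresh interpreter.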

After a bit of testing using valgrind, I believe this is mostly due to the way CIL handles blockinherit by duplicating the entire AST of the original block into the target.
This works very well and is very simple, but also doesn't scale very well.
For example my policy has a few "base templates", e.g. `file.template` that contain a lot of general use macros, e.g. `relabel_files`, `manage_blk_files`. A similar approach is taken by grift in dssp3.
All of these macros (~130) are copied to every block containing a file type (only ~470) resulting in a lot of duplicate memory.
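
To put the approximate counts above into perspective, here is a toy model of copy-on-inherit versus sharing a single template. The dictionaries are stand-ins invented for this sketch; the real AST nodes are C structures inside libsepol and are considerably larger.

```python
import copy

N_MACROS = 130  # approximate macro count from the message above
N_BLOCKS = 470  # approximate number of inheriting blocks from the message above

# Toy stand-in for the macro definitions held by one template block.
template = [{"macro": f"macro_{i}", "body": ["allow", "..."]}
            for i in range(N_MACROS)]

# Copy-on-inherit: every block receives its own deep copy of the template.
copied = {f"block_{b}": copy.deepcopy(template) for b in range(N_BLOCKS)}

# Share-on-inherit: every block refers to the one and only template.
shared = {f"block_{b}": template for b in range(N_BLOCKS)}


def distinct_macros(blocks):
    """Count distinct macro objects reachable from all blocks."""
    return len({id(m) for macros in blocks.values() for m in macros})


copies = distinct_macros(copied)
refs = distinct_macros(shared)
print(copies)  # 61100 (130 * 470)
print(refs)    # 130
```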

Is it even possible to change libsepol, e.g. to use copy-on-write (COW) for copy_ast_tree (and similar functions), or is the current copying behavior required, e.g. for `in`? Or would a change not be worth the additional complexity?
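
As a rough illustration of the copy-on-write idea only (this is not libsepol code, and the `CowRef` class is hypothetical), a handle can share a subtree until the first modification and make a private copy at that point:

```python
import copy


class CowRef:
    """Handle to a subtree that stays shared until the first write."""

    def __init__(self, tree):
        self._tree = tree
        self._owned = False

    def read(self):
        # Reads never copy; all handles see the shared tree.
        return self._tree

    def write(self):
        # First write through this handle: take a private deep copy.
        if not self._owned:
            self._tree = copy.deepcopy(self._tree)
            self._owned = True
        return self._tree


template = {"macros": ["relabel_files", "manage_blk_files"]}
a = CowRef(template)
b = CowRef(template)

shared_before = a.read() is b.read()

# Hypothetical `in`-style addition that touches only block b.
b.write()["macros"].append("extra_macro")

print(shared_before)               # True
print(a.read() is template)        # True
print(len(a.read()["macros"]))     # 2
print(len(b.read()["macros"]))     # 3
```

The sketch only protects writes made through a handle; anything that modifies the template directly would still be visible to every block that has not yet copied it.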

-- 
bauen1
https://dn42.bauen1.xyz/


* Re: Resource usage of CIL compared to HLL
  2020-08-17 13:46 Resource usage of CIL compared to HLL bauen1
@ 2020-08-17 17:49 ` James Carter
  2020-08-17 18:08   ` Dominick Grift
  0 siblings, 1 reply; 4+ messages in thread
From: James Carter @ 2020-08-17 17:49 UTC
  To: bauen1; +Cc: selinux

On Mon, Aug 17, 2020 at 9:48 AM bauen1 <j2468h@googlemail.com> wrote:
> After a bit of testing using valgrind, I believe this is mostly due to the way CIL handles blockinherit by duplicating the entire AST of the original block into the target.
> This works very well and is very simple, but also doesn't scale very well.
> For example my policy has a few "base templates", e.g. `file.template` that contain a lot of general use macros, e.g. `relabel_files`, `manage_blk_files`. A similar approach is taken by grift in dssp3.
> All of these macros (~130) are copied to every block containing a file type (only ~470) resulting in a lot of duplicate memory.
>
> Is it even possible to change libsepol, e.g. to use copy-on-write (COW) for copy_ast_tree (and similar functions), or is the current copying behavior required, e.g. for `in`? Or would a change not be worth the additional complexity?
>

Long before we developed CIL, I had experimented with parsing Refpolicy
with a Lua program that I created. I was really worried about memory
usage when developing that, so I did not copy anything. When it was
proposed to copy the AST for CIL I was sceptical and reworked my Lua
program to see what the impact would be. It turned out to be easier to
do, faster, and did not require any more memory. The memory lost due
to copying the AST was made up by not having as many symbol tables.

If a lot of the macros that are being inherited are not used, then it
might be worthwhile to add a step to remove unused macros. Of course,
to really reduce memory usage, only the macros that are going to be
used should be copied, but I don't think that would be easy to do.
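
A minimal sketch of what such a removal step could look like, written as a reachability pass over a toy call graph. The function name `prune_unused`, the data layout, and the macro names in the example are assumptions for illustration and do not correspond to anything in libsepol.

```python
def prune_unused(macros, root_calls):
    """Keep only macros reachable from calls made outside any macro.

    macros:     dict mapping macro name -> set of macro names its body calls
    root_calls: iterable of macro names called from ordinary policy statements
    """
    keep = set()
    stack = [name for name in root_calls if name in macros]
    while stack:
        name = stack.pop()
        if name in keep:
            continue
        keep.add(name)
        # A used macro keeps every macro that its own body calls.
        stack.extend(c for c in macros[name] if c in macros and c not in keep)
    return {name: macros[name] for name in keep}


# Hypothetical macro names, for illustration only.
macros = {
    "file.relabel_files": {"file.getattr_files"},
    "file.getattr_files": set(),
    "file.manage_blk_files": set(),
}
kept = prune_unused(macros, ["file.relabel_files"])
print(sorted(kept))  # ['file.getattr_files', 'file.relabel_files']
```

This runs after copying, so it would only lower the final footprint, not the peak reached while the copies exist.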

I will admit that I am not a big user of inheritance. What is gained
from inheriting all of the macros like that?

Thanks for the report. I will take a look to see if there might be a
fairly easy way to improve the situation.
Jim


* Re: Resource usage of CIL compared to HLL
  2020-08-17 17:49 ` James Carter
@ 2020-08-17 18:08   ` Dominick Grift
  2020-08-17 18:51     ` Dominick Grift
  0 siblings, 1 reply; 4+ messages in thread
From: Dominick Grift @ 2020-08-17 18:08 UTC
  To: James Carter; +Cc: bauen1, selinux

James Carter <jwcart2@gmail.com> writes:

> I will admit that I am not a big user of inheritance. What is gained
> from inheriting all of the macros like that?

Consistency and comprehensiveness.

In refpolicy-based policy it's tempting to quickly copy and paste macros
when you need them, leading to all kinds of inconsistencies, ranging from
descriptions that are wrong because one forgot to edit them after a
copy-paste to inconsistent macro names, because it can be hard to be
consistent with naming. Consistency is very important, as there is almost
nothing as annoying as guessing an interface/macro name wrong time after
time because of an inconsistency.

Having a comprehensive collection of inherited macros means that most of
the time you don't have to deal with or worry about creating macros. It
might also come in handy later if at some point a CIL-aware
audit2allow -R type functionality arrives.

That was at one point a pain point with refpolicy, I believe, where
audit2allow -R wouldn't suggest an interface to use because the
interface did not exist. By predefining all macros you ensure that
audit2allow -R finds something.

>
> Thanks for the report. I will take a look to see if there might be a
> fairly easy way to improve the situation.
> Jim

-- 
gpg --locate-keys dominick.grift@defensec.nl
Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
https://sks-keyservers.net/pks/lookup?op=get&search=0xDA7E521F10F64098
Dominick Grift


* Re: Resource usage of CIL compared to HLL
  2020-08-17 18:08   ` Dominick Grift
@ 2020-08-17 18:51     ` Dominick Grift
  0 siblings, 0 replies; 4+ messages in thread
From: Dominick Grift @ 2020-08-17 18:51 UTC
  To: James Carter; +Cc: bauen1, selinux

Dominick Grift <dominick.grift@defensec.nl> writes:

> That was at one point a pain point with refpolicy, I believe, where
> audit2allow -R wouldn't suggest an interface to use because the
> interface did not exist. By predefining all macros you ensure that
> audit2allow -R finds something.

Or it returned an interface that was broader than needed because a more
fitting interface did not exist, thereby needlessly opening up the
policy.

By templating and using the templates you ensure that the macros are
there and that they are what you expect them to be.

>
>>
>> Thanks for the report. I will take a look to see if there might be a
>> fairly easy way to improve the situation.
>> Jim

-- 
gpg --locate-keys dominick.grift@defensec.nl
Key fingerprint = FCD2 3660 5D6B 9D27 7FC6  E0FF DA7E 521F 10F6 4098
https://sks-keyservers.net/pks/lookup?op=get&search=0xDA7E521F10F64098
Dominick Grift
