linux-security-module.vger.kernel.org archive mirror
* Preferred subj= with multiple LSMs
@ 2019-07-12 16:33 Casey Schaufler
       [not found] ` <c46932ec-e38e-ba15-7ceb-70e0fe0ef5dc@schaufler-ca.com>
                   ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Casey Schaufler @ 2019-07-12 16:33 UTC (permalink / raw)
  To: linux-audit, Linux Security Module list, Paul Moore, rgb, Steve Grubb
  Cc: casey

Which of these options would be preferred for audit records
when there are multiple active security modules? I'm not asking
if we should do it, I'm asking which of these options I should
implement when I do do it. I've prototyped #1 and #2. #4 is a
minor variant of #1 that is either better for compatibility or
worse, depending on how you want to look at it. I understand
that each of these offers challenges. If I've missed something
obvious, I'd be delighted to consider #5.

Thank you.

Option 1:

	subj=selinux='x:y:z:s:c',apparmor='a'

Option 2:

	subj=x:y:z:s:c subj=a

Option 3:

	lsms=selinux,apparmor subj=x:y:z:s:c subj=a

Option 4:

	subjs=selinux='x:y:z:s:c',apparmor='a'

Option 5:

	Something else.
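
For illustration only (this is not part of the original message), here is
a minimal sketch of how a simple whitespace-and-"=" parser would see
options 1 through 4. The helpers parse_fields and to_dict are hypothetical
names; real audit user-space tools parse records differently, so treat
this as a way to reason about the formats rather than a statement of how
any existing tool behaves.

```python
def parse_fields(record):
    """Split a record on whitespace and return (name, value) pairs in order."""
    pairs = []
    for token in record.split():
        name, _, value = token.partition("=")
        pairs.append((name, value))
    return pairs


def to_dict(record):
    """Collapse the pairs into a dict; a repeated name keeps only the last value."""
    return dict(parse_fields(record))


# The four concrete formats from the message above.
options = {
    1: "subj=selinux='x:y:z:s:c',apparmor='a'",
    2: "subj=x:y:z:s:c subj=a",
    3: "lsms=selinux,apparmor subj=x:y:z:s:c subj=a",
    4: "subjs=selinux='x:y:z:s:c',apparmor='a'",
}

for number, record in options.items():
    print(number, to_dict(record))
```

Under these assumptions, option 1 still yields a single subj field, but
its value is no longer a bare label; options 2 and 3 lose the first label
in any consumer that stores fields by name; and option 4 yields no subj
field at all.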




* Re: Preferred subj= with multiple LSMs
  2019-07-12 16:33 Preferred subj= with multiple LSMs Casey Schaufler
       [not found] ` <c46932ec-e38e-ba15-7ceb-70e0fe0ef5dc@schaufler-ca.com>
@ 2019-07-13 15:08 ` Steve Grubb
  2019-07-15 19:04   ` Richard Guy Briggs
       [not found] ` <1979804.kRvuSoDnao@x2>
  2 siblings, 1 reply; 39+ messages in thread
From: Steve Grubb @ 2019-07-13 15:08 UTC (permalink / raw)
  To: Casey Schaufler; +Cc: linux-audit, Linux Security Module list, Paul Moore, rgb

Hello,

On Friday, July 12, 2019 12:33:55 PM EDT Casey Schaufler wrote:
> Which of these options would be preferred for audit records
> when there are multiple active security modules?

I'd like to start out with what is the underlying problem that results in 
this? For example, we have pam. It has multiple modules each having a vote. 
If a module votes no, then we need to know who voted no and maybe why. We 
normally do not need to know who voted yes.

So, in a stacked situation, shouldn't each module make its own event, if 
required, just like pam? And then log the attributes as it knows them? Also, 
what model is being used? Does first module voting no end access voting? Or 
does each module get a vote even if one has already said no?

Also, we try to keep LSM subsystems separated by record type numbers. So, 
AppArmor and SELinux events are entirely different record numbers and
formats. Combining everything into one record is going to be problematic for 
reporting.

-Steve

> I'm not asking
> if we should do it, I'm asking which of these options I should
> implement when I do do it. I've prototyped #1 and #2. #4 is a
> minor variant of #1 that is either better for compatibility or
> worse, depending on how you want to look at it. I understand
> that each of these offers challenges. If I've missed something
> obvious, I'd be delighted to consider #5.
> 
> Thank you.
> 
> Option 1:
> 
> 	subj=selinux='x:y:z:s:c',apparmor='a'
> 
> Option 2:
> 
> 	subj=x:y:z:s:c subj=a
> 
> Option 3:
> 
> 	lsms=selinux,apparmor subj=x:y:z:s:c subj=a
> 
> Option 4:
> 
> 	subjs=selinux='x:y:z:s:c',apparmor='a'
> 
> Option 5:
> 
> 	Something else.






* Re: Preferred subj= with multiple LSMs
  2019-07-13 15:08 ` Steve Grubb
@ 2019-07-15 19:04   ` Richard Guy Briggs
  0 siblings, 0 replies; 39+ messages in thread
From: Richard Guy Briggs @ 2019-07-15 19:04 UTC (permalink / raw)
  To: Steve Grubb
  Cc: Casey Schaufler, linux-audit, Linux Security Module list, Paul Moore

On 2019-07-13 11:08, Steve Grubb wrote:
> Hello,
> 
> On Friday, July 12, 2019 12:33:55 PM EDT Casey Schaufler wrote:
> > Which of these options would be preferred for audit records
> > when there are multiple active security modules?
> 
> I'd like to start out with what is the underlying problem that results in 
> this? For example, we have pam. It has multiple modules each having a vote. 
> If a module votes no, then we need to know who voted no and maybe why. We 
> normally do not need to know who voted yes.
> 
> So, in a stacked situation, shouldn't each module make its own event, if 
> required, just like pam? And then log the attributes as it knows them? Also, 
> what model is being used? Does first module voting no end access voting? Or 
> does each module get a vote even if one has already said no?
> 
> Also, we try to keep LSM subsystems separated by record type numbers. So, 
> AppArmor and SELinux events are entirely different record numbers and
> formats. Combining everything into one record is going to be problematic for 
> reporting.

I was wrestling with the options below and was uncomfortable with all of
them because none of them was guaranteed not to break existing parsers.

Steve's answer is the obvious one, ideally allocating a separate range
to each LSM with each message type having its own well defined format.

> -Steve
> 
> > I'm not asking
> > if we should do it, I'm asking which of these options I should
> > implement when I do do it. I've prototyped #1 and #2. #4 is a
> > minor variant of #1 that is either better for compatibility or
> > worse, depending on how you want to look at it. I understand
> > that each of these offers challenges. If I've missed something
> > obvious, I'd be delighted to consider #5.
> > 
> > Thank you.
> > 
> > Option 1:
> > 
> > 	subj=selinux='x:y:z:s:c',apparmor='a'
> > 
> > Option 2:
> > 
> > 	subj=x:y:z:s:c subj=a
> > 
> > Option 3:
> > 
> > 	lsms=selinux,apparmor subj=x:y:z:s:c subj=a
> > 
> > Option 4:
> > 
> > 	subjs=selinux='x:y:z:s:c',apparmor='a'
> > 
> > Option 5:
> > 
> > 	Something else.

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635


* Re: Preferred subj= with multiple LSMs
       [not found]     ` <3577098.oGDFHdoSSQ@x2>
@ 2019-07-16 17:16       ` Casey Schaufler
  0 siblings, 0 replies; 39+ messages in thread
From: Casey Schaufler @ 2019-07-16 17:16 UTC (permalink / raw)
  To: Steve Grubb
  Cc: Paul Moore, Richard Guy Briggs, linux-audit, casey,
	Linux Security Module list

On 7/16/2019 9:14 AM, Steve Grubb wrote:
> On Tuesday, July 16, 2019 12:00:05 PM EDT Casey Schaufler wrote:
>>
>> Unless there's an objection I will use this format with
>> a slight modification. Smack allows commas in labels, so
>> using a bare comma can lead to ambiguity.
>>
>> lsms=smack,apparmor subj="TS/Alpha,Beta","a"

Oops! '/' isn't allowed in a Smack label. How embarrassing is that?

>>
>> It's more code change than some of the other options,
>> but if it has the best chance of working with user space
>> I'm game.
> Quoting has a specific meaning in audit fields. So, we really shouldn't do 
> that. We can simply pick another field delimiter. I really don't care which it 
> is as long as it's illegal for use in a label. For example, we use
>
> #define AUDIT_KEY_SEPARATOR 0x01
>
> to separate key fields. We can pick almost anything. (exclamation mark,
> semicolon, hash, plus symbol, tilde, 0x02, whatever) But it will need to be
> documented and put into the API so that everyone is aware of the convention.

Unless there's an objection, I'll document and use '/':

lsms=selinux,apparmor subj=a:b:c:d/a

If there is an objection and no alternative is presented, I'll use 0x02,
because no one (I hope) is going to allow that in their label, and
keys have set a precedent for unprintable characters.
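
As an illustration (not part of the original message), the multiplexed
format could be decoded in user space roughly as follows. This is a
sketch under stated assumptions: split_subjects is a hypothetical helper,
'/' and 0x02 are the candidate separators named above, and the Smack
label "Alpha,Beta" is a made-up example of a label containing a comma.
Whether a given separator is illegal in every LSM's labels is not
established here.

```python
LABEL_SEPARATOR = "/"  # assumption: the separator proposed in this message


def split_subjects(lsms, subj, sep=LABEL_SEPARATOR):
    """Pair each LSM name in `lsms` with its label in a multiplexed `subj`."""
    names = lsms.split(",")
    labels = subj.split(sep)
    if len(names) != len(labels):
        # The separator also occurs inside a label, so the split is ambiguous.
        raise ValueError(f"{len(names)} LSMs but {len(labels)} labels")
    return dict(zip(names, labels))


# '/' as the separator, as in the example record above.
print(split_subjects("selinux,apparmor", "a:b:c:d/a"))

# 0x02 as the separator: a comma inside a label is harmless.
print(split_subjects("smack,apparmor", "Alpha,Beta\x02a", sep="\x02"))

# A bare comma as the separator: the comma in the Smack label breaks the split.
try:
    split_subjects("smack,apparmor", "Alpha,Beta,a", sep=",")
except ValueError as err:
    print("ambiguous:", err)
```

The count check only catches the ambiguity when the number of pieces
differs from the number of LSMs; it cannot tell which comma was the
separator, which is why the separator has to be a character no label can
contain.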

>
> -Steve
>
>



* Re: Preferred subj= with multiple LSMs
       [not found]   ` <CAHC9VhSELVZN8feH56zsANqoHu16mPMD04Ww60W=r6tWs+8WnQ@mail.gmail.com>
@ 2019-07-16 17:29     ` Casey Schaufler
  2019-07-16 17:43       ` Paul Moore
  0 siblings, 1 reply; 39+ messages in thread
From: Casey Schaufler @ 2019-07-16 17:29 UTC (permalink / raw)
  To: Paul Moore, Steve Grubb
  Cc: Richard Guy Briggs, linux-audit, Linux Security Module list, casey

On 7/16/2019 10:12 AM, Paul Moore wrote:
> On Mon, Jul 15, 2019 at 6:56 PM Steve Grubb <sgrubb@redhat.com> wrote:
>> On Monday, July 15, 2019 5:28:56 PM EDT Paul Moore wrote:
>>> On Mon, Jul 15, 2019 at 3:37 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>>> On 7/15/2019 12:04 PM, Richard Guy Briggs wrote:
>>>>> On 2019-07-13 11:08, Steve Grubb wrote:
> ...
>
>>>>> Steve's answer is the obvious one, ideally allocating a separate range
>>>>> to each LSM with each message type having its own well defined format.
>>>> It doesn't address the issue of success records, or records
>>>> generated outside the security modules.
>>> Yes, exactly.  The individual LSMs will presumably continue to
>>> generate their own audit records as they do today and I would imagine
>>> that the subject and object fields could remain as they do today for
>>> the LSM specific records.
>>>
>>> The trick is the other records which are not LSM specific but still
>>> want to include subject and/or object information.  Unfortunately we
>>> are stuck with some tough limitations given the current audit record
>>> format and Steve's audit userspace tools;
>> Not really. We just need to approach the problem thinking about how to make
>> it work based on how things currently work.
> I suppose it is all somewhat "subjective" - bad joke fully intended :)
> - with respect to what one considers good/bad/limiting.  My personal
> view is that an ideal solution would allow for multiple independent
> subj/obj labels without having to multiplex on a single subj/obj
> field.  My gut feeling is that this would confuse your tools, yes?
>
>> For example Casey had a list of possible formats. Like this one:
>>
>> Option 3:
>>         lsms=selinux,apparmor subj=x:y:z:s:c subj=a
>>
>> I'd suggest something almost like that. The first field could be a map to
>> decipher the labels. Then we could have a comma separated list of labels.
>>
>> lsms=selinux,apparmor subj=x:y:z:s:c,a
> Some quick comments:
>
> * My usual reminder that new fields for existing audit records must be
> added to the end of the record.
>
> * If we are going to multiplex the labels on a single field (more on
> that below) I might suggest using "subj_lsms" instead of "lsms" so we
> leave ourself some wiggle room in the future.
>
> * Multiplexing on a single "subj" field is going to be difficult
> because picking the label delimiter is going to be a pain.  For
> example, in the example above a comma is used, which at the very least
> is a valid part of a SELinux label and I suspect for Smack as well
> (I'm not sure about the other LSMs).  I suspect the only way to parse
> out the component labels would be to have knowledge of the LSMs in
> use, as well as the policies loaded at the time the audit record was
> generated.
>
> This may be a faulty assumption, but assuming your tools will fall
> over if they see multiple "subj" fields, could we do something like
> the following (something between option #2 and #3):
>
>   subj1_lsm=smack subj1=<smack_label> subj2_lsm=selinux
> subj2=<selinux_label> ...

If it's not a subj= field why use the indirection?

	subj_smack=<smack_label> subj_selinux=<selinux_label>

would be easier. 
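
As an illustration (not part of the original message), per-LSM field
names could be collected in user space along these lines. This is a
minimal sketch: subject_labels is a hypothetical helper, the "subj_"
prefix is taken from the example above, and the label values are made up.

```python
SUBJ_PREFIX = "subj_"  # assumption: per-LSM fields are named subj_<lsm>


def subject_labels(record):
    """Return {lsm_name: label} for every subj_<lsm>=<label> field in a record."""
    labels = {}
    for token in record.split():
        name, _, value = token.partition("=")
        if name.startswith(SUBJ_PREFIX):
            labels[name[len(SUBJ_PREFIX):]] = value
    return labels


record = "subj_smack=Alpha,Beta subj_selinux=x:y:z:s:c"
print(subject_labels(record))
```

Because each label has its own field, no separator inside the value is
needed and a comma in a label causes no ambiguity. The cost is that a
consumer looking only for a field named exactly subj finds nothing in
such a record.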



* Re: Preferred subj= with multiple LSMs
  2019-07-16 17:29     ` Casey Schaufler
@ 2019-07-16 17:43       ` Paul Moore
  2019-07-16 17:58         ` Casey Schaufler
  2019-07-16 18:06         ` Steve Grubb
  0 siblings, 2 replies; 39+ messages in thread
From: Paul Moore @ 2019-07-16 17:43 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Steve Grubb, Richard Guy Briggs, linux-audit, Linux Security Module list

On Tue, Jul 16, 2019 at 1:30 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 7/16/2019 10:12 AM, Paul Moore wrote:
> > On Mon, Jul 15, 2019 at 6:56 PM Steve Grubb <sgrubb@redhat.com> wrote:
> >> On Monday, July 15, 2019 5:28:56 PM EDT Paul Moore wrote:
> >>> On Mon, Jul 15, 2019 at 3:37 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >>>> On 7/15/2019 12:04 PM, Richard Guy Briggs wrote:
> >>>>> On 2019-07-13 11:08, Steve Grubb wrote:
> > ...
> >
> >>>>> Steve's answer is the obvious one, ideally allocating a separate range
> >>>>> to each LSM with each message type having its own well defined format.
> >>>> It doesn't address the issue of success records, or records
> >>>> generated outside the security modules.
> >>> Yes, exactly.  The individual LSMs will presumably continue to
> >>> generate their own audit records as they do today and I would imagine
> >>> that the subject and object fields could remain as they do today for
> >>> the LSM specific records.
> >>>
> >>> The trick is the other records which are not LSM specific but still
> >>> want to include subject and/or object information.  Unfortunately we
> >>> are stuck with some tough limitations given the current audit record
> >>> format and Steve's audit userspace tools;
> >> Not really. We just need to approach the problem thinking about how to make
> >> it work based on how things currently work.
> > I suppose it is all somewhat "subjective" - bad joke fully intended :)
> > - with respect to what one considers good/bad/limiting.  My personal
> > view is that an ideal solution would allow for multiple independent
> > subj/obj labels without having to multiplex on a single subj/obj
> > field.  My gut feeling is that this would confuse your tools, yes?
> >
> >> For example Casey had a list of possible formats. Like this one:
> >>
> >> Option 3:
> >>         lsms=selinux,apparmor subj=x:y:z:s:c subj=a
> >>
> >> I'd suggest something almost like that. The first field could be a map to
> >> decipher the labels. Then we could have a comma separated list of labels.
> >>
> >> lsms=selinux,apparmor subj=x:y:z:s:c,a
> > Some quick comments:
> >
> > * My usual reminder that new fields for existing audit records must be
> > added to the end of the record.
> >
> > * If we are going to multiplex the labels on a single field (more on
> > that below) I might suggest using "subj_lsms" instead of "lsms" so we
> > leave ourself some wiggle room in the future.
> >
> > * Multiplexing on a single "subj" field is going to be difficult
> > because picking the label delimiter is going to be a pain.  For
> > example, in the example above a comma is used, which at the very least
> > is a valid part of a SELinux label and I suspect for Smack as well
> > (I'm not sure about the other LSMs).  I suspect the only way to parse
> > out the component labels would be to have knowledge of the LSMs in
> > use, as well as the policies loaded at the time the audit record was
> > generated.
> >
> > This may be a faulty assumption, but assuming your tools will fall
> > over if they see multiple "subj" fields, could we do something like
> > the following (something between option #2 and #3):
> >
> >   subj1_lsm=smack subj1=<smack_label> subj2_lsm=selinux
> > subj2=<selinux_label> ...
>
> If it's not a subj= field why use the indirection?
>
>         subj_smack=<smack_label> subj_selinux=<selinux_label>
>
> would be easier.

Good point, that looks reasonable to me.

-- 
paul moore
www.paul-moore.com


* Re: Preferred subj= with multiple LSMs
  2019-07-16 17:43       ` Paul Moore
@ 2019-07-16 17:58         ` Casey Schaufler
  2019-07-16 18:06         ` Steve Grubb
  1 sibling, 0 replies; 39+ messages in thread
From: Casey Schaufler @ 2019-07-16 17:58 UTC (permalink / raw)
  To: Paul Moore
  Cc: Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list, casey

On 7/16/2019 10:43 AM, Paul Moore wrote:
> On Tue, Jul 16, 2019 at 1:30 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>> On 7/16/2019 10:12 AM, Paul Moore wrote:
>>> On Mon, Jul 15, 2019 at 6:56 PM Steve Grubb <sgrubb@redhat.com> wrote:
>>>> On Monday, July 15, 2019 5:28:56 PM EDT Paul Moore wrote:
>>>>> On Mon, Jul 15, 2019 at 3:37 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>>>>> On 7/15/2019 12:04 PM, Richard Guy Briggs wrote:
>>>>>>> On 2019-07-13 11:08, Steve Grubb wrote:
>>> ...
>>>
>>>>>>> Steve's answer is the obvious one, ideally allocating a separate range
>>>>>>> to each LSM with each message type having its own well defined format.
>>>>>> It doesn't address the issue of success records, or records
>>>>>> generated outside the security modules.
>>>>> Yes, exactly.  The individual LSMs will presumably continue to
>>>>> generate their own audit records as they do today and I would imagine
>>>>> that the subject and object fields could remain as they do today for
>>>>> the LSM specific records.
>>>>>
>>>>> The trick is the other records which are not LSM specific but still
>>>>> want to include subject and/or object information.  Unfortunately we
>>>>> are stuck with some tough limitations given the current audit record
>>>>> format and Steve's audit userspace tools;
>>>> Not really. We just need to approach the problem thinking about how to make
>>>> it work based on how things currently work.
>>> I suppose it is all somewhat "subjective" - bad joke fully intended :)
>>> - with respect to what one considers good/bad/limiting.  My personal
>>> view is that an ideal solution would allow for multiple independent
>>> subj/obj labels without having to multiplex on a single subj/obj
>>> field.  My gut feeling is that this would confuse your tools, yes?
>>>
>>>> For example Casey had a list of possible formats. Like this one:
>>>>
>>>> Option 3:
>>>>         lsms=selinux,apparmor subj=x:y:z:s:c subj=a
>>>>
>>>> I'd suggest something almost like that. The first field could be a map to
>>>> decipher the labels. Then we could have a comma separated list of labels.
>>>>
>>>> lsms=selinux,apparmor subj=x:y:z:s:c,a
>>> Some quick comments:
>>>
>>> * My usual reminder that new fields for existing audit records must be
>>> added to the end of the record.
>>>
>>> * If we are going to multiplex the labels on a single field (more on
>>> that below) I might suggest using "subj_lsms" instead of "lsms" so we
>>> leave ourself some wiggle room in the future.
>>>
>>> * Multiplexing on a single "subj" field is going to be difficult
>>> because picking the label delimiter is going to be a pain.  For
>>> example, in the example above a comma is used, which at the very least
>>> is a valid part of a SELinux label and I suspect for Smack as well
>>> (I'm not sure about the other LSMs).  I suspect the only way to parse
>>> out the component labels would be to have knowledge of the LSMs in
>>> use, as well as the policies loaded at the time the audit record was
>>> generated.
>>>
>>> This may be a faulty assumption, but assuming your tools will fall
>>> over if they see multiple "subj" fields, could we do something like
>>> the following (something between option #2 and #3):
>>>
>>>   subj1_lsm=smack subj1=<smack_label> subj2_lsm=selinux
>>> subj2=<selinux_label> ...
>> If it's not a subj= field why use the indirection?
>>
>>         subj_smack=<smack_label> subj_selinux=<selinux_label>
>>
>> would be easier.
> Good point, that looks reasonable to me.

Which raises the question of what to do with the subj= :

	- omit it
	- subj=?
	- subj=some-special-message
	- subj=label-of-first-lsm
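
As an illustration (not part of the original message), the four choices
can be compared by what a consumer that only knows the plain subj field
would read. This is a sketch under stated assumptions: legacy_subj is a
hypothetical helper, the labels are made up, and the plain subj field is
placed before the new per-LSM fields only for the sake of the example.

```python
def legacy_subj(record):
    """Return the value of the first plain `subj` field, or None if absent."""
    for token in record.split():
        name, _, value = token.partition("=")
        if name == "subj":
            return value
    return None


# Hypothetical per-LSM fields, with made-up labels.
per_lsm = "subj_smack=Alpha subj_selinux=x:y:z:s:c"

choices = {
    "omit it": per_lsm,
    "subj=?": "subj=? " + per_lsm,
    "special message": "subj=some-special-message " + per_lsm,
    "label of first LSM": "subj=Alpha " + per_lsm,
}

for name, record in choices.items():
    print(f"{name}: {legacy_subj(record)!r}")
```

In this sketch, only the last choice hands such a consumer a real label,
and it is the label of just one of the active modules.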




* Re: Preferred subj= with multiple LSMs
  2019-07-16 17:43       ` Paul Moore
  2019-07-16 17:58         ` Casey Schaufler
@ 2019-07-16 18:06         ` Steve Grubb
  2019-07-16 18:41           ` Casey Schaufler
  1 sibling, 1 reply; 39+ messages in thread
From: Steve Grubb @ 2019-07-16 18:06 UTC (permalink / raw)
  To: Paul Moore
  Cc: Casey Schaufler, Richard Guy Briggs, linux-audit,
	Linux Security Module list

On Tuesday, July 16, 2019 1:43:18 PM EDT Paul Moore wrote:
> On Tue, Jul 16, 2019 at 1:30 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> > On 7/16/2019 10:12 AM, Paul Moore wrote:
> > > On Mon, Jul 15, 2019 at 6:56 PM Steve Grubb <sgrubb@redhat.com> wrote:
> > >> On Monday, July 15, 2019 5:28:56 PM EDT Paul Moore wrote:
> > >>> On Mon, Jul 15, 2019 at 3:37 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> > >>>> On 7/15/2019 12:04 PM, Richard Guy Briggs wrote:
> > >>>>> On 2019-07-13 11:08, Steve Grubb wrote:
> > > ...
> > > 
> > >>>>> Steve's answer is the obvious one, ideally allocating a separate range
> > >>>>> to each LSM with each message type having its own well defined format.
> > >>>>
> > >>>> It doesn't address the issue of success records, or records
> > >>>> generated outside the security modules.
> > >>>
> > >>> Yes, exactly.  The individual LSMs will presumably continue to
> > >>> generate their own audit records as they do today and I would imagine
> > >>> that the subject and object fields could remain as they do today for
> > >>> the LSM specific records.
> > >>> 
> > >>> The trick is the other records which are not LSM specific but still
> > >>> want to include subject and/or object information.  Unfortunately we
> > >>> are stuck with some tough limitations given the current audit record
> > >>> format and Steve's audit userspace tools;
> > >> 
> > >> Not really. We just need to approach the problem thinking about how to
> > >> make it work based on how things currently work.
> > > 
> > > I suppose it is all somewhat "subjective" - bad joke fully intended :)
> > > - with respect to what one considers good/bad/limiting.  My personal
> > > view is that an ideal solution would allow for multiple independent
> > > subj/obj labels without having to multiplex on a single subj/obj
> > > field.  My gut feeling is that this would confuse your tools, yes?
> > > 
> > >> For example Casey had a list of possible formats. Like this one:
> > >> 
> > >> Option 3:
> > >>         lsms=selinux,apparmor subj=x:y:z:s:c subj=a
> > >> 
> > >> I'd suggest something almost like that. The first field could be a map to
> > >> decipher the labels. Then we could have a comma separated list of labels.
> > >> 
> > >> lsms=selinux,apparmor subj=x:y:z:s:c,a
> > > 
> > > Some quick comments:
> > > 
> > > * My usual reminder that new fields for existing audit records must be
> > > added to the end of the record.
> > > 
> > > * If we are going to multiplex the labels on a single field (more on
> > > that below) I might suggest using "subj_lsms" instead of "lsms" so we
> > > leave ourself some wiggle room in the future.
> > > 
> > > * Multiplexing on a single "subj" field is going to be difficult
> > > because picking the label delimiter is going to be a pain.  For
> > > example, in the example above a comma is used, which at the very least
> > > is a valid part of a SELinux label and I suspect for Smack as well
> > > (I'm not sure about the other LSMs).  I suspect the only way to parse
> > > out the component labels would be to have knowledge of the LSMs in
> > > use, as well as the policies loaded at the time the audit record was
> > > generated.
> > > 
> > > This may be a faulty assumption, but assuming your tools will fall
> > > over if they see multiple "subj" fields, could we do something like
> > > the following (something between option #2 and #3):
> > >
> > >   subj1_lsm=smack subj1=<smack_label> subj2_lsm=selinux
> > > subj2=<selinux_label> ...
> > 
> > If it's not a subj= field why use the indirection?
> > 
> >         subj_smack=<smack_label> subj_selinux=<selinux_label>
> > 
> > would be easier.
> 
> Good point, that looks reasonable to me.

But doing something like this will totally break all parsers. To be honest, I 
don't know if I'll ever see more than one labeled security system running at 
the same time. And this would be a big penalty to pay for the flexibility that 
someone, somewhere just might possibly do this.

-Steve





* Re: Preferred subj= with multiple LSMs
  2019-07-16 18:06         ` Steve Grubb
@ 2019-07-16 18:41           ` Casey Schaufler
  2019-07-16 21:25             ` Paul Moore
  0 siblings, 1 reply; 39+ messages in thread
From: Casey Schaufler @ 2019-07-16 18:41 UTC (permalink / raw)
  To: Steve Grubb, Paul Moore
  Cc: Richard Guy Briggs, linux-audit, Linux Security Module list, casey

On 7/16/2019 11:06 AM, Steve Grubb wrote:
> On Tuesday, July 16, 2019 1:43:18 PM EDT Paul Moore wrote:
>> On Tue, Jul 16, 2019 at 1:30 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>> On 7/16/2019 10:12 AM, Paul Moore wrote:
>>>> On Mon, Jul 15, 2019 at 6:56 PM Steve Grubb <sgrubb@redhat.com> wrote:
>>>>> On Monday, July 15, 2019 5:28:56 PM EDT Paul Moore wrote:
>>>>>> On Mon, Jul 15, 2019 at 3:37 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>>>>>> On 7/15/2019 12:04 PM, Richard Guy Briggs wrote:
>>>>>>>> On 2019-07-13 11:08, Steve Grubb wrote:
>>>> ...
>>>>
>>>>>>>> Steve's answer is the obvious one, ideally allocating a separate range
>>>>>>>> to each LSM with each message type having its own well defined format.
>>>>>>> It doesn't address the issue of success records, or records
>>>>>>> generated outside the security modules.
>>>>>> Yes, exactly.  The individual LSMs will presumably continue to
>>>>>> generate their own audit records as they do today and I would imagine
>>>>>> that the subject and object fields could remain as they do today for
>>>>>> the LSM specific records.
>>>>>>
>>>>>> The trick is the other records which are not LSM specific but still
>>>>>> want to include subject and/or object information.  Unfortunately we
>>>>>> are stuck with some tough limitations given the current audit record
>>>>>> format and Steve's audit userspace tools;
>>>>> Not really. We just need to approach the problem thinking about how to
>>>>> make it work based on how things currently work.
>>>> I suppose it is all somewhat "subjective" - bad joke fully intended :)
>>>> - with respect to what one considers good/bad/limiting.  My personal
>>>> view is that an ideal solution would allow for multiple independent
>>>> subj/obj labels without having to multiplex on a single subj/obj
>>>> field.  My gut feeling is that this would confuse your tools, yes?
>>>>
>>>>> For example Casey had a list of possible formats. Like this one:
>>>>>
>>>>> Option 3:
>>>>>         lsms=selinux,apparmor subj=x:y:z:s:c subj=a
>>>>>
>>>>> I'd suggest something almost like that. The first field could be a map to
>>>>> decipher the labels. Then we could have a comma separated list of labels.
>>>>>
>>>>> lsms=selinux,apparmor subj=x:y:z:s:c,a
>>>> Some quick comments:
>>>>
>>>> * My usual reminder that new fields for existing audit records must be
>>>> added to the end of the record.
>>>>
>>>> * If we are going to multiplex the labels on a single field (more on
>>>> that below) I might suggest using "subj_lsms" instead of "lsms" so we
>>>> leave ourself some wiggle room in the future.
>>>>
>>>> * Multiplexing on a single "subj" field is going to be difficult
>>>> because picking the label delimiter is going to be a pain.  For
>>>> example, in the example above a comma is used, which at the very least
>>>> is a valid part of a SELinux label and I suspect for Smack as well
>>>> (I'm not sure about the other LSMs).  I suspect the only way to parse
>>>> out the component labels would be to have knowledge of the LSMs in
>>>> use, as well as the policies loaded at the time the audit record was
>>>> generated.
>>>>
>>>> This may be a faulty assumption, but assuming your tools will fall
>>>> over if they see multiple "subj" fields, could we do something like
>>>> the following (something between option #2 and #3):
>>>>
>>>>   subj1_lsm=smack subj1=<smack_label> subj2_lsm=selinux
>>>> subj2=<selinux_label> ...
>>> If it's not a subj= field why use the indirection?
>>>
>>>         subj_smack=<smack_label> subj_selinux=<selinux_label>
>>>
>>> would be easier.
>> Good point, that looks reasonable to me.
> But doing something like this will totally break all parsers. To be honest, I 
> don't know if I'll ever see more than one labeled security system running at 
> the same time. And this would be a big penalty to pay for the flexibility that 
> someone, somewhere just might possibly do this.

While I have never seen multiple-LSM plans from Red Hat/IBM, I
have seen them from Ubuntu. This isn't hypothetical. I know that
it's a hard problem, which is why we need to get it as right as
possible.




* Re: Preferred subj= with multiple LSMs
  2019-07-16 18:41           ` Casey Schaufler
@ 2019-07-16 21:25             ` Paul Moore
  2019-07-16 21:46               ` Steve Grubb
  2019-07-18 15:01               ` William Roberts
  0 siblings, 2 replies; 39+ messages in thread
From: Paul Moore @ 2019-07-16 21:25 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Steve Grubb, Richard Guy Briggs, linux-audit, Linux Security Module list

On Tue, Jul 16, 2019 at 2:41 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 7/16/2019 11:06 AM, Steve Grubb wrote:
> > On Tuesday, July 16, 2019 1:43:18 PM EDT Paul Moore wrote:
> >> On Tue, Jul 16, 2019 at 1:30 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >>> On 7/16/2019 10:12 AM, Paul Moore wrote:
> >>>> On Mon, Jul 15, 2019 at 6:56 PM Steve Grubb <sgrubb@redhat.com> wrote:
> >>>>> On Monday, July 15, 2019 5:28:56 PM EDT Paul Moore wrote:
> >>>>>> On Mon, Jul 15, 2019 at 3:37 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >>>>>>> On 7/15/2019 12:04 PM, Richard Guy Briggs wrote:
> >>>>>>>> On 2019-07-13 11:08, Steve Grubb wrote:
> >>>> ...
> >>>>
> >>>>>>>> Steve's answer is the obvious one, ideally allocating a separate range
> >>>>>>>> to each LSM with each message type having its own well defined format.
> >>>>>>> It doesn't address the issue of success records, or records
> >>>>>>> generated outside the security modules.
> >>>>>> Yes, exactly.  The individual LSMs will presumably continue to
> >>>>>> generate their own audit records as they do today and I would imagine
> >>>>>> that the subject and object fields could remain as they do today for
> >>>>>> the LSM specific records.
> >>>>>>
> >>>>>> The trick is the other records which are not LSM specific but still
> >>>>>> want to include subject and/or object information.  Unfortunately we
> >>>>>> are stuck with some tough limitations given the current audit record
> >>>>>> format and Steve's audit userspace tools;
> >>>>> Not really. We just need to approach the problem thinking about how to
> >>>>> make it work based on how things currently work.
> >>>> I suppose it is all somewhat "subjective" - bad joke fully intended :)
> >>>> - with respect to what one considers good/bad/limiting.  My personal
> >>>> view is that an ideal solution would allow for multiple independent
> >>>> subj/obj labels without having to multiplex on a single subj/obj
> >>>> field.  My gut feeling is that this would confuse your tools, yes?
> >>>>
> >>>>> For example Casey had a list of possible formats. Like this one:
> >>>>>
> >>>>> Option 3:
> >>>>>         lsms=selinux,apparmor subj=x:y:z:s:c subj=a
> >>>>>
> >>>>> I'd suggest something almost like that. The first field could be a map to
> >>>>> decipher the labels. Then we could have a comma separated list of labels.
> >>>>>
> >>>>> lsms=selinux,apparmor subj=x:y:z:s:c,a
> >>>> Some quick comments:
> >>>>
> >>>> * My usual reminder that new fields for existing audit records must be
> >>>> added to the end of the record.
> >>>>
> >>>> * If we are going to multiplex the labels on a single field (more on
> >>>> that below) I might suggest using "subj_lsms" instead of "lsms" so we
> >>>> leave ourself some wiggle room in the future.
> >>>>
> >>>> * Multiplexing on a single "subj" field is going to be difficult
> >>>> because picking the label delimiter is going to be a pain.  For
> >>>> example, in the example above a comma is used, which at the very least
> >>>> is a valid part of a SELinux label and I suspect for Smack as well
> >>>> (I'm not sure about the other LSMs).  I suspect the only way to parse
> >>>> out the component labels would be to have knowledge of the LSMs in
> >>>> use, as well as the policies loaded at the time the audit record was
> >>>> generated.
> >>>>
> >>>> This may be a faulty assumption, but assuming your tools will fall
> >>>> over if they see multiple "subj" fields, could we do something like
> >>>> the following (something between option #2 and #3):
> >>>>
> >>>>   subj1_lsm=smack subj1=<smack_label> subj2_lsm=selinux subj2=<selinux_label> ...
> >>> If it's not a subj= field why use the indirection?
> >>>
> >>>         subj_smack=<smack_label> subj_selinux=<selinux_label>
> >>>
> >>> would be easier.
> >>
> >> Good point, that looks reasonable to me.
> >
> > But doing something like this will totally break all parsers. To be honest, I
> > don't know if I'll ever see more than one labeled security system running at
> > the same time. And this would be a big penalty to pay for the flexibility that
> > someone, somewhere just might possibly do this.
>
> While I have never seen multiple-LSM plans from RedHat/IBM I
> have seen them from Ubuntu. This isn't hypothetical. I know that
> it's a hard problem, which is why we need to get it as right as
> possible.

Agreed.  While I'm not going to bet on a specific Linux release, I do
believe that at some point in the future the LSM stacking work is
going to land in Linus' tree.  Perhaps you'll never see it Steve, but
we need to prepare the code to handle it when it happens.

For my own sanity, here is a quick summary of the constraints as I
currently see them, please feel free to add/disagree:

* We can't have multiple "subj" fields in a single audit record.
* The different LSMs all have different label formats and allowed
characters.  Further, a given label format may not be unique for a
given LSM; for example, Smack could be configured with a subset of
SELinux labels.
* Steve's audit tools appear to require "subj" and "obj" fields for
LSM information or else they break into tiny little pieces.

What if we preserved the existing subj/obj fields in the case where
there is only one "major" LSM (SELinux, Smack, AppArmor, etc.):

  subj=<lsm_label>

... and in the case of multiple major LSMs we set the subj value to
"?" and introduce new subj_X fields (as necessary) as discussed above:

  subj=? subj_smack=<smack_label> subj_selinux=<selinux_label> ...

... I believe that Steve's old/existing userspace tools would simply
report "?"/unknown LSM credentials where new multi-LSM tools could
report the multiple different labels.  While this may not be perfect,
it avoids having to multiplex the different labels into a single field
(which is a big win IMHO) with the only issue being that multi-LSM
solutions will need an updated audit toolset to see the new labels
(which seems like a reasonable requirement).
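
As a rough illustration of the two cases above, here is a minimal
userspace sketch in C.  The struct lsm_label type and the format_subj()
helper are hypothetical names invented for this example; they are not
kernel or audit-userspace APIs, and real kernel code would presumably
build the record with audit_log_format() rather than snprintf().

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical LSM name/label pair; not an existing kernel structure. */
struct lsm_label {
	const char *lsm;
	const char *label;
};

/*
 * Format the subject part of a record into buf.
 *   one LSM:       subj=<label>
 *   several LSMs:  subj=? subj_<lsm>=<label> ...
 * Returns the number of bytes written, or -1 if buf is too small.
 */
static int format_subj(char *buf, size_t len,
		       const struct lsm_label *labels, size_t n)
{
	size_t off;
	size_t i;
	int rc;

	if (n == 1)
		rc = snprintf(buf, len, "subj=%s", labels[0].label);
	else
		rc = snprintf(buf, len, "subj=?");
	if (rc < 0 || (size_t)rc >= len)
		return -1;
	off = (size_t)rc;

	/* Only the stacked case gets the per-LSM subj_X fields. */
	for (i = 0; n > 1 && i < n; i++) {
		rc = snprintf(buf + off, len - off, " subj_%s=%s",
			      labels[i].lsm, labels[i].label);
		if (rc < 0 || (size_t)rc >= len - off)
			return -1;
		off += (size_t)rc;
	}
	return (int)off;
}
```

The single-LSM output is unchanged from today's records, so only
stacked systems would ever produce the "?" sentinel.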

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-16 21:25             ` Paul Moore
@ 2019-07-16 21:46               ` Steve Grubb
  2019-07-16 22:18                 ` Casey Schaufler
  2019-07-16 23:09                 ` Paul Moore
  2019-07-18 15:01               ` William Roberts
  1 sibling, 2 replies; 39+ messages in thread
From: Steve Grubb @ 2019-07-16 21:46 UTC (permalink / raw)
  To: Paul Moore
  Cc: Casey Schaufler, Richard Guy Briggs, linux-audit,
	Linux Security Module list

On Tuesday, July 16, 2019 5:25:21 PM EDT Paul Moore wrote:
> On Tue, Jul 16, 2019 at 2:41 PM Casey Schaufler <casey@schaufler-ca.com>
> wrote:
> > On 7/16/2019 11:06 AM, Steve Grubb wrote:
> > > On Tuesday, July 16, 2019 1:43:18 PM EDT Paul Moore wrote:
> > >> On Tue, Jul 16, 2019 at 1:30 PM Casey Schaufler
> > >> <casey@schaufler-ca.com> wrote:
> > >>> On 7/16/2019 10:12 AM, Paul Moore wrote:
> > >>>> On Mon, Jul 15, 2019 at 6:56 PM Steve Grubb <sgrubb@redhat.com>
> > >>>> wrote:
> > >>>>> On Monday, July 15, 2019 5:28:56 PM EDT Paul Moore wrote:
> > >>>>>> On Mon, Jul 15, 2019 at 3:37 PM Casey Schaufler
> > >>>>>> <casey@schaufler-ca.com> wrote:
> > >>>>>>> On 7/15/2019 12:04 PM, Richard Guy Briggs wrote:
> > >>>>>>>> On 2019-07-13 11:08, Steve Grubb wrote:
> > >>>> ...
> > >>>> 
> > >>>>>>>> Steve's answer is the obvious one, ideally allocating a separate
> > >>>>>>>> range
> > >>>>>>>> to each LSM with each message type having its own well defined
> > >>>>>>>> format.
> > >>>>>>> 
> > >>>>>>> It doesn't address the issue of success records, or records
> > >>>>>>> generated outside the security modules.
> > >>>>>> 
> > >>>>>> Yes, exactly.  The individual LSM will presumably continue to
> > >>>>>> generate their own audit records as they do today and I would
> > >>>>>> imagine
> > >>>>>> that the subject and object fields could remain as they do today
> > >>>>>> for
> > >>>>>> the LSM specific records.
> > >>>>>> 
> > >>>>>> The trick is the other records which are not LSM specific but
> > >>>>>> still
> > >>>>>> want to include subject and/or object information.  Unfortunately
> > >>>>>> we
> > >>>>>> are stuck with some tough limitations given the current audit
> > >>>>>> record
> > >>>>>> format and Steve's audit userspace tools;
> > >>>>> 
> > >>>>> Not really. We just need to approach the problem thinking about how
> > >>>>> to
> > >>>>> make it work based on how things currently work.
> > >>>> 
> > >>>> I suppose it is all somewhat "subjective" - bad joke fully intended
> > >>>> :)
> > >>>> - with respect to what one considers good/bad/limiting.  My personal
> > >>>> view is that an ideal solution would allow for multiple independent
> > >>>> subj/obj labels without having to multiplex on a single subj/obj
> > >>>> field.  My gut feeling is that this would confuse your tools, yes?
> > >>>> 
> > >>>>> For example Casey had a list of possible formats. Like this one:
> > >>>>> 
> > >>>>> Option 3:
> > >>>>>         lsms=selinux,apparmor subj=x:y:z:s:c subj=a
> > >>>>> 
> > >>>>> I'd suggest something almost like that. The first field could be a
> > >>>>> map
> > >>>>> to
> > >>>>> decipher the labels. Then we could have a comma separated list of
> > >>>>> labels.
> > >>>>> 
> > >>>>> lsms=selinux,apparmor subj=x:y:z:s:c,a
> > >>>> 
> > >>>> Some quick comments:
> > >>>> 
> > >>>> * My usual reminder that new fields for existing audit records must
> > >>>> be
> > >>>> added to the end of the record.
> > >>>> 
> > >>>> * If we are going to multiplex the labels on a single field (more on
> > >>>> that below) I might suggest using "subj_lsms" instead of "lsms" so
> > >>>> we
> > >>>> leave ourself some wiggle room in the future.
> > >>>> 
> > >>>> * Multiplexing on a single "subj" field is going to be difficult
> > >>>> because picking the label delimiter is going to be a pain.  For
> > >>>> example, in the example above a comma is used, which at the very
> > >>>> least
> > >>>> is a valid part of a SELinux label and I suspect for Smack as well
> > >>>> (I'm not sure about the other LSMs).  I suspect the only way to
> > >>>> parse
> > >>>> out the component labels would be to have knowledge of the LSMs in
> > >>>> use, as well as the policies loaded at the time the audit record was
> > >>>> generated.
> > >>>> 
> > >>>> This may be a faulty assumption, but assuming your tools will fall
> > >>>> over if they see multiple "subj" fields, could we do something like
> > >>>> the following (something between option #2 and #3):
> > >>>>
> > >>>>   subj1_lsm=smack subj1=<smack_label> subj2_lsm=selinux subj2=<selinux_label> ...
> > >>> 
> > >>> If it's not a subj= field why use the indirection?
> > >>> 
> > >>>         subj_smack=<smack_label> subj_selinux=<selinux_label>
> > >>> 
> > >>> would be easier.
> > >> 
> > >> Good point, that looks reasonable to me.
> > > 
> > > But doing something like this will totally break all parsers. To be
> > > honest, I don't know if I'll ever see more than one labeled security
> > > system running at the same time. And this would be a big penalty to
> > > pay for the flexibility that someone, somewhere just might possibly do
> > > this.
> > 
> > While I have never seen multiple-LSM plans from RedHat/IBM I
> > have seen them from Ubuntu. This isn't hypothetical. I know that
> > it's a hard problem, which is why we need to get it as right as
> > possible.
> 
> Agreed.  While I'm not going to be on a specific Linux release, I do
> believe that at some point in the future the LSM stacking work is
> going to land in Linus' tree.  Perhaps you'll never see it Steve, but
> we need to prepare the code to handle it when it happens.

And I agree with that. I'm saying that if we push it all in subj= then it is
not a big penalty. It saves major breakage. Every single event is required to
have a subj= field if it's a MAC system. By changing it to lsm_subj= it changes
the layout of every single event. And it makes more to parse. And searching
the labels is worse because it has to iterate over a list of *_subj to match
it. This will hurt performance because it is for every single event.

> For my own sanity, here is a quick summary of the constraints as I
> currently see them, please feel free to add/disagree:
> 
> * We can't have multiple "subj" fields in a single audit record.
> * The different LSMs all have different label formats and allowed
> characters.  Further, a given label format may not be unique for a
> given LSM; for example, Smack could be configured with a subset of
> SELinux labels.
> * Steve's audit tools appear to require a "subj" and "obj" fields for
> LSM information or else they break into tiny little pieces.

It changes all knowledge of where to look for things. And considering
that events could be aggregated from systems of different ages/
distributions, audit userspace will always have to be backwards compatible.
 
> What if we preserved the existing subj/obj fields in the case where
> there is only one "major" LSM (SELinux, Smack, AppArmor, etc.):
> 
>   subj=<lsm_label>
> 
> ... and in the case of multiple major LSMs we set the subj value to
> "?" and introduce new subj_X fields (as necessary) as discussed above:
> 
>   subj=? subj_smack=<smack_label> subj_selinux=<selinux_label> ...
> 
> ... I believe that Steve's old/existing userspace tools would simply
> report "?"/unknown LSM credentials where new multi-LSM tools could
> report the multiple different labels. 

Common Criteria as well as other standards require subject labels to be 
searchable. So, changing behavior based on how many modules will still cause 
problems with performance because I'll always have to assume it could be 
either way and try both.

> While this may not be perfect,
> it avoids having to multiplex the different labels into a single field
> (which is a big win IMHO) with the only issue being that multi-LSM
> solutions will need an updated audit toolset to see the new labels
> (which seems like a reasonable requirement).

Why would not multiplexing different labels in the same field be a big win? It's
a big loss in my mind. Using the same field preserves backward compatibility
and is more compact in bytes; splitting the labels out creates performance
problems, changes all mapping of what things mean, etc. IOW, this makes
things much worse.

-Steve



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-16 21:46               ` Steve Grubb
@ 2019-07-16 22:18                 ` Casey Schaufler
  2019-07-16 23:13                   ` Paul Moore
  2019-07-16 23:09                 ` Paul Moore
  1 sibling, 1 reply; 39+ messages in thread
From: Casey Schaufler @ 2019-07-16 22:18 UTC (permalink / raw)
  To: Steve Grubb, Paul Moore
  Cc: Richard Guy Briggs, linux-audit, Linux Security Module list, casey

On 7/16/2019 2:46 PM, Steve Grubb wrote:
> On Tuesday, July 16, 2019 5:25:21 PM EDT Paul Moore wrote:
>> On Tue, Jul 16, 2019 at 2:41 PM Casey Schaufler <casey@schaufler-ca.com>
>> wrote:
>>> On 7/16/2019 11:06 AM, Steve Grubb wrote:
>>>> On Tuesday, July 16, 2019 1:43:18 PM EDT Paul Moore wrote:
>>>>> On Tue, Jul 16, 2019 at 1:30 PM Casey Schaufler
>>>>> <casey@schaufler-ca.com> wrote:
>>>>>> On 7/16/2019 10:12 AM, Paul Moore wrote:
>>>>>>> On Mon, Jul 15, 2019 at 6:56 PM Steve Grubb <sgrubb@redhat.com>
>>>>>>> wrote:
>>>>>>>> On Monday, July 15, 2019 5:28:56 PM EDT Paul Moore wrote:
>>>>>>>>> On Mon, Jul 15, 2019 at 3:37 PM Casey Schaufler
>>>>>>>>> <casey@schaufler-ca.com> wrote:
>>>>>>>>>> On 7/15/2019 12:04 PM, Richard Guy Briggs wrote:
>>>>>>>>>>> On 2019-07-13 11:08, Steve Grubb wrote:
>>>>>>> ...
>>>>>>>
>>>>>>>>>>> Steve's answer is the obvious one, ideally allocating a separate
>>>>>>>>>>> range
>>>>>>>>>>> to each LSM with each message type having its own well defined
>>>>>>>>>>> format.
>>>>>>>>>> It doesn't address the issue of success records, or records
>>>>>>>>>> generated outside the security modules.
>>>>>>>>> Yes, exactly.  The individual LSM will presumably continue to
>>>>>>>>> generate their own audit records as they do today and I would
>>>>>>>>> imagine
>>>>>>>>> that the subject and object fields could remain as they do today
>>>>>>>>> for
>>>>>>>>> the LSM specific records.
>>>>>>>>>
>>>>>>>>> The trick is the other records which are not LSM specific but
>>>>>>>>> still
>>>>>>>>> want to include subject and/or object information.  Unfortunately
>>>>>>>>> we
>>>>>>>>> are stuck with some tough limitations given the current audit
>>>>>>>>> record
>>>>>>>>> format and Steve's audit userspace tools;
>>>>>>>> Not really. We just need to approach the problem thinking about how
>>>>>>>> to
>>>>>>>> make it work based on how things currently work.
>>>>>>> I suppose it is all somewhat "subjective" - bad joke fully intended
>>>>>>> :)
>>>>>>> - with respect to what one considers good/bad/limiting.  My personal
>>>>>>> view is that an ideal solution would allow for multiple independent
>>>>>>> subj/obj labels without having to multiplex on a single subj/obj
>>>>>>> field.  My gut feeling is that this would confuse your tools, yes?
>>>>>>>
>>>>>>>> For example Casey had a list of possible formats. Like this one:
>>>>>>>>
>>>>>>>> Option 3:
>>>>>>>>         lsms=selinux,apparmor subj=x:y:z:s:c subj=a
>>>>>>>>
>>>>>>>> I'd suggest something almost like that. The first field could be a
>>>>>>>> map
>>>>>>>> to
>>>>>>>> decipher the labels. Then we could have a comma separated list of
>>>>>>>> labels.
>>>>>>>>
>>>>>>>> lsms=selinux,apparmor subj=x:y:z:s:c,a
>>>>>>> Some quick comments:
>>>>>>>
>>>>>>> * My usual reminder that new fields for existing audit records must
>>>>>>> be
>>>>>>> added to the end of the record.
>>>>>>>
>>>>>>> * If we are going to multiplex the labels on a single field (more on
>>>>>>> that below) I might suggest using "subj_lsms" instead of "lsms" so
>>>>>>> we
>>>>>>> leave ourself some wiggle room in the future.
>>>>>>>
>>>>>>> * Multiplexing on a single "subj" field is going to be difficult
>>>>>>> because picking the label delimiter is going to be a pain.  For
>>>>>>> example, in the example above a comma is used, which at the very
>>>>>>> least
>>>>>>> is a valid part of a SELinux label and I suspect for Smack as well
>>>>>>> (I'm not sure about the other LSMs).  I suspect the only way to
>>>>>>> parse
>>>>>>> out the component labels would be to have knowledge of the LSMs in
>>>>>>> use, as well as the policies loaded at the time the audit record was
>>>>>>> generated.
>>>>>>>
>>>>>>> This may be a faulty assumption, but assuming your tools will fall
>>>>>>> over if they see multiple "subj" fields, could we do something like
>>>>>>> the following (something between option #2 and #3):
>>>>>>>
>>>>>>>   subj1_lsm=smack subj1=<smack_label> subj2_lsm=selinux subj2=<selinux_label> ...
>>>>>> If it's not a subj= field why use the indirection?
>>>>>>
>>>>>>         subj_smack=<smack_label> subj_selinux=<selinux_label>
>>>>>>
>>>>>> would be easier.
>>>>> Good point, that looks reasonable to me.
>>>> But doing something like this will totally break all parsers. To be
>>>> honest, I don't know if I'll ever see more than one labeled security
>>>> system running at the same time. And this would be a big penalty to
>>>> pay for the flexibility that someone, somewhere just might possibly do
>>>> this.
>>> While I have never seen multiple-LSM plans from RedHat/IBM I
>>> have seen them from Ubuntu. This isn't hypothetical. I know that
>>> it's a hard problem, which is why we need to get it as right as
>>> possible.
>> Agreed.  While I'm not going to be on a specific Linux release, I do
>> believe that at some point in the future the LSM stacking work is
>> going to land in Linus' tree.  Perhaps you'll never see it Steve, but
>> we need to prepare the code to handle it when it happens.
> And I agree with that. I'm saying that if we push it all in subj= then it is 
> not a big penalty. It saves major breakage. Every single event is required to 
> have a subj= field if it's a MAC system. By changing it to lsm_subj= it changes
> the layout of every single event. And it makes more to parse. And searching
> the labels is worse because it has to iterate over a list of *_subj to match 
> it. This will hurt performance because it is for every single event.
>
>> For my own sanity, here is a quick summary of the constraints as I
>> currently see them, please feel free to add/disagree:
>>
>> * We can't have multiple "subj" fields in a single audit record.
>> * The different LSMs all have different label formats and allowed
>> characters.  Further, a given label format may not be unique for a
>> given LSM; for example, Smack could be configured with a subset of
>> SELinux labels.
>> * Steve's audit tools appear to require a "subj" and "obj" fields for
>> LSM information or else they break into tiny little pieces.
> It changes all knowledge of where to look for things. And considering
> that events could be aggregated from systems of different ages/
> distributions, audit userspace will always have to be backwards compatible.
>  
>> What if we preserved the existing subj/obj fields in the case where
>> there is only one "major" LSM (SELinux, Smack, AppArmor, etc.):
>>
>>   subj=<lsm_label>
>>
>> ... and in the case of multiple major LSMs we set the subj value to
>> "?" and introduce new subj_X fields (as necessary) as discussed above:
>>
>>   subj=? subj_smack=<smack_label> subj_selinux=<selinux_label> ...
>>
>> ... I believe that Steve's old/existing userspace tools would simply
>> report "?"/unknown LSM credentials where new multi-LSM tools could
>> report the multiple different labels. 
> Common Criteria as well as other standards require subject labels to be 
> searchable. So, changing behavior based on how many modules will still cause 
> problems with performance because I'll always have to assume it could be 
> either way and try both.
>
>> While this may not be perfect,
>> it avoids having to multiplex the different labels into a single field
>> (which is a big win IMHO) with the only issue being that multi-LSM
>> solutions will need an updated audit toolset to see the new labels
>> (which seems like a reasonable requirement).
> Why would not multiplexing different labels in the same field be a big win? It's
> a big loss in my mind. Using the same field preserves backward compatibility
> and is more compact in bytes; splitting the labels out creates performance
> problems, changes all mapping of what things mean, etc. IOW, this makes
> things much worse.

It sounds as if some variant of the Hideous format:

	subj=selinux='a:b:c:d',apparmor='z'
	subj=selinux/a:b:c:d/apparmor/z
	subj=(selinux)a:b:c:d/(apparmor)z

would meet Steve's searchability requirements, but with significant
parsing performance penalties. 
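
To make the parsing cost concrete, here is a minimal sketch in C of
pulling one LSM's label out of the first variant above.  The
mux_get_label() name is made up for this example, and the code assumes
(without any guarantee from the LSMs) that a label never contains a
single quote; if that assumption fails, the parser below is wrong.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the label for lsm out of a value shaped like
 *   selinux='a:b:c:d',apparmor='z'
 * ASSUMPTION: labels never contain a single quote.
 * Returns 0 on success, -1 if the LSM is absent, the value is
 * malformed, or out is too small.
 */
static int mux_get_label(const char *val, const char *lsm,
			 char *out, size_t outlen)
{
	size_t lsmlen = strlen(lsm);
	const char *p = val;

	while (*p) {
		const char *eq = strchr(p, '=');
		const char *end;

		if (!eq || eq[1] != '\'')
			return -1;
		/* The label runs to the next quote, commas included. */
		end = strchr(eq + 2, '\'');
		if (!end)
			return -1;
		if ((size_t)(eq - p) == lsmlen && !strncmp(p, lsm, lsmlen)) {
			size_t n = (size_t)(end - (eq + 2));

			if (n >= outlen)
				return -1;
			memcpy(out, eq + 2, n);
			out[n] = '\0';
			return 0;
		}
		/* Step over the closing quote and the separating comma. */
		p = end + 1;
		if (*p == ',')
			p++;
	}
	return -1;
}
```

Every lookup has to walk all of the entries that precede the one it
wants, and the quoting is the only thing keeping a comma inside a label
from being read as a separator.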

>
> -Steve
>
>


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-16 21:46               ` Steve Grubb
  2019-07-16 22:18                 ` Casey Schaufler
@ 2019-07-16 23:09                 ` Paul Moore
  2019-07-17  4:36                   ` James Morris
  1 sibling, 1 reply; 39+ messages in thread
From: Paul Moore @ 2019-07-16 23:09 UTC (permalink / raw)
  To: Steve Grubb
  Cc: Casey Schaufler, Richard Guy Briggs, linux-audit,
	Linux Security Module list

On Tue, Jul 16, 2019 at 5:46 PM Steve Grubb <sgrubb@redhat.com> wrote:
> On Tuesday, July 16, 2019 5:25:21 PM EDT Paul Moore wrote:

...

> > Agreed.  While I'm not going to be on a specific Linux release, I do
> > believe that at some point in the future the LSM stacking work is
> > going to land in Linus' tree.  Perhaps you'll never see it Steve, but
> > we need to prepare the code to handle it when it happens.
>
> And I agree with that. I'm saying that if we push it all in subj= then it is
> not a big penalty.

I'm going to disagree on that quite severely.  As I mentioned
previously there isn't an easy or sane way to delimit between the
different LSM labels which means sorting out the multiplexed "subj"
field is going to be a post processing nightmare.

> It saves major breakage. Every single event is required to
> have a subj= field if it's a MAC system.

All of the options we've discussed still record the LSM credentials in
the audit record; no one is talking about *not* recording the LSM
credentials.  What we are discussing is *how* they are recorded.

> By changing it to lsm_subj= it changes
> the layout of every single event. And it makes more to parse. And searching
> the labels is worse because it has to iterate over a list of *_subj to match
> it. This will hurt performance because it is for every single event.

I can almost guarantee that looking for subj/subj_X is going to be
much easier than safely parsing a multiplexed subj field.  Not to
mention the multiplexed approach is just awful to read compared to
some of the other suggestions.

> > For my own sanity, here is a quick summary of the constraints as I
> > currently see them, please feel free to add/disagree:
> >
> > * We can't have multiple "subj" fields in a single audit record.
> > * The different LSMs all have different label formats and allowed
> > characters.  Further, a given label format may not be unique for a
> > given LSM; for example, Smack could be configured with a subset of
> > SELinux labels.
> > * Steve's audit tools appear to require a "subj" and "obj" fields for
> > LSM information or else they break into tiny little pieces.
>
> It changes all knowledge of where to look for things. And considering
> that events could be aggregated from systems of different ages/
> distributions, audit userspace will always have to be backwards compatible.

The subj_X approach is still backwards compatible, the difference is
that old versions of the tools get a "?" for the LSM creds which is a
rather sane way of indicating something is different.  The multiplexed
approach would result in effectively garbage for the LSM creds unless
the higher layers of audit tooling are updated to understand the new
multiplexed format *and* *continuously* *updated* as new LSMs are
stacked because you need to understand each LSMs label format if you
are going to safely parse the multiplexed format.  With the subj_X
approach the higher layer tooling simply needs to look for subj_X
fields when it sees "subj=?", and then it can safely extract/parse
each LSM's label without needing to understand or inspect the labels
themselves.
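
A minimal sketch in C of that lookup, for illustration only.  The
get_field() and get_subj() helpers are hypothetical and are not part of
the existing audit userspace; the sketch also assumes fields are
separated by single spaces and that values contain no spaces.

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/*
 * Find the space-separated field "name=value" in rec and copy the
 * value into out.  Returns 0 on success, -1 if absent or too long.
 */
static int get_field(const char *rec, const char *name,
		     char *out, size_t outlen)
{
	size_t nlen = strlen(name);
	const char *p = rec;

	while (*p) {
		size_t flen = strcspn(p, " ");

		if (flen > nlen && !strncmp(p, name, nlen) && p[nlen] == '=') {
			size_t vlen = flen - nlen - 1;

			if (vlen >= outlen)
				return -1;
			memcpy(out, p + nlen + 1, vlen);
			out[vlen] = '\0';
			return 0;
		}
		p += flen;
		p += strspn(p, " ");
	}
	return -1;
}

/*
 * Subject label for one LSM: take subj= as-is on a single-LSM record,
 * fall back to subj_<lsm>= when subj holds the "?" sentinel.
 */
static int get_subj(const char *rec, const char *lsm,
		    char *out, size_t outlen)
{
	char key[64];
	int n;

	if (get_field(rec, "subj", out, outlen))
		return -1;
	if (strcmp(out, "?"))
		return 0;	/* single-LSM record: subj= holds the label */

	n = snprintf(key, sizeof(key), "subj_%s", lsm);
	if (n < 0 || (size_t)n >= sizeof(key))
		return -1;
	return get_field(rec, key, out, outlen);
}
```

Note that nothing here inspects the label itself; the field name alone
says which LSM the value belongs to.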

> > What if we preserved the existing subj/obj fields in the case where
> > there is only one "major" LSM (SELinux, Smack, AppArmor, etc.):
> >
> >   subj=<lsm_label>
> >
> > ... and in the case of multiple major LSMs we set the subj value to
> > "?" and introduce new subj_X fields (as necessary) as discussed above:
> >
> >   subj=? subj_smack=<smack_label> subj_selinux=<selinux_label> ...
> >
> > ... I believe that Steve's old/existing userspace tools would simply
> > report "?"/unknown LSM credentials where new multi-LSM tools could
> > report the multiple different labels.
>
> Common Criteria as well as other standards require subject labels to be
> searchable. So, changing behavior based on how many modules will still cause
> problems with performance because I'll always have to assume it could be
> either way and try both.

Once again, I believe that the subj_X approach is going to be faster
than safely parsing the multiplexed format.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-16 22:18                 ` Casey Schaufler
@ 2019-07-16 23:13                   ` Paul Moore
  2019-07-16 23:47                     ` Casey Schaufler
  0 siblings, 1 reply; 39+ messages in thread
From: Paul Moore @ 2019-07-16 23:13 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Steve Grubb, Richard Guy Briggs, linux-audit, Linux Security Module list

On Tue, Jul 16, 2019 at 6:18 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> It sounds as if some variant of the Hideous format:
>
>         subj=selinux='a:b:c:d',apparmor='z'
>         subj=selinux/a:b:c:d/apparmor/z
>         subj=(selinux)a:b:c:d/(apparmor)z
>
> would meet Steve's searchability requirements, but with significant
> parsing performance penalties.

I think "hideous format" sums it up nicely.  Whatever we choose here
we are likely going to be stuck with for some time and I'm nearly
100% certain that multiplexing the labels onto a single field is going to be a
disaster.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-16 23:13                   ` Paul Moore
@ 2019-07-16 23:47                     ` Casey Schaufler
  2019-07-17 12:14                       ` Paul Moore
  0 siblings, 1 reply; 39+ messages in thread
From: Casey Schaufler @ 2019-07-16 23:47 UTC (permalink / raw)
  To: Paul Moore
  Cc: Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list, casey

On 7/16/2019 4:13 PM, Paul Moore wrote:
> On Tue, Jul 16, 2019 at 6:18 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>> It sounds as if some variant of the Hideous format:
>>
>>         subj=selinux='a:b:c:d',apparmor='z'
>>         subj=selinux/a:b:c:d/apparmor/z
>>         subj=(selinux)a:b:c:d/(apparmor)z
>>
>> would meet Steve's searchability requirements, but with significant
>> parsing performance penalties.
> I think "hideous format" sums it up nicely.  Whatever we choose here
> we are likely going to be stuck with for some time and I'm near to
> 100% that multiplexing the labels onto a single field is going to be a
> disaster.

If the requirement is that subj= be searchable I don't see much of
an alternative to a Hideous format. If we can get past that, and say
that all subj_* have to be searchable we can avoid that set of issues.
Instead of:

	s = strstr(source, "subj=");
	if (s)
		search_after_subj(s, ...);

we have

	s = source;
	for (i = 0; i < lsm_slots; i++) {
		s = strstr(s, "subj_");
		if (!s)
			break;
		s = search_after_subj_(s, lsm_slot_name[i], ...);
	}

There's enough ugly to go around either way.
And I'm not partial to either approach, but I would very
much like to get the code done so I can get on to the next
set of amazing challenges.
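
For what it's worth, a runnable take on the second loop might look like
the sketch below.  It is only an illustration: subj_matches() is a
made-up name, it checks for an exact label match rather than whatever
search_after_subj_() would really do, and it assumes space-separated
fields whose values contain no spaces.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if any subj= or subj_<lsm>= field in rec has exactly the
 * value want, 0 otherwise.
 */
static int subj_matches(const char *rec, const char *want)
{
	size_t wlen = strlen(want);
	const char *p = rec;

	while (*p) {
		size_t flen = strcspn(p, " ");
		const char *eq = memchr(p, '=', flen);

		/* Accept "subj=" itself or any "subj_<something>=". */
		if (eq && !strncmp(p, "subj", 4) &&
		    (eq == p + 4 || p[4] == '_')) {
			size_t vlen = flen - (size_t)(eq + 1 - p);

			if (vlen == wlen && !memcmp(eq + 1, want, wlen))
				return 1;
		}
		p += flen;
		p += strspn(p, " ");
	}
	return 0;
}
```

The scan is a single pass over the record and never needs the list of
LSM names, though a search limited to one LSM would compare the field
name as well.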

Oh, and I don't want to pick on subj= as obj= has the exact same issues.



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-16 23:09                 ` Paul Moore
@ 2019-07-17  4:36                   ` James Morris
  2019-07-17 12:23                     ` Paul Moore
  0 siblings, 1 reply; 39+ messages in thread
From: James Morris @ 2019-07-17  4:36 UTC (permalink / raw)
  To: Paul Moore
  Cc: Steve Grubb, Casey Schaufler, Richard Guy Briggs, linux-audit,
	Linux Security Module list

On Tue, 16 Jul 2019, Paul Moore wrote:

> The subj_X approach is still backwards compatible, the difference is
> that old versions of the tools get a "?" for the LSM creds which is a
> rather sane way of indicating something is different.

This will still break existing userspace, right?  We can't do that.

> Once again, I believe that the subj_X approach is going to be faster
> than safely parsing the multiplexed format.

What about emitting one audit record for each LSM?

-- 
James Morris
<jmorris@namei.org>


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-16 23:47                     ` Casey Schaufler
@ 2019-07-17 12:14                       ` Paul Moore
  2019-07-17 15:49                         ` Casey Schaufler
  0 siblings, 1 reply; 39+ messages in thread
From: Paul Moore @ 2019-07-17 12:14 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Steve Grubb, Richard Guy Briggs, linux-audit, Linux Security Module list

On Tue, Jul 16, 2019 at 7:47 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 7/16/2019 4:13 PM, Paul Moore wrote:
> > On Tue, Jul 16, 2019 at 6:18 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >> It sounds as if some variant of the Hideous format:
> >>
> >>         subj=selinux='a:b:c:d',apparmor='z'
> >>         subj=selinux/a:b:c:d/apparmor/z
> >>         subj=(selinux)a:b:c:d/(apparmor)z
> >>
> >> would meet Steve's searchability requirements, but with significant
> >> parsing performance penalties.
> > I think "hideous format" sums it up nicely.  Whatever we choose here
> > we are likely going to be stuck with for some time and I'm near to
> > 100% that multiplexing the labels onto a single field is going to be a
> > disaster.
>
> If the requirement is that subj= be searchable I don't see much of
> an alternative to a Hideous format. If we can get past that, and say
> that all subj_* have to be searchable we can avoid that set of issues.
> Instead of:
>
>         s = strstr(source, "subj=");
>         if (s)
>                 search_after_subj(s, ...);

This example does a lot of hand waving in search_after_subj(...)
regarding parsing the multiplexed LSM label.  Unless we restrict the
LSM label formats (which seems both wrong, and too late IMHO) we have
a parsing nightmare; can you write a safe multiplexed LSM label parser
without knowledge of each LSM label format?  Can you do that for each
LSM without knowing their loaded policy?  What happens when the policy
and/or label format changes?  What happens in a few years when another
LSM is added to the kernel?

> we have
>
>         s = source;
>         for (i = 0; i < lsm_slots; i++) {
>                 s = strstr(s, "subj_");
>                 if (!s)
>                         break;
>                 s = search_after_subj_(s, lsm_slot_name[i], ...);

The hand waving here in search_after_subj_(...) is much less;
essentially you just match "subj_X" and then you can take the field
value as the LSM's label without having to know the format, the policy
loaded, etc.  It is both safer and doesn't require knowledge of the
LSMs (the LSM "name" can be specified as a parameter to the search
tool).

> There's enough ugly to go around either way.
> And I'm not partial to either approach, but I would very
> much like to get the code done so I can get on to the next
> set of amazing challenges.
>
> Oh, and I don't want to pick on subj= as obj= has the exact same issues.

Yes, I stopped talking about both subj and obj some time ago in this
thread because I figure we can use the same approach for both.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-17  4:36                   ` James Morris
@ 2019-07-17 12:23                     ` Paul Moore
  0 siblings, 0 replies; 39+ messages in thread
From: Paul Moore @ 2019-07-17 12:23 UTC (permalink / raw)
  To: James Morris
  Cc: Steve Grubb, Casey Schaufler, Richard Guy Briggs, linux-audit,
	Linux Security Module list

On Wed, Jul 17, 2019 at 12:36 AM James Morris <jmorris@namei.org> wrote:
> On Tue, 16 Jul 2019, Paul Moore wrote:
>
> > The subj_X approach is still backwards compatible, the difference is
> > that old versions of the tools get a "?" for the LSM creds which is a
> > rather sane way of indicating something is different.
>
> This will still break existing userspace, right?  We can't do that.

Trust me, I don't want to break userspace, I wouldn't be suggesting that.

The subj_X approach would cause userspace to see a "?" for the LSM
creds when looking at logs from a stacked-LSM system.  I would argue
this is actually safer than the multiplexed approach as "?" is a safe
sentinel used by the audit subsystem when the value can't be
determined; the multiplexed label in the hands of legacy userspace
tools would be confusing at best, and misleading at worst.

> > Once again, I believe that the subj_X approach is going to be faster
> > than safely parsing the multiplexed format.
>
> What about emitting one audit record for each LSM?

In many of the LSM generated audit events that is what would happen,
and should just work.  What we've been discussing is all the cases
where the audit event is generated outside the context of the LSM but
the LSM credentials are still desirable bits of information.  While we
are definitely going in the direction of making multiple record events
more common, duplicating the same record, with only changes to the LSM
creds, may end up confusing Steve's tools.  It would also end up
bloating the audit log, which I know is something everyone wants to
avoid.

-- 
paul moore
www.paul-moore.com


* Re: Preferred subj= with multiple LSMs
  2019-07-17 12:14                       ` Paul Moore
@ 2019-07-17 15:49                         ` Casey Schaufler
  2019-07-17 16:23                           ` Paul Moore
  0 siblings, 1 reply; 39+ messages in thread
From: Casey Schaufler @ 2019-07-17 15:49 UTC (permalink / raw)
  To: Paul Moore
  Cc: Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list, casey

On 7/17/2019 5:14 AM, Paul Moore wrote:
> On Tue, Jul 16, 2019 at 7:47 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>> On 7/16/2019 4:13 PM, Paul Moore wrote:
>>> On Tue, Jul 16, 2019 at 6:18 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>>> It sounds as if some variant of the Hideous format:
>>>>
>>>>         subj=selinux='a:b:c:d',apparmor='z'
>>>>         subj=selinux/a:b:c:d/apparmor/z
>>>>         subj=(selinux)a:b:c:d/(apparmor)z
>>>>
>>>> would meet Steve's searchability requirements, but with significant
>>>> parsing performance penalties.
>>> I think "hideous format" sums it up nicely.  Whatever we choose here
>>> we are likely going to be stuck with for some time and I'm near to
>>> 100% that multiplexing the labels onto a single field is going to be a
>>> disaster.
>> If the requirement is that subj= be searchable I don't see much of
>> an alternative to a Hideous format. If we can get past that, and say
>> that all subj_* have to be searchable we can avoid that set of issues.
>> Instead of:
>>
>>         s = strstr(source, "subj=")
>>         search_after_subj(s, ...);
> This example does a lot of hand waving in search_after_subj(...)
> regarding parsing the multiplexed LSM label.  Unless we restrict the
> LSM label formats (which seems both wrong, and too late IMHO)

I don't think it's too late, and I think it would be healthy
to restrict LSM "contexts" to character sets that make command
line specification possible. Embedded newlines? Ewwww.

>  we have
> a parsing nightmare; can you write a safe multiplexed LSM label parser
> without knowledge of each LSM label format?  Can you do that for each
> LSM without knowing their loaded policy?  What happens when the policy
> and/or label format changes?  What happens in a few years when another
> LSM is added to the kernel?

I was intentionally hand-wavy because of those very issues.
Steve says that parsing is limited to "strstr()", so looking for
":s7:" in the subject should work just as well with a Hideous
format as it does today, with the exception of false positives
where LSMs have label string overlaps.

Where is the need to use a module specific label parser coming
from? Does the audit code parse SELinux contexts now? 

>> we have
>>
>>         s = source
>>         for (i = 0; i < lsm_slots ; i++) {
>>                 s = strstr(s, "subj_")
>>                 if (!s)
>>                         break;
>>                 s = search_after_subj_(s, lsm_slot_name[i], ...)
> The hand waving here in search_after_subj_(...) is much less;
> essentially you just match "subj_X" and then you can take the field
> value as the LSM's label without having to know the format, the policy
> loaded, etc.  It is both safer and doesn't require knowledge of the
> LSMs (the LSM "name" can be specified as a parameter to the search
> tool).

You can do that with the Hideous format as well. I wouldn't
say which would be easier without delving into the audit user
space.

>> There's enough ugly to go around either way.
>> And I'm not partial to either approach, but would very
>> much like to get the code done so I can get on to the next
>> set of amazing challenges.
>>
>> Oh, and I don't want to pick on subj= as obj= has the exact same issues.
> Yes, I stopped talking about both subj and obj some time ago in this
> thread because I figure we can use the same approach for both.
>



* Re: Preferred subj= with multiple LSMs
  2019-07-17 15:49                         ` Casey Schaufler
@ 2019-07-17 16:23                           ` Paul Moore
  2019-07-17 23:02                             ` Casey Schaufler
  0 siblings, 1 reply; 39+ messages in thread
From: Paul Moore @ 2019-07-17 16:23 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Steve Grubb, Richard Guy Briggs, linux-audit, Linux Security Module list

On Wed, Jul 17, 2019 at 11:49 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 7/17/2019 5:14 AM, Paul Moore wrote:
> > On Tue, Jul 16, 2019 at 7:47 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >> On 7/16/2019 4:13 PM, Paul Moore wrote:
> >>> On Tue, Jul 16, 2019 at 6:18 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >>>> It sounds as if some variant of the Hideous format:
> >>>>
> >>>>         subj=selinux='a:b:c:d',apparmor='z'
> >>>>         subj=selinux/a:b:c:d/apparmor/z
> >>>>         subj=(selinux)a:b:c:d/(apparmor)z
> >>>>
> >>>> would meet Steve's searchability requirements, but with significant
> >>>> parsing performance penalties.
> >>> I think "hideous format" sums it up nicely.  Whatever we choose here
> >>> we are likely going to be stuck with for some time and I'm near to
> >>> 100% that multiplexing the labels onto a single field is going to be a
> >>> disaster.
> >> If the requirement is that subj= be searchable I don't see much of
> >> an alternative to a Hideous format. If we can get past that, and say
> >> that all subj_* have to be searchable we can avoid that set of issues.
> >> Instead of:
> >>
> >>         s = strstr(source, "subj=")
> >>         search_after_subj(s, ...);
> > This example does a lot of hand waving in search_after_subj(...)
> > regarding parsing the multiplexed LSM label.  Unless we restrict the
> > LSM label formats (which seems both wrong, and too late IMHO)
>
> I don't think it's too late, and I think it would be healthy
> to restrict LSM "contexts" to character sets that make command
> line specification possible. Embedded newlines? Ewwww.

That would imply that the delimiter you would choose for the
multiplexed approach would be something odd (I think you suggested
0x02, or similar, earlier) which would likely require the multiplexed
subj field to become a hex encoded field which would be very
unfortunate in my opinion and would technically break with the current
subj/obj field format spec.  Picking a normal-ish delimiter, and
restricting its use by LSMs seems wrong to me.

It's also worth noting that if you were to move subj/obj to hex
encoded fields, in addition to causing a backwards compatibility
problem, you completely kill the ability to look at the raw log data
and make sense of the fields ... well, unless you can do the ascii hex
conversion in your head on the fly.

> >  we have
> > a parsing nightmare; can you write a safe multiplexed LSM label parser
> > without knowledge of each LSM label format?  Can you do that for each
> > LSM without knowing their loaded policy?  What happens when the policy
> > and/or label format changes?  What happens in a few years when another
> > LSM is added to the kernel?
>
> I was intentionally hand-wavy because of those very issues.

Then you should already realize why this is a terrible idea ;)

> Steve says that parsing is limited to "strstr()", so looking for
> ":s7:" in the subject should work just as well with a Hideous
> format as it does today, with the exception of false positives
> where LSMs have label string overlaps.

Today when you go to search through your audit log you know that a
single LSM is providing subj labels, and you also know which LSM that
happens to be, so searching on a given string, or substring, is easy
and generally safe.  In a multiplexed approach this becomes much more
difficult, and depending on the search being done it could be
misleading, perhaps even dangerous with complicated searches that
exclude label substrings.
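
To make the risk concrete, here is a small sketch comparing a substring
search over a multiplexed subj= value with the same search over a
per-LSM field. The helper field_contains() and the example labels are
hypothetical; the record layout (space-separated name=value fields, no
spaces in values) is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: report whether needle occurs inside the value of the
 * field named by key (for example "subj=" or "subj_selinux="). Assumes
 * space-separated fields whose values contain no spaces.
 */
bool field_contains(const char *record, const char *key, const char *needle)
{
    size_t klen = strlen(key);
    size_t nlen = strlen(needle);
    const char *p = record;

    while ((p = strstr(p, key)) != NULL) {
        /* Only accept a match at the start of a field. */
        if (p == record || p[-1] == ' ') {
            const char *val = p + klen;
            size_t vlen = strcspn(val, " \n");
            size_t i;

            /* Bounded scan so a match cannot run into the next field. */
            for (i = 0; i + nlen <= vlen; i++)
                if (memcmp(val + i, needle, nlen) == 0)
                    return true;
            return false;
        }
        p += klen;
    }
    return false;
}
```

With a multiplexed value such as
subj=selinux='u:r:t:s0',apparmor='prof:s7:x', a search for ":s7:" under
"subj=" matches even though the SELinux label carries no s7 level. The
same search under "subj_selinux=" in a split record does not match, so
an exclusion filter built on it would behave as intended.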

It's important to remember that Steve's strstr() comment only reflects
his set of userspace tools.  When you start talking about log
aggregation and analytics, it seems very likely that there are other
tools in use, likely with their own parsers that do much more
complicated searches than a simple strstr() call.

> Where is the need to use a module specific label parser coming
> from? Does the audit code parse SELinux contexts now?

If you can't pick a "safe" delimiter that isn't included in any of the
LSM label formats, how else do you know how to parse the multiplexed
mess?

> >> we have
> >>
> >>         s = source
> >>         for (i = 0; i < lsm_slots ; i++) {
> >>                 s = strstr(s, "subj_")
> >>                 if (!s)
> >>                         break;
> >>                 s = search_after_subj_(s, lsm_slot_name[i], ...)
> > The hand waving here in search_after_subj_(...) is much less;
> > essentially you just match "subj_X" and then you can take the field
> > value as the LSM's label without having to know the format, the policy
> > loaded, etc.  It is both safer and doesn't require knowledge of the
> > LSMs (the LSM "name" can be specified as a parameter to the search
> > tool).
>
> You can do that with the Hideous format as well. I wouldn't
> say which would be easier without delving into the audit user
> space.

No, you can't.  You still need to parse the multiplexed mess, that's
the problem.

-- 
paul moore
www.paul-moore.com


* Re: Preferred subj= with multiple LSMs
  2019-07-17 16:23                           ` Paul Moore
@ 2019-07-17 23:02                             ` Casey Schaufler
  2019-07-18 13:10                               ` Simon McVittie
  2019-07-19 21:21                               ` Preferred subj= with multiple LSMs Paul Moore
  0 siblings, 2 replies; 39+ messages in thread
From: Casey Schaufler @ 2019-07-17 23:02 UTC (permalink / raw)
  To: Paul Moore
  Cc: Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list, casey

On 7/17/2019 9:23 AM, Paul Moore wrote:
> On Wed, Jul 17, 2019 at 11:49 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>> On 7/17/2019 5:14 AM, Paul Moore wrote:
>>> On Tue, Jul 16, 2019 at 7:47 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>>> On 7/16/2019 4:13 PM, Paul Moore wrote:
>>>>> On Tue, Jul 16, 2019 at 6:18 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>>>>> It sounds as if some variant of the Hideous format:
>>>>>>
>>>>>>         subj=selinux='a:b:c:d',apparmor='z'
>>>>>>         subj=selinux/a:b:c:d/apparmor/z
>>>>>>         subj=(selinux)a:b:c:d/(apparmor)z
>>>>>>
>>>>>> would meet Steve's searchability requirements, but with significant
>>>>>> parsing performance penalties.
>>>>> I think "hideous format" sums it up nicely.  Whatever we choose here
>>>>> we are likely going to be stuck with for some time and I'm near to
>>>>> 100% that multiplexing the labels onto a single field is going to be a
>>>>> disaster.
>>>> If the requirement is that subj= be searchable I don't see much of
>>>> an alternative to a Hideous format. If we can get past that, and say
>>>> that all subj_* have to be searchable we can avoid that set of issues.
>>>> Instead of:
>>>>
>>>>         s = strstr(source, "subj=")
>>>>         search_after_subj(s, ...);
>>> This example does a lot of hand waving in search_after_subj(...)
>>> regarding parsing the multiplexed LSM label.  Unless we restrict the
>>> LSM label formats (which seems both wrong, and too late IMHO)
>> I don't think it's too late, and I think it would be healthy
>> to restrict LSM "contexts" to character sets that make command
>> line specification possible. Embedded newlines? Ewwww.
> That would imply that the delimiter you would choose for the
> multiplexed approach would be something odd (I think you suggested
> 0x02, or similar, earlier) which would likely require the multiplexed
> subj field to become a hex encoded field which would be very
> unfortunate in my opinion and would technically break with the current
> subj/obj field format spec.  Picking a normal-ish delimiter, and
> restricting its use by LSMs seems wrong to me.

Just say "no" to hex encoding! BTW, keys are not hex encoded.

We've never had to think about having general rules on
what security modules do before, because with only one
active each could do whatever it wanted without fear of
conflict. If there is already a character that none of
the existing modules use, how would it be wrong to
reserve it?

Smack disallows the four characters '"/\ because quoting
is too important to ignore and the likelihood that someone
would confuse labels with paths seemed great. I sniffed
around a little, but couldn't find the sets for SELinux or
AppArmor.

> It's also worth noting that if you were to move subj/obj to hex
> encoded fields, in addition to causing a backwards compatibility
> problem, you completely kill the ability to look at the raw log data
> and make sense of the fields ... well, unless you can do the ascii hex
> conversion in your head on the fly.

Agreed, even though there was a time when I could do
hex decoding in both ASCII and EBCDIC on the fly.

>>>  we have
>>> a parsing nightmare; can you write a safe multiplexed LSM label parser
>>> without knowledge of each LSM label format?  Can you do that for each
>>> LSM without knowing their loaded policy?  What happens when the policy
>>> and/or label format changes?  What happens in a few years when another
>>> LSM is added to the kernel?
>> I was intentionally hand-wavy because of those very issues.
> Then you should already realize why this is a terrible idea ;)

Unfortunately, I'm facing two options, one of which the
kernel maintainer thinks is a bad idea and the other the
user space maintainer thinks is a bad idea. Plus, I'm not
very happy with either, either.

>> Steve says that parsing is limited to "strstr()", so looking for
>> ":s7:" in the subject should work just as well with a Hideous
>> format as it does today, with the exception of false positives
>> where LSMs have label string overlaps.
> Today when you go to search through your audit log you know that a
> single LSM is providing subj labels, and you also know which LSM that
> happens to be, so searching on a given string, or substring, is easy
> and generally safe.  In a multiplexed approach this becomes much more
> difficult, and depending on the search being done it could be
> misleading, perhaps even dangerous with complicated searches that
> exclude label substrings.

I'm aware of this issue, which is one of the reasons I'm
asking about the preferred approach.

> It's important to remember that Steve's strstr() comment only reflects
> his set of userspace tools.  When you start talking about log
> aggregation and analytics, it seems very likely that there are other
> tools in use, likely with their own parsers that do much more
> complicated searches than a simple strstr() call.

Point. But long term, they'll have to be updated to accommodate
whatever we decide on. Which makes the "simple" case, where one
security module is in use all the more important.

>> Where is the need to use a module specific label parser coming
>> from? Does the audit code parse SELinux contexts now?
> If you can't pick a "safe" delimiter that isn't included in any of the
> LSM label formats, how else do you know how to parse the multiplexed
> mess?

Ah, but if we can ...

>>>> we have
>>>>
>>>>         s = source
>>>>         for (i = 0; i < lsm_slots ; i++) {
>>>>                 s = strstr(s, "subj_")
>>>>                 if (!s)
>>>>                         break;
>>>>                 s = search_after_subj_(s, lsm_slot_name[i], ...)
>>> The hand waving here in search_after_subj_(...) is much less;
>>> essentially you just match "subj_X" and then you can take the field
>>> value as the LSM's label without having to know the format, the policy
>>> loaded, etc.  It is both safer and doesn't require knowledge of the
>>> LSMs (the LSM "name" can be specified as a parameter to the search
>>> tool).
>> You can do that with the Hideous format as well. I wouldn't
>> say which would be easier without delving into the audit user
>> space.
> No, you can't.  You still need to parse the multiplexed mess, that's
> the problem.

You move the parsing problem to the record, where you have to
look for subj_selinux= instead of having the parsing problem in
the subj= field, where you look for something like selinux=
within the field. Neither looks like the work of an afternoon to
get right.

It probably looks like I'm arguing for the Hideous format option.
That would require less work and code disruption, so it is tempting
to push for it. But I would have to know the user space side a
whole lot better than I do to feel good about pushing anything that
isn't obviously a good choice. I kind of prefer Paul's "subj=?"
approach, but as it's harder, I don't want to spend too much time
on it if it gets me a big, juicy, well deserved NAK.




* Re: Preferred subj= with multiple LSMs
  2019-07-17 23:02                             ` Casey Schaufler
@ 2019-07-18 13:10                               ` Simon McVittie
  2019-07-18 16:13                                 ` Casey Schaufler
  2019-07-19 21:21                               ` Preferred subj= with multiple LSMs Paul Moore
  1 sibling, 1 reply; 39+ messages in thread
From: Simon McVittie @ 2019-07-18 13:10 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Paul Moore, Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list

On Wed, 17 Jul 2019 at 16:02:16 -0700, Casey Schaufler wrote:
> We've never had to think about having general rules on
> what security modules do before, because with only one
> active each could do whatever it wanted without fear of
> conflict. If there is already a character that none of
> the existing modules use, how would it be wrong to
> reserve it?
> 
> Smack disallows the four characters '"/\ because quoting
> is too important to ignore and the likelihood that someone
> would confuse labels with paths seemed great. I sniffed
> around a little, but couldn't find the sets for SELinux or
> AppArmor.

It seems we've been here before, when I added LinuxSecurityLabel to
https://dbus.freedesktop.org/doc/dbus-specification.html#bus-messages-get-connection-credentials
in D-Bus.

Recapping the context for those who might have missed it: in D-Bus,
processes communicate in a hub-and-spoke topology via a central message
bus process, which forwards messages between the other processes. Some
other IPC systems would call this a broker. As a result of this
indirection, the message bus is the only process in the overall system
that is in a position to ask the kernel for the identity of the other
processes (credentials(7) and related topics like LSM labels) using
unforgeable kernel-guaranteed socket options like SO_PEERCRED, SO_PEERSEC
and SO_PEERGROUPS. This means that if two processes communicate via D-Bus
and want to know each other's identities, they have to ask the message
bus; so the message bus needs a representation for that information. For
LSM labels, that representation is LinuxSecurityLabel, which is defined
in terms of SO_PEERSEC.

At the time that I defined LinuxSecurityLabel, nobody was willing to
say for sure that the label was guaranteed to be ASCII or UTF-8 (which
is part of the specification for the D-Bus STRING ('s') type), so I
had to encode it as an arbitrary ARRAY of BYTE ('ay') rather than as
a STRING. I was at least told that the label wouldn't contain embedded
'\0', and that if there is a trailing '\0', I can safely canonicalize
the string by removing it.
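
The canonicalization just described can be sketched as below, assuming
buf and len are whatever getsockopt() returned for SO_PEERSEC. The
function name label_canonical_len() is hypothetical.

```c
#include <stddef.h>

/*
 * Return the length of a label after dropping one trailing '\0', if present.
 * buf/len are assumed to be what getsockopt(SO_PEERSEC) returned; the buffer
 * itself is not modified, and no other byte is interpreted.
 */
size_t label_canonical_len(const char *buf, size_t len)
{
    if (len > 0 && buf[len - 1] == '\0')
        return len - 1;
    return len;
}
```

Working with an explicit length rather than strlen() keeps the label an
opaque byte array, which matches the 'ay' encoding chosen above.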

Also, at the time that I did that, nobody was willing to say for sure
that there was any particular correspondence between the security
label obtained by reading /proc/self/attr/current and the security
label obtained by getting the SO_PEERSEC socket option: in AppArmor,
/proc/self/attr/current is something like "unconfined\n" whereas
SO_PEERSEC is either "unconfined" or "unconfined\0" (I forget which),
but the consensus seemed to be that there is no guarantee that the
presence or absence of a trailing newline wouldn't be significant to
some non-AppArmor LSM.

If LSM stacking is going to lead to syntactic restrictions being imposed
on security labels, please could someone add them to credentials(7)
or some other suitable documentation so user-space developers can know
where we stand, or tell me what the restrictions and guarantees are so
I can propose a documentation patch?

Thanks,
    smcv


* Re: Preferred subj= with multiple LSMs
  2019-07-16 21:25             ` Paul Moore
  2019-07-16 21:46               ` Steve Grubb
@ 2019-07-18 15:01               ` William Roberts
  2019-07-18 18:48                 ` Casey Schaufler
  1 sibling, 1 reply; 39+ messages in thread
From: William Roberts @ 2019-07-18 15:01 UTC (permalink / raw)
  To: Paul Moore
  Cc: Casey Schaufler, Richard Guy Briggs, Linux Security Module list,
	linux-audit

<snip>

> > >>>> the following (something between option #2 and #3):
> > >>>>   subj1_lsm=smack subj1=<smack_label> subj2_lsm=selinux
> > >>>>
> > >>>> subj2=<selinux_label> ...
> > >>> If it's not a subj= field why use the indirection?
> > >>>
> > >>>         subj_smack=<smack_label> subj_selinux=<selinux_label>

FWIW +1 on this approach.

<snip>


* Re: Preferred subj= with multiple LSMs
  2019-07-18 13:10                               ` Simon McVittie
@ 2019-07-18 16:13                                 ` Casey Schaufler
  2019-07-19 12:15                                   ` Simon McVittie
  0 siblings, 1 reply; 39+ messages in thread
From: Casey Schaufler @ 2019-07-18 16:13 UTC (permalink / raw)
  To: Simon McVittie
  Cc: Paul Moore, Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list, casey

On 7/18/2019 6:10 AM, Simon McVittie wrote:
> On Wed, 17 Jul 2019 at 16:02:16 -0700, Casey Schaufler wrote:
>> We've never had to think about having general rules on
>> what security modules do before, because with only one
>> active each could do whatever it wanted without fear of
>> conflict. If there is already a character that none of
>> the existing modules use, how would it be wrong to
>> reserve it?
>>
>> Smack disallows the four characters '"/\ because quoting
>> is too important to ignore and the likelihood that someone
>> would confuse labels with paths seemed great. I sniffed
>> around a little, but couldn't find the sets for SELinux or
>> AppArmor.
> It seems we've been here before, when I added LinuxSecurityLabel to
> https://dbus.freedesktop.org/doc/dbus-specification.html#bus-messages-get-connection-credentials
> in D-Bus.
>
> Recapping the context for those who might have missed it: in D-Bus,
> processes communicate in a hub-and-spoke topology via a central message
> bus process, which forwards messages between the other processes. Some
> other IPC systems would call this a broker. As a result of this
> indirection, the message bus is the only process in the overall system
> that is in a position to ask the kernel for the identity of the other
> processes (credentials(7) and related topics like LSM labels) using
> unforgeable kernel-guaranteed socket options like SO_PEERCRED, SO_PEERSEC
> and SO_PEERGROUPS. This means that if two processes communicate via D-Bus
> and want to know each other's identities, they have to ask the message
> bus; so the message bus needs a representation for that information. For
> LSM labels, that representation is LinuxSecurityLabel, which is defined
> in terms of SO_PEERSEC.
>
> At the time that I defined LinuxSecurityLabel, nobody was willing to
> say for sure that the label was guaranteed to be ASCII or UTF-8 (which
> is part of the specification for the D-Bus STRING ('s') type), so I
> had to encode it as an arbitrary ARRAY of BYTE ('ay') rather than as
> a STRING. I was at least told that the label wouldn't contain embedded
> '\0', and that if there is a trailing '\0', I can safely canonicalize
> the string by removing it.
>
> Also, at the time that I did that, nobody was willing to say for sure
> that there was any particular correspondence between the security
> label obtained by reading /proc/self/attr/current and the security
> label obtained by getting the SO_PEERSEC socket option: in AppArmor,
> /proc/self/attr/current is something like "unconfined\n" whereas
> SO_PEERSEC is either "unconfined" or "unconfined\0" (I forget which),
> but the consensus seemed to be that there is no guarantee that the
> presence or absence of a trailing newline wouldn't be significant to
> some non-AppArmor LSM.
>
> If LSM stacking is going to lead to syntactic restrictions being imposed
> on security labels, please could someone add them to credentials(7)
> or some other suitable documentation so user-space developers can know
> where we stand, or tell me what the restrictions and guarantees are so
> I can propose a documentation patch?

Thank you for speaking up. It's good to hear from a concerned user-space
project. 

Have you been following the discussions on setting a "display" value
to specify which LSM data is presented by /proc/self/attr/current and
SO_PEERSEC? Briefly, a process can write the name of the LSM it wants
to see data from to /proc/self/attr/display, and the aforementioned
interfaces will use that LSM. If no value has been set the first LSM
registered that uses any of these interfaces gets the nod.

Does this make sense to you? We have discussed what's currently being
called the "hideous" format, selinux='a:b:c:d',apparmor='x', in the
past, and concluded that the compatibility issues would be too
great. It's a thorny problem, and your input would be most welcome.

>
> Thanks,
>     smcv



* Re: Preferred subj= with multiple LSMs
  2019-07-18 15:01               ` William Roberts
@ 2019-07-18 18:48                 ` Casey Schaufler
  0 siblings, 0 replies; 39+ messages in thread
From: Casey Schaufler @ 2019-07-18 18:48 UTC (permalink / raw)
  To: William Roberts, Paul Moore, Steve Grubb
  Cc: Richard Guy Briggs, Linux Security Module list, linux-audit,
	casey, SELinux

On 7/18/2019 8:01 AM, William Roberts wrote:
> <snip>
>
>>>>>>> the following (something between option #2 and #3):
>>>>>>>   subj1_lsm=smack subj1=<smack_label> subj2_lsm=selinux
>>>>>>>
>>>>>>> subj2=<selinux_label> ...
>>>>>> If it's not a subj= field why use the indirection?
>>>>>>
>>>>>>         subj_smack=<smack_label> subj_selinux=<selinux_label>
> FWIW +1 on this approach.

Stephen Smalley's original objection was that subj=<context> used
the context from the "display" LSM, and that unprivileged users could
change that. Paul Moore suggested using subj=? and supplying additional
subject data at the end of the record, using what has evolved into the
subj_<lsm>=<context> format. Steve Grubb points out that searching on
subject contexts gets much harder using this scheme.

If instead of using "subj=?" we provide the context used when "display"
is not specified (subj=a:b:c:d when the first registered "display" LSM
is SELinux) and add the subj_<lsm>=<context> entries, we have a reasonably
good chance of getting the right results.

User-space code that does not understand that there may be multiple
contexts will get a consistent set of values. They will either be all
right or all wrong. The irreverent side of me thinks this could be an
interesting fuzz test case.

It will be simple to change applications that only work with one LSM
to check if they can expect data to be from that LSM in the audit records
by reading /sys/kernel/security/lsm to get the stacking order. That
can be done in a wrapper script.
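
As a rough sketch of that check in C rather than a script: the function
below finds where a module sits in a comma-separated list such as the
contents of /sys/kernel/security/lsm. The name lsm_position() is
hypothetical, and deciding which of the listed modules actually supply
subject contexts is left to the caller.

```c
#include <string.h>

/*
 * Return the zero-based position of name in a comma-separated LSM list such
 * as the contents of /sys/kernel/security/lsm, or -1 if it is not listed.
 */
int lsm_position(const char *list, const char *name)
{
    size_t nlen = strlen(name);
    int idx = 0;

    while (*list != '\0') {
        size_t tlen = strcspn(list, ",\n");

        if (tlen == nlen && strncmp(list, name, nlen) == 0)
            return idx;
        list += tlen;
        if (*list == ',')
            list++;
        else
            break;  /* newline or end of string */
        idx++;
    }
    return -1;
}
```

An SELinux-only tool could, for example, compare the position of
"selinux" with the positions of "smack" and "apparmor" and trust the
subj= value only when "selinux" comes first among those present.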

A script could easily replace the subj= value from an LSM you don't want
with the subj_<lsm>= value that you do want:

sed -e 's/\(.*\) subj=[^ ]*\(.*\) subj_apparmor=\([^ ]*\)\(.*\)/\1 subj=\3\2\4/'

is close, provided subj= comes before subj_apparmor= in the record.

Applications that are truly stack aware can use the subj_<lsm>=<context> values. 

On a "well configured" system (e.g. out of box Fedora or Ubuntu)
everything continues to work properly.

If AppArmor is added to the Fedora system, in the module list after
SELinux, and any applications that are dealing with AppArmor
understand they aren't "display"ed, it will continue to work.

This also works for Ubuntu, where SELinux would be put after AppArmor,
and SELinux applications would have to know they're not "display"ed.

I'm ignoring applications like id(1) that make explicit checks for a
particular LSM rather than handling the general case, and systemd or dbus,
which extend kernel policy into user-space. The topic at hand is audit,
so let's restrict the discussion to that for the moment.




* Re: Preferred subj= with multiple LSMs
  2019-07-18 16:13                                 ` Casey Schaufler
@ 2019-07-19 12:15                                   ` Simon McVittie
  2019-07-19 16:29                                     ` Casey Schaufler
  0 siblings, 1 reply; 39+ messages in thread
From: Simon McVittie @ 2019-07-19 12:15 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Paul Moore, Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list

On Thu, 18 Jul 2019 at 09:13:52 -0700, Casey Schaufler wrote:
> We have discussed what's currently being
> called the "hideous" format, selinux='a:b:c:d',apparmor='x', in the
> past, and concluded that the compatibility issues would be too
> great.

I agree this might be too big a compat break for existing interfaces that
were designed with the assumption that there can only be one "big" LSM
at a time, like /proc/54321/attr/current and SO_PEERSEC. It would certainly
break the current libapparmor, and presumably libselinux as well.

However, I think it would be great to have multiple-"big"-LSM-aware
replacements for those interfaces, which present the various LSMs as
multiple parallel credentials.

I think it would also be valuable to take this opportunity to pin down
what can and can't be in a label, to an extent where people who want
to represent them in a similar encoding know what they can and can't
assume about their format. For example, when dbus-daemon reports an
unusual event (like rejecting a message due to policy rules or LSMs,
or hitting a resource limit that isn't normally meant to be reached),
the log entry contains miscellaneous information about the process for
debugging purposes, and it would be good if we could include all the LSM
labels in that string without ambiguity. This is essentially the same
problem that the audit subsystem has, but with fewer constraints, since
the audit subsystem has to meet externally-imposed security requirements
but our equivalent is just a nice-to-have for debugging.

> Have you been following the discussions on setting a "display" value
> to specify which LSM data is presented by /proc/self/attr/current and
> SO_PEERSEC? Briefly, a process can write the name of the LSM it wants
> to see data from to /proc/self/attr/display, and the aforementioned
> interfaces will use that LSM. If no value has been set the first LSM
> registered that uses any of these interfaces gets the nod.

I'm vaguely aware of the discussion, but LSMs aren't a big part of my
D-Bus maintainer role, so I'm afraid I can't keep up with all of it.

Do you mean that if process 11111 writes (for example) "apparmor" into
/proc/11111/attr/display, and then reads /proc/22222/attr/current
or queries the SO_PEERSEC of a socket opened by process 22222,
it will specifically see 22222's AppArmor label and not 22222's SELinux
label? Or are the contents of /proc/22222/attr/current controlled
by /proc/22222/attr/display?

How is this meant to work for generic LSM-aware user-space processes? If
(for example) ps -Z 22222 wants to get both the AppArmor label and the
SELinux label for process 22222, is it meant to write "apparmor" into
attr/display, then read /proc/22222/attr/current, then write "selinux"
to attr/display, then read /proc/22222/attr/current again? That sounds
risky if another thread might be manipulating attr/display concurrently.

The D-Bus message bus/broker (reference implementation: dbus-daemon)
is somewhat tricky because it is returning data on behalf of processes
other than itself, so it would be difficult for it to choose a good
value for "display": there's no reason why it wouldn't be responding
to requests from NetworkManager that expect to see SELinux labels, and
also requests from lxd that expect to see AppArmor labels. Obviously
it can't put both in LinuxSecurityLabel without the same compatibility
issues you're discussing.

Also note that dbus-daemon is trusted but mostly unprivileged - it
starts as root, then drops privileges to a system user normally called
messagebus, dbus or _dbus for its normal operation (although it does
retain CAP_AUDIT_WRITE) - so it can't carry out privileged operations
on other processes' /proc entries, if that's what the API requires.

I would strongly prefer it if we could get this information from
the kernel in a way that is Linux-specific but LSM-agnostic, without
having to link to libapparmor, libselinux, libsmack and everyone else's
favourite LSM library. At the moment we only need to link to libraries
for the LSMs where dbus-daemon can carry out mediation (asking the LSM
whether to accept or reject messages), and we don't need the libraries
if we are just passing through identity information.

I would also prefer it if we can get this information from SO_PEERSEC
(or some newer SO_PEERSEC replacement) without having to manipulate
ambient/implicit state like attr/display; but dbus-daemon is
single-threaded, so if we must do that, it wouldn't be *so* horrible.

Ideally I would like to be able to get all the LSM labels in O(1)
syscalls. Perhaps something with the same (buffer,length) kernel <->
user-space API as SO_PEERSEC and SO_PEERGROUPS, but instead of returning
a single \0-terminated string, it could return either the "hideous" format,
or a byte-blob that looks something like this?

    char buffer[ENOUGH_LENGTH] = { 0 };
    socklen_t len = sizeof (buffer);
    const char expected[] =
    "apparmor=unconfined\0"
    "selinux=system_u:system_r:init_t:s0\0"
    "\0"
    ;

    getsockopt (fd, SOL_SOCKET, SO_PEERSECLABELS, &buffer, &len);
    /* should return 0 */
    /* now buffer should have the same bytes as expected, ending with
     * "\0\0" */

(Obviously in real life you'd have a retry loop to get the length right,
like the SO_PEERSEC code in dbus does.)
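
As a minimal sketch of how a caller might consume such a blob, assuming
the "name=value\0" layout shown above with an empty entry as terminator
(SO_PEERSECLABELS itself is only a proposal, and walk_peer_labels and
label_cb are hypothetical names):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: receives one LSM name (not NUL-terminated,
 * lsm_len bytes long) and its NUL-terminated label. */
typedef void (*label_cb)(const char *lsm, size_t lsm_len,
                         const char *label, void *user_data);

/*
 * Walk a buffer of "name=value\0" entries terminated by an empty entry.
 * Returns the number of entries seen, or -1 if the buffer is malformed
 * (an entry without '=', an empty name, or no terminator within len bytes).
 */
int walk_peer_labels(const char *buf, size_t len, label_cb cb, void *user_data)
{
    size_t pos = 0;
    int count = 0;

    assert(buf != NULL);

    while (pos < len) {
        const char *entry = buf + pos;
        const char *end = memchr(entry, '\0', len - pos);
        const char *eq;

        if (end == NULL)
            return -1;              /* entry is not NUL-terminated */
        if (end == entry)
            return count;           /* empty entry: end of list */

        eq = memchr(entry, '=', (size_t)(end - entry));
        if (eq == NULL || eq == entry)
            return -1;              /* missing '=' or empty LSM name */

        if (cb != NULL)
            cb(entry, (size_t)(eq - entry), eq + 1, user_data);
        count++;
        pos += (size_t)(end - entry) + 1;
    }
    return -1;                      /* ran off the end without a terminator */
}
```

The length returned by getsockopt() would be passed as len, so the walk
never reads past what the kernel actually wrote.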

Because GetConnectionCredentials() is extensible, if there is some way
to enumerate all the security labels and get their values individually,
we could have (pseudocode)

GetConnectionCredentials(":1.1") -> {
  "UnixUserID": 0,
  "ProcessID": 1,
  "LinuxSecurityLabel.apparmor": "unconfined",
  "LinuxSecurityLabel.selinux": "system_u:system_r:init_t:s0",
  "LinuxSecurityLabel": "unconfined",    /* deprecated */
}

or (using D-Bus' structured type system)

GetConnectionCredentials(":1.1") -> {
  "UnixUserID": 0,
  "ProcessID": 1,
  "LinuxSecurityLabels": {
    "apparmor": "unconfined",
    "selinux": "system_u:system_r:init_t:s0",
  },
  "LinuxSecurityLabel": "unconfined",    /* deprecated */
}

with LinuxSecurityLabel showing the first LSM registered for backwards
compatibility? Or we could make LinuxSecurityLabel always be in the
"hideous" format if you chose to go that way in the kernel interfaces:
it's defined in terms of SO_PEERSEC, so whatever you do at the kernel
level, D-Bus should mimic that.

    smcv

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-19 12:15                                   ` Simon McVittie
@ 2019-07-19 16:29                                     ` Casey Schaufler
  2019-07-19 18:47                                       ` Simon McVittie
  0 siblings, 1 reply; 39+ messages in thread
From: Casey Schaufler @ 2019-07-19 16:29 UTC (permalink / raw)
  To: Simon McVittie
  Cc: Paul Moore, Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list, casey

On 7/19/2019 5:15 AM, Simon McVittie wrote:
> On Thu, 18 Jul 2019 at 09:13:52 -0700, Casey Schaufler wrote:
>> We have discussed what's currently being
>> called the "hideous" format, selinux='a:b:c:d',apparmor='x',
>> in the past, and concluded that the compatibility issues would be too
>> great.
> I agree this might be too big a compat break for existing interfaces that
> were designed with the assumption that there can only be one "big" LSM
> at a time, like /proc/54321/attr/current and SO_PEERSEC. It would certainly
> break the current libapparmor, and presumably libselinux as well.
>
> However, I think it would be great to have multiple-"big"-LSM-aware
> replacements for those interfaces, which present the various LSMs as
> multiple parallel credentials.

Defining what would go into liblsm* is a task that has fallen to
the chicken/egg paradox. We can't really define how the user-space
should work without knowing how the kernel will work, and we can't
solidify how the kernel will work until we know what user-space
can use.

---
* I absolutely refuse to allow this to be libsecurity!

> I think it would also be valuable to take this opportunity to pin down
> what can and can't be in a label, to an extent where people who want
> to represent them in a similar encoding know what they can and can't
> assume about their format. For example, when dbus-daemon reports an
> unusual event (like rejecting a message due to policy rules or LSMs,
> or hitting a resource limit that isn't normally meant to be reached),
> the log entry contains miscellaneous information about the process for
> debugging purposes, and it would be good if we could include all the LSM
> labels in that string without ambiguity. This is essentially the same
> problem that the audit subsystem has, but with fewer constraints, since
> the audit subsystem has to meet externally-imposed security requirements
> but our equivalent is just a nice-to-have for debugging.

Sounds like the Hideous format, or a variant thereof, would be
fine for you, especially if you never parse it.

>> Have you been following the discussions on setting a "display" value
>> to specify which LSM data is presented by /proc/self/attr/current and
>> SO_PEERSEC? Briefly, a process can write the name of the LSM it wants
>> to see data from to /proc/self/attr/display, and the aforementioned
>> interfaces will use that LSM. If no value has been set the first LSM
>> registered that uses any of these interfaces gets the nod.
> I'm vaguely aware of the discussion, but LSMs aren't a big part of my
> D-Bus maintainer role, so I'm afraid I can't keep up with all of it.
>
> Do you mean that if process 11111 writes (for example) "apparmor" into
> /proc/11111/attr/display, and then reads /proc/22222/attr/current
> or queries the SO_PEERSEC of a socket opened by process 22222,
> it will specifically see 22222's AppArmor label and not 22222's SELinux
> label? Or are the contents of /proc/22222/attr/current controlled
> by /proc/22222/attr/display?

Process 11111 would see the AppArmor label when reading
/proc/22222/attr/current. The display value is controlled
by process 11111 so that it can control what data it wants
to see.

> How is this meant to work for generic LSM-aware user-space processes? If
> (for example) ps -Z 22222 wants to get both the AppArmor label and the
> SELinux label for process 22222, is it meant to write "apparmor" into
> attr/display, then read /proc/22222/attr/current, then write "selinux"
> to attr/display, then read /proc/22222/attr/current again? That sounds
> risky if another thread might be manipulating attr/display concurrently.

The display is set at the task level, so should be thread safe.

> The D-Bus message bus/broker (reference implementation: dbus-daemon)
> is somewhat tricky because it is returning data on behalf of processes
> other than itself, so it would be difficult for it to choose a good
> value for "display": there's no reason why it wouldn't be responding
> to requests from NetworkManager that expect to see SELinux labels, and
> also requests from lxd that expect to see AppArmor labels. Obviously
> it can't put both in LinuxSecurityLabel without the same compatibility
> issues you're discussing.

Just so.

> Also note that dbus-daemon is trusted but mostly unprivileged - it
> starts as root, then drops privileges to a system user normally called
> messagebus, dbus or _dbus for its normal operation (although it does
> retain CAP_AUDIT_WRITE) - so it can't carry out privileged operations
> on other processes' /proc entries, if that's what the API requires.

Writing to display does not require privilege, as it affects only
the current process. The display is inherited on fork and reset on
a privileged exec.

> I would strongly prefer it if we could get this information from
> the kernel in a way that is Linux-specific but LSM-agnostic, without
> having to link to libapparmor, libselinux, libsmack and everyone else's
> favourite LSM library. At the moment we only need to link to libraries
> for the LSMs where dbus-daemon can carry out mediation (asking the LSM
> whether to accept or reject messages), and we don't need the libraries
> if we are just passing through identity information.

I can see that making dbus-daemon have to decide which label of many
to pass on to its clients would be bad.

> I would also prefer it if we can get this information from SO_PEERSEC
> (or some newer SO_PEERSEC replacement) without having to manipulate
> ambient/implicit state like attr/display; but dbus-daemon is
> single-threaded, so if we must do that, it wouldn't be *so* horrible.

An option that hasn't been discussed is a display option to provide
the Hideous format for applications that know that's what they want.
Write "hideous" into /proc/self/attr/display, and from then on you
get selinux='a:b:c:d',apparmor='z'. This could be used widely in liblsm
interfaces.
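
A minimal sketch of what a liblsm-style lookup over that string might
look like. It assumes that no label contains a single quote, which has
not been settled, and hideous_lookup is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up one LSM's label in a string of the form
 *     selinux='a:b:c:d',apparmor='z'
 * Assumption: labels never contain a single quote.
 * Returns 0 and copies the label into out on success, -1 if the LSM is
 * absent, the string is malformed, or out is too small.
 */
int hideous_lookup(const char *s, const char *lsm, char *out, size_t outlen)
{
    size_t want;

    assert(s != NULL && lsm != NULL && out != NULL);
    want = strlen(lsm);

    while (*s != '\0') {
        const char *eq = strchr(s, '=');
        const char *endq;

        if (eq == NULL || eq == s || eq[1] != '\'')
            return -1;              /* expected name='...' */
        endq = strchr(eq + 2, '\'');
        if (endq == NULL)
            return -1;              /* unterminated quote */

        if ((size_t)(eq - s) == want && memcmp(s, lsm, want) == 0) {
            size_t vlen = (size_t)(endq - (eq + 2));

            if (vlen + 1 > outlen)
                return -1;          /* caller's buffer is too small */
            memcpy(out, eq + 2, vlen);
            out[vlen] = '\0';
            return 0;
        }

        s = endq + 1;               /* move past the closing quote */
        if (*s == ',')
            s++;
        else if (*s != '\0')
            return -1;              /* junk between entries */
    }
    return -1;                      /* requested LSM not present */
}
```

If labels can contain quotes, the format would need an escaping rule
before a parser like this could be relied on.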

> Ideally I would like to be able to get all the LSM labels in O(1)
> syscalls. Perhaps something with the same (buffer,length) kernel <->
> user-space API as SO_PEERSEC and SO_PEERGROUPS, but instead of returning
> a single \0-terminated string, it could return either the "hideous" format,
> or a byte-blob that looks something like this?
>
>     char buffer[ENOUGH_LENGTH] = { 0 };
>     socklen_t len = sizeof (buffer);
>     const char expected[] =
>     "apparmor=unconfined\0"
>     "selinux=system_u:system_r:init_t:s0\0"
>     "\0"
>     ;
>
>     getsockopt (fd, SOL_SOCKET, SO_PEERSECLABELS, &buffer, &len);
>     /* should return 0 */
>     /* now buffer should have the same bytes as expected, ending with
>      * "\0\0" */
>
> (Obviously in real life you'd have a retry loop to get the length right,
> like the SO_PEERSEC code in dbus does.)

I would see creating a friendly interface like this as part of
my mythical liblsm, but I see your point.

> Because GetConnectionCredentials() is extensible, if there is some way
> to enumerate all the security labels and get their values individually,
> we could have (pseudocode)
>
> GetConnectionCredentials(":1.1") -> {
>   "UnixUserID": 0,
>   "ProcessID": 1,
>   "LinuxSecurityLabel.apparmor": "unconfined",
>   "LinuxSecurityLabel.selinux": "system_u:system_r:init_t:s0",
>   "LinuxSecurityLabel": "unconfined",    /* deprecated */
> }
>
> or (using D-Bus' structured type system)
>
> GetConnectionCredentials(":1.1") -> {
>   "UnixUserID": 0,
>   "ProcessID": 1,
>   "LinuxSecurityLabels": {
>     "apparmor": "unconfined",
>     "selinux": "system_u:system_r:init_t:s0",
>   },
>   "LinuxSecurityLabel": "unconfined",    /* deprecated */
> }
>
> with LinuxSecurityLabel showing the first LSM registered for backwards
> compatibility? Or we could make LinuxSecurityLabel always be in the
> "hideous" format if you chose to go that way in the kernel interfaces:
> it's defined in terms of SO_PEERSEC, so whatever you do at the kernel
> level, D-Bus should mimic that.

If providing the Hideous format makes library code easier or more
efficient I'm happy to make that happen. It can't be the default due to
backward compatibility, but it can be as easy as
"echo hideous > /proc/self/attr/display".

>
>     smcv


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-19 16:29                                     ` Casey Schaufler
@ 2019-07-19 18:47                                       ` Simon McVittie
  2019-07-19 20:02                                         ` Dbus and multiple LSMs (was Preferred subj= with multiple LSMs) Casey Schaufler
  0 siblings, 1 reply; 39+ messages in thread
From: Simon McVittie @ 2019-07-19 18:47 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Paul Moore, Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list

Thanks for considering user-space in this, and sorry if I'm hijacking
this thread a bit (but I think some of the things I'm raising might be
equally applicable for audit subjects).

On Fri, 19 Jul 2019 at 09:29:17 -0700, Casey Schaufler wrote:
> On 7/19/2019 5:15 AM, Simon McVittie wrote:
> > However, I think it would be great to have multiple-"big"-LSM-aware
> > replacements for those interfaces, which present the various LSMs as
> > multiple parallel credentials.
> 
> Defining what would go into liblsm* is a task that has fallen to
> the chicken/egg paradox. We can't really define how the user-space
> should work without knowing how the kernel will work, and we can't
> solidify how the kernel will work until we know what user-space
> can use.

I was hoping the syscall wrappers in glibc would be a viable user-space
interface to the small amount of LSM stuff that dbus needs to use in an
LSM-agnostic way. That's what we use in dbus at the moment (in practice
just getsockopt, but I'd also be reading /proc/self/attr/current if there
was a specification for how to normalize it to match SO_PEERSEC results)
and it's no harder than the rest of the syscall-level APIs.

A single LSM-agnostic shared library would be the next best thing from
my point of view.

> An option that hasn't been discussed is a display option to provide
> the Hideous format for applications that know that's what they want.
> Write "hideous" into /proc/self/attr/display, and from then on you
> get selinux='a:b:c:d',apparmor='z'. This could be used widely in liblsm
> interfaces.

If the way to parse/split it is documented, then this would be easier
for dbus-daemon than continually resetting attr/display. It would be
especially good if you can document a way to find out which one of the
many labels would have been seen by an older user-space process that never
wrote to attr/display ("it's the first one in the list" would be fine),
so that we can put that one in our backwards-compatible API to clients.

Or, alternatively, we could pass it on directly to our clients and let
*them* parse it (possibly by using liblsm), the same way AppArmor-aware
D-Bus clients have to know how to use either aa_splitcon() or their
own parsing to go from the raw SO_PEERSEC result
"/usr/bin/firefox (enforce)" to the pair ("/usr/bin/firefox", "enforce")
that they probably actually wanted.
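
To illustrate the kind of client-side parsing meant here, a minimal
sketch that splits "label (mode)" in place. split_label_mode is a
hypothetical helper written for this example, not libapparmor's own
parser, and it only handles the simple shape shown above:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split "label (mode)" in place. On return con holds the label, and *mode
 * points at the mode or is NULL when there is no " (mode)" suffix
 * (for example "unconfined"). One trailing newline is tolerated.
 */
void split_label_mode(char *con, char **mode)
{
    size_t len;
    char *paren;

    assert(con != NULL && mode != NULL);
    *mode = NULL;

    len = strlen(con);
    if (len > 0 && con[len - 1] == '\n')
        con[--len] = '\0';          /* drop one trailing newline */

    if (len < 4 || con[len - 1] != ')')
        return;                     /* no "(mode)" suffix */

    /* Search backwards for the " (" that opens the mode. */
    for (paren = con + len - 2; paren > con; paren--) {
        if (paren[0] == '(' && paren[-1] == ' ') {
            con[len - 1] = '\0';    /* drop the closing ')' */
            paren[-1] = '\0';       /* terminate the label */
            *mode = paren + 1;
            return;
        }
    }
}
```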

> > Do you mean that if process 11111 writes (for example) "apparmor" into
> > /proc/11111/attr/display, and then reads /proc/22222/attr/current
> > or queries the SO_PEERSEC of a socket opened by process 22222,
> > it will specifically see 22222's AppArmor label and not 22222's SELinux
> > label?
> 
> Process 11111 would see the AppArmor label when reading
> /proc/22222/attr/current. The display value is controlled
> by process 11111 so that it can control what data it wants
> to see.

OK, that's what I'd hoped.

> The display is set at the task level, so should be thread safe.

OK, good. However, thinking more about this, I have other concerns:

* In library code that can be used by a thread (task) that also uses other
  arbitrary libraries, or in an executable that uses libraries that might
  be interested in LSMs, the only safe way to deal with attr/display would
  be this sequence:

    - write desired value to /proc/self/attr/display
    - immediately read /proc/other/attr/current or query SO_PEERSEC

  and it would not be safe to rely on writing /proc/self/attr/display
  just once at startup, because some other library might have already
  changed it between startup and the actual read. Paradoxically, this
  maximizes the chance of breaking a reader that was relying on writing
  /proc/self/attr/display once during startup.

* If an async signal handler needs to know a LSM label for whatever
  reason, it will break anything in the same thread that was relying on
  that sequence, because it might have interrupted them between their
  write and their read:

    main execution path                  signal handler
    -------------------                  --------------

    write "apparmor" to attr/display
    (interrupted by async signal)
                                         write "selinux" to attr/display
                                         read attr/current or SO_PEERSEC
                                         do other stuff with SELinux label
                                         return
    (resumes)
    read attr/current or SO_PEERSEC
    expect an AppArmor label
    get a SELinux label
    sadness ensues

  Of course it's probably crazy for an async signal handler to do
  this... but people do lots of odd things in async signal handlers,
  and open(), read(), write(), getsockopt() are all async-signal-safe
  functions, so it's at least arguably valid.
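
For concreteness, the write-then-read sequence from the first point
could be sketched as below. This assumes the proposed attr/display
interface behaves as described in this thread; the paths are passed in
as parameters, and read_label_via_display is a hypothetical name. It
does nothing about the signal-handler case in the second point:

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write lsm to display_path, then immediately read current_path.
 * Returns the number of bytes read (buf is NUL-terminated), or -1 on error.
 * Intended use (assumed): display_path = "/proc/self/attr/display",
 * current_path = "/proc/<pid>/attr/current".
 */
ssize_t read_label_via_display(const char *display_path,
                               const char *current_path,
                               const char *lsm, char *buf, size_t buflen)
{
    size_t want;
    ssize_t n;
    int fd;

    assert(lsm != NULL && buf != NULL && buflen > 0);

    fd = open(display_path, O_WRONLY);
    if (fd < 0)
        return -1;
    want = strlen(lsm);
    n = write(fd, lsm, want);
    close(fd);
    if (n != (ssize_t)want)
        return -1;

    /* A signal handler that rewrites display could still run here. */
    fd = open(current_path, O_RDONLY);
    if (fd < 0)
        return -1;
    n = read(fd, buf, buflen - 1);
    close(fd);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return n;
}
```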

> Writing to display does not require privilege, as it affects only
> the current process. The display is inherited on fork and reset on
> a privileged exec.

Another concern here: are you sure it shouldn't be reset on *any*
exec? Lots of programs (including dbus-daemon) fork-and-exec arbitrary
child processes that come from a different codebase not under our
control and aren't necessarily LSM-stacking-aware. I don't really want
to have to reset /proc/self/attr/display in our increasingly crowded
after-fork-but-before-exec code path (which, according to POSIX, is not
a safe place to invoke any non-async-signal-safe function, so we can't
easily do error handling if something goes wrong there).

Is there any possibility of having a parallel kernel API that,
if it exists, always returns the whole stack, maybe something
like /proc/<pid>/attr/current_stack and the SO_PEERSECLABELS that I
suggested previously, instead of repurposing /proc/<pid>/attr/current
and SO_PEERSEC to have contents that vary according to ambient process
state in their reader? (Bonus points if they are documented/defined with
a particular syntactic normalization this time, unlike the situation
with /proc/<pid>/attr/current and SO_PEERSEC where in principle you
need LSM-specific knowledge to know whether a trailing "\n" or "\0"
is safe to discard.)

    smcv

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Dbus and multiple LSMs (was Preferred subj= with multiple LSMs)
  2019-07-19 18:47                                       ` Simon McVittie
@ 2019-07-19 20:02                                         ` Casey Schaufler
  2019-07-22 11:36                                           ` Simon McVittie
  0 siblings, 1 reply; 39+ messages in thread
From: Casey Schaufler @ 2019-07-19 20:02 UTC (permalink / raw)
  To: Simon McVittie
  Cc: Paul Moore, Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list, casey, SELinux

On 7/19/2019 11:47 AM, Simon McVittie wrote:
> Thanks for considering user-space in this, and sorry if I'm hijacking
> this thread a bit (but I think some of the things I'm raising might be
> equally applicable for audit subjects).

Thank you for asking these questions. I think that if we
can address the issues around dbus we'll be in pretty good
shape in general.

> On Fri, 19 Jul 2019 at 09:29:17 -0700, Casey Schaufler wrote:
>> On 7/19/2019 5:15 AM, Simon McVittie wrote:
>>> However, I think it would be great to have multiple-"big"-LSM-aware
>>> replacements for those interfaces, which present the various LSMs as
>>> multiple parallel credentials.
>> Defining what would go into liblsm* is a task that has fallen to
>> the chicken/egg paradox. We can't really define how the user-space
>> should work without knowing how the kernel will work, and we can't
>> solidify how the kernel will work until we know what user-space
>> can use.
> I was hoping the syscall wrappers in glibc would be a viable user-space
> interface to the small amount of LSM stuff that dbus needs to use in an
> LSM-agnostic way. That's what we use in dbus at the moment (in practice
> just getsockopt, but I'd also be reading /proc/self/attr/current if there
> was a specification for how to normalize it to match SO_PEERSEC results)
> and it's no harder than the rest of the syscall-level APIs.

I don't see how to do that and have the Fedora and Ubuntu user space
environments remain functional.

> A single LSM-agnostic shared library would be the next best thing from
> my point of view.

Good, that's how it looks to me as well.

>> An option that hasn't been discussed is a display option to provide
>> the Hideous format for applications that know that's what they want.
>> Write "hideous" into /proc/self/attr/display, and from then on you
>> get selinux='a:b:c:d',apparmor='z'. This could be used widely in liblsm
>> interfaces.
> If the way to parse/split it is documented, then this would be easier
> for dbus-daemon than continually resetting attr/display. It would be
> especially good if you can document a way to find out which one of the
> many labels would have been seen by an older user-space process that never
> wrote to attr/display ("it's the first one in the list" would be fine),
> so that we can put that one in our backwards-compatible API to clients.

/sys/kernel/security/lsm provides the list of all LSMs active on the system.
It would be trivial to add /sys/kernel/security/default-display-lsm which
would contain that.
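
As a rough sketch of the user-space side, checking whether a given
module appears in the comma-separated contents of
/sys/kernel/security/lsm could look like this. The caller is assumed to
have read the file into a string already, and lsm_list_contains is a
hypothetical name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if name appears as a whole entry in a comma-separated LSM list
 * such as "capability,yama,selinux", else 0. A trailing newline is
 * treated as a separator.
 */
int lsm_list_contains(const char *list, const char *name)
{
    size_t want;

    assert(list != NULL && name != NULL);
    want = strlen(name);

    while (*list != '\0') {
        size_t len = strcspn(list, ",\n");   /* length of this entry */

        if (len == want && memcmp(list, name, want) == 0)
            return 1;
        list += len;
        if (*list != '\0')
            list++;                          /* skip ',' or newline */
    }
    return 0;
}
```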

> Or, alternatively, we could pass it on directly to our clients and let
> *them* parse it (possibly by using liblsm), the same way AppArmor-aware
> D-Bus clients have to know how to use either aa_splitcon() or their
> own parsing to go from the raw SO_PEERSEC result
> "/usr/bin/firefox (enforce)" to the pair ("/usr/bin/firefox", "enforce")
> that they probably actually wanted.
>
>>> Do you mean that if process 11111 writes (for example) "apparmor" into
>>> /proc/11111/attr/display, and then reads /proc/22222/attr/current
>>> or queries the SO_PEERSEC of a socket opened by process 22222,
>>> it will specifically see 22222's AppArmor label and not 22222's SELinux
>>> label?
>> Process 11111 would see the AppArmor label when reading
>> /proc/22222/attr/current. The display value is controlled
>> by process 11111 so that it can control what data it wants
>> to see.
> OK, that's what I'd hoped.
>
>> The display is set at the task level, so should be thread safe.
> OK, good. However, thinking more about this, I have other concerns:
>
> * In library code that can be used by a thread (task) that also uses other
>   arbitrary libraries, or in an executable that uses libraries that might
>   be interested in LSMs, the only safe way to deal with attr/display would
>   be this sequence:
>
>     - write desired value to /proc/self/attr/display
>     - immediately read /proc/other/attr/current or query SO_PEERSEC
>
>   and it would not be safe to rely on writing /proc/self/attr/display
>   just once at startup, because some other library might have already
>   changed it between startup and the actual read. Paradoxically, this
>   maximizes the chance of breaking a reader that was relying on writing
>   /proc/self/attr/display once during startup.
>
> * If an async signal handler needs to know a LSM label for whatever
>   reason, it will break anything in the same thread that was relying on
>   that sequence, because it might have interrupted them between their
>   write and their read:
>
>     main execution path                  signal handler
>     -------------------                  --------------
>
>     write "apparmor" to attr/display
>     (interrupted by async signal)
>                                          write "selinux" to attr/display
>                                          read attr/current or SO_PEERSEC
>                                          do other stuff with SELinux label
>                                          return
>     (resumes)
>     read attr/current or SO_PEERSEC
>     expect an AppArmor label
>     get a SELinux label
>     sadness ensues
>
>   Of course it's probably crazy for an async signal handler to do
>   this... but people do lots of odd things in async signal handlers,
>   and open(), read(), write(), getsockopt() are all async-signal-safe
>   functions, so it's at least arguably valid.

Stephen Smalley has already pointed out some of these issues.
I see display being used in scripts:

	echo apparmor > /proc/self/attr/display
	apparmor-do-stuff --options --daemon

much more than inside new or updated programs.

>> Writing to display does not require privilege, as it affects only
>> the current process. The display is inherited on fork and reset on
>> a privileged exec.
> Another concern here: are you sure it shouldn't be reset on *any*
> exec?

Yes, because so much of the user-space ecosystem depends on programs
that rarely get updated there has to be a way to specify it externally.
I don't like the situation, but we can't ignore it.

> Lots of programs (including dbus-daemon) fork-and-exec arbitrary
> child processes that come from a different codebase not under our
> control and aren't necessarily LSM-stacking-aware. I don't really want
> to have to reset /proc/self/attr/display in our increasingly crowded
> after-fork-but-before-exec code path (which, according to POSIX, is not
> a safe place to invoke any non-async-signal-safe function, so we can't
> easily do error handling if something goes wrong there).

My hope is that new and updated programs will have the tools
they need to get it right, and that those that don't won't
fall over on a well configured system.

> Is there any possibility of having a parallel kernel API that,
> if it exists, always returns the whole stack, maybe something
> like /proc/<pid>/attr/current_stack and the SO_PEERSECLABELS that I
> suggested previously,

/proc/<pid>/attr/current_stack is easy. SO_PEERSECLABELS will be
harder to sell, but would not be hard to implement if we can get
agreement on the Hideous format.
 

> instead of repurposing /proc/<pid>/attr/current
> and SO_PEERSEC to have contents that vary according to ambient process
> state in their reader?

In addition, yes. Instead of? I don't think that we can have a
backward compatibility story that flies without it.

>  (Bonus points if they are documented/defined with
> a particular syntactic normalization this time, unlike the situation
> with /proc/<pid>/attr/current and SO_PEERSEC where in principle you
> need LSM-specific knowledge to know whether a trailing "\n" or "\0"
> is safe to discard.)

I think that's necessary.

>
>     smcv


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-17 23:02                             ` Casey Schaufler
  2019-07-18 13:10                               ` Simon McVittie
@ 2019-07-19 21:21                               ` Paul Moore
  2019-07-22 20:50                                 ` James Morris
  1 sibling, 1 reply; 39+ messages in thread
From: Paul Moore @ 2019-07-19 21:21 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Steve Grubb, Richard Guy Briggs, linux-audit, Linux Security Module list

On Wed, Jul 17, 2019 at 7:02 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 7/17/2019 9:23 AM, Paul Moore wrote:
> > On Wed, Jul 17, 2019 at 11:49 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >> On 7/17/2019 5:14 AM, Paul Moore wrote:
> >>> On Tue, Jul 16, 2019 at 7:47 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >>>> On 7/16/2019 4:13 PM, Paul Moore wrote:
> >>>>> On Tue, Jul 16, 2019 at 6:18 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >>>>>> It sounds as if some variant of the Hideous format:
> >>>>>>
> >>>>>>         subj=selinux='a:b:c:d',apparmor='z'
> >>>>>>         subj=selinux/a:b:c:d/apparmor/z
> >>>>>>         subj=(selinux)a:b:c:d/(apparmor)z
> >>>>>>
> >>>>>> would meet Steve's searchability requirements, but with significant
> >>>>>> parsing performance penalties.
> >>>>> I think "hideous format" sums it up nicely.  Whatever we choose here
> >>>>> we are likely going to be stuck with for some time and I'm near to
> >>>>> 100% sure that multiplexing the labels onto a single field is going to be a
> >>>>> disaster.
> >>>> If the requirement is that subj= be searchable I don't see much of
> >>>> an alternative to a Hideous format. If we can get past that, and say
> >>>> that all subj_* have to be searchable we can avoid that set of issues.
> >>>> Instead of:
> >>>>
> >>>>         s = strstr(source, "subj=")
> >>>>         search_after_subj(s, ...);
> >>> This example does a lot of hand waving in search_after_subj(...)
> >>> regarding parsing the multiplexed LSM label.  Unless we restrict the
> >>> LSM label formats (which seems both wrong, and too late IMHO)
> >> I don't think it's too late, and I think it would be healthy
> >> to restrict LSM "contexts" to character sets that make command
> >> line specification possible. Embedded newlines? Ewwww.
> > That would imply that the delimiter you would choose for the
> > multiplexed approach would be something odd (I think you suggested
> > 0x02, or similar, earlier) which would likely require the multiplexed
> > subj field to become a hex encoded field which would be very
> > unfortunate in my opinion and would technically break with the current
> > subj/obj field format spec.  Picking a normal-ish delimiter, and
> > restricting its use by LSMs seems wrong to me.
>
> Just say "no" to hex encoding!

Yes, it's best avoided.

> BTW, keys are not hex encoded.

The kernel keyring keys?  Not really relevant here I don't think.

> We've never had to think about having general rules on
> what security modules do before, because with only one
> active each could do whatever it wanted without fear of
> conflict. If there is already a character that none of
> the existing modules use, how would it be wrong to
> reserve it?

"We've never had to think about having general rules on what security
modules do before..."

We famously haven't imposed restrictions on the label format before
now, and this seems like a pretty poor reason to start.

> > It's important to remember that Steve's strstr() comment only reflects
> > his set of userspace tools.  When you start talking about log
> > aggregation and analytics, it seems very likely that there are other
> > tools in use, likely with their own parsers that do much more
> > complicated searches than a simple strstr() call.
>
> Point. But long term, they'll have to be updated to accommodate
> whatever we decide on. Which makes the "simple" case, where one
> security module is in use all the more important.

Both the multiplexed and subj_X proposals handle the single major LSM
case the same: identical to what we have now.  Regardless of how
important the single major LSM case may be, it isn't a distinguishing
factor in this discussion.

> >>>> we have
> >>>>
> >>>>         s = source
> >>>>         for (i = 0; i < lsm_slots ; i++) {
> >>>>                 s = strstr(s, "subj_")
> >>>>                 if (!s)
> >>>>                         break;
> >>>>                 s = search_after_subj_(s, lsm_slot_name[i], ...)
> >>> The hand waving here in search_after_subj_(...) is much less;
> >>> essentially you just match "subj_X" and then you can take the field
> >>> value as the LSM's label without having to know the format, the policy
> >>> loaded, etc.  It is both safer and doesn't require knowledge of the
> >>> LSMs (the LSM "name" can be specified as a parameter to the search
> >>> tool).
> >> You can do that with the Hideous format as well. I wouldn't
> >> say which would be easier without delving into the audit user
> >> space.
>
> > No, you can't.  You still need to parse the multiplexed mess, that's
> > the problem.
>
> You move the parsing problem to the record, where you have to
> look for subj_selinux= instead of having the parsing problem in
> the subj= field, where you look for something like selinux=
> within the field. Neither looks like the work of an afternoon to
> get right.

Finding subj_X in an audit record is no different than finding any
other field in a record.  Parsing the multiplexed label mess is a
whole different problem and prone to lots of mistakes.

> It probably looks like I'm arguing for the Hideous format option.
> That would require less work and code disruption, so it is tempting
> to push for it. But I would have to know the user space side a
> whole lot better than I do to feel good about pushing anything that
> isn't obviously a good choice. I kind of prefer Paul's "subj=?"
> approach, but as it's harder, I don't want to spend too much time
> on it if it gets me a big, juicy, well deserved NAK.

I didn't want to have to NAK this, but if that is what it is going to
take, so be it ... as it currently stands I'm NAK'ing the
multiplexed approach.  You don't have to go with the subj_X approach,
but the multiplexed approach is a terrible idea and I can almost
guarantee that we would be regretting that choice in a few years time.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Dbus and multiple LSMs (was Preferred subj= with multiple LSMs)
  2019-07-19 20:02                                         ` Dbus and multiple LSMs (was Preferred subj= with multiple LSMs) Casey Schaufler
@ 2019-07-22 11:36                                           ` Simon McVittie
  2019-07-22 16:04                                             ` Casey Schaufler
  0 siblings, 1 reply; 39+ messages in thread
From: Simon McVittie @ 2019-07-22 11:36 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Paul Moore, Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list, SELinux

On Fri, 19 Jul 2019 at 13:02:24 -0700, Casey Schaufler wrote:
> On 7/19/2019 11:47 AM, Simon McVittie wrote:
> > I was hoping the syscall wrappers in glibc would be a viable user-space
> > interface to the small amount of LSM stuff that dbus needs to use in an
> > LSM-agnostic way.
> 
> I don't see how to do that without making the Fedora and Ubuntu user space
> environments [not] remain functional.

What I was thinking of was a second, parallel kernel <-> user-space
interface (like the SO_PEERSECLABELS that I suggested) for future/updated
user-space components. SO_PEERSEC would continue to return some
hopefully-backwards-compatible thing, but would be deprecated, because it
cannot fully represent the reality of LSM stacking while remaining
backwards-compatible.

> I see display being used in scripts:
> 
> 	echo apparmor > /proc/self/attr/display
> 	apparmor-do-stuff --options --daemon
> 
> much more than inside new or updated programs.

Note that this implicitly relies on echo being a shell builtin, which
is common but not guaranteed (I don't think). It would work in bash or
dash, though.
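
For programs that cannot rely on a shell builtin, the same effect can be
sketched in C: the write has to come from the process that will later
exec the target program, since the "display" value is per-process state
that child processes inherit. This is only a sketch under assumptions:
/proc/self/attr/display is the interface proposed in this thread, not an
established one, and set_display() is a hypothetical helper name.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write the LSM name `lsm` to the attribute file at `path` from the
 * calling process. Returns 0 if the whole name was written, -1 otherwise.
 * The path is a parameter because the attribute file is only a proposal.
 */
int set_display(const char *path, const char *lsm)
{
    size_t len = strlen(lsm);
    ssize_t n;
    int fd = open(path, O_WRONLY);

    if (fd < 0)
        return -1;
    n = write(fd, lsm, len);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}
```

A caller would invoke set_display("/proc/self/attr/display", "apparmor")
and then execvp() the target program from that same process, which is
what the two-line script does when echo is a builtin.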

If apparmor-do-stuff no longer works, and you have to wrap a shell
script around it, isn't that the same amount of user-space breakage as
if apparmor-do-stuff no longer works and you have to install a newer
version that does work? Either way, the sysadmin must take action to
change user-space components. I think the attr/display thing only reduces
the magnitude of the user-space changes required to catch up, and doesn't
eliminate the fact that those changes were needed.

> > Lots of programs (including dbus-daemon) fork-and-exec arbitrary
> > child processes that come from a different codebase not under our
> > control and aren't necessarily LSM-stacking-aware. I don't really want
> > to have to reset /proc/self/attr/display in our increasingly crowded
> > after-fork-but-before-exec code path
> 
> My hope is that new and updated programs will have the tools
> they need to get it right, and that those that don't won't
> fall over on a well configured system.

The problem I see here is that if we assume dbus-daemon is a new/updated
program that has set /proc/self/attr/display = "hideous" in order to get
the full stack of labels for its peer processes, then it will be causing
side-effects on its separately-maintained child processes - they will
no longer be able to benefit from the backwards-compatibility thing where
/proc/self/attr/display (effectively) defaults to the first LSM that
has labels, because dbus-daemon overrode that (unless dbus-daemon takes
action to reverse it between fork and exec). This partially defeats the
semi-backwards-compatible handling of the existing kernel interfaces.

If dbus-daemon could read SO_PEERSECLABELS instead of SO_PEERSEC and
read /proc/<pid>/attr/current_stack instead of /proc/<pid>/attr/current,
leaving /proc/self/attr/display untouched, then this concern would go away.

Similarly, dbus-daemon can be linked to libselinux and/or libapparmor
(on Debian it's linked to both, even in the non-stackable present,
and the right one for the kernel configuration is chosen at runtime).
If one of those libraries wrote to /proc/self/attr/display, then the rest
of dbus-daemon's main thread and all child processes would implicitly be
getting the result of that - even if dbus-daemon itself had not yet been
updated for stacked LSMs (in which case it cannot be expected to reverse
their action between fork and exec, because it's an older codebase that
doesn't yet know that "big" LSMs can be stacked).

So I think libselinux and libapparmor should be enhanced to use
new kernel interfaces that get the label they want to get (either
just that label, or all the labels), instead of being enhanced to
write /proc/self/attr/display to change the meaning of old kernel
interfaces. Otherwise they can break other code in their process or
their subprocesses.

> > instead of repurposing /proc/<pid>/attr/current
> > and SO_PEERSEC to have contents that vary according to ambient process
> > state in their reader?
> 
> In addition, yes. Instead of? I don't think that we can have a
> backward compatibility story that flies without it.

Consider only SELinux and AppArmor for a moment (I know there are other
"big" LSMs like Smack, but this same reasoning applies to any pair, with
appropriate search-and-replace on their names).

Neither SELinux nor AppArmor: there are no labels, nothing changed.

AppArmor is the only "big" LSM in the stack (Ubuntu): previously,
the label was the AppArmor label; now, if attr/display is not altered,
the label is the one used by the first "big" LSM in the stack, which is
AppArmor. Nothing changed.

SELinux is the only "big" LSM in the stack (Red Hat): same as for AppArmor
being the only "big" LSM in the stack, but with s/AppArmor/SELinux/.

SELinux and AppArmor stacked: this is a situation that could not exist
before, so distro/sysadmin action must have been necessary to make it
happen. However much ambient process state is invented, I don't see any
way to make both SELinux and AppArmor user-space work without modifications:
at least one of them (the one that is second in the stack) has to use new
kernel interfaces, or alter attr/display to change the meaning of the old
kernel interfaces, or something similar, to get the second LSM's labels.
So distro/sysadmin action in user-space is also going to be necessary here
whatever happens - backward compatibility has already been broken, it's
only a question of how intrusive the user-space changes are. Is it really
so much worse if the distro/sysadmin action taken to update user-space
has to take the form of using new kernel interfaces that always do the
same thing, rather than changing the meaning of old kernel interfaces?

    smcv

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Dbus and multiple LSMs (was Preferred subj= with multiple LSMs)
  2019-07-22 11:36                                           ` Simon McVittie
@ 2019-07-22 16:04                                             ` Casey Schaufler
  0 siblings, 0 replies; 39+ messages in thread
From: Casey Schaufler @ 2019-07-22 16:04 UTC (permalink / raw)
  To: Simon McVittie
  Cc: Paul Moore, Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list, SELinux, casey

On 7/22/2019 4:36 AM, Simon McVittie wrote:
> On Fri, 19 Jul 2019 at 13:02:24 -0700, Casey Schaufler wrote:
>> On 7/19/2019 11:47 AM, Simon McVittie wrote:
>>> I was hoping the syscall wrappers in glibc would be a viable user-space
>>> interface to the small amount of LSM stuff that dbus needs to use in an
>>> LSM-agnostic way.
>> I don't see how to do that without making the Fedora and Ubuntu user space
>> environments [not] remain functional.
> What I was thinking of was a second, parallel kernel <-> user-space
> interface (like the SO_PEERSECLABELS that I suggested) for future/updated
> user-space components. SO_PEERSEC would continue to return some
> hopefully-backwards-compatible thing, but would be deprecated, because it
> cannot fully represent the reality of LSM stacking while remaining
> backwards-compatible.

I will propose SO_PEERCONTEXT and /proc/.../attr/stack/context,
both of which will use the Hideous format, in the next round. I
appreciate the suggestion and discussion.

>> I see display being used in scripts:
>>
>> 	echo apparmor > /proc/self/attr/display
>> 	apparmor-do-stuff --options --daemon
>>
>> much more than inside new or updated programs.
> Note that this implicitly relies on echo being a shell builtin, which
> is common but not guaranteed (I don't think). It would work in bash or
> dash, though.

Yes, echo being built-in can't be guaranteed. Most shells have some
way of doing the equivalent.

> If apparmor-do-stuff no longer works, and you have to wrap a shell
> script around it, isn't that the same amount of user-space breakage as
> if apparmor-do-stuff no longer works and you have to install a newer
> version that does work?

True when there is such a newer version. I'm sure you're aware
of how much system software out there hasn't been updated in this
century.

> Either way, the sysadmin must take action to
> change user-space components. I think the attr/display thing only reduces
> the magnitude of the user-space changes required to catch up, and doesn't
> eliminate the fact that those changes were needed.

Agreed. It's a tool for the times of transition.

>>> Lots of programs (including dbus-daemon) fork-and-exec arbitrary
>>> child processes that come from a different codebase not under our
>>> control and aren't necessarily LSM-stacking-aware. I don't really want
>>> to have to reset /proc/self/attr/display in our increasingly crowded
>>> after-fork-but-before-exec code path
>> My hope is that new and updated programs will have the tools
>> they need to get it right, and that those that don't won't
>> fall over on a well configured system.
> The problem I see here is that if we assume dbus-daemon is a new/updated
> program that has set /proc/self/attr/display = "hideous" in order to get
> the full stack of labels for its peer processes, then it will be causing
> side-effects on its separately-maintained child processes - they will
> no longer be able to benefit from the backwards-compatibility thing where
> /proc/self/attr/display (effectively) defaults to the first LSM that
> has labels, because dbus-daemon overrode that (unless dbus-daemon takes
> action to reverse it between fork and exec). This partially defeats the
> semi-backwards-compatible handling of the existing kernel interfaces.

Point. /proc/self/attr/stack/context and SO_PEERCONTEXT comprise a better,
more reliable solution.

> If dbus-daemon could read SO_PEERSECLABELS instead of SO_PEERSEC and
> read /proc/<pid>/attr/current_stack instead of /proc/<pid>/attr/current,
> leaving /proc/self/attr/display untouched, then this concern would go away.

I agree.

> Similarly, dbus-daemon can be linked to libselinux and/or libapparmor
> (on Debian it's linked to both, even in the non-stackable present,
> and the right one for the kernel configuration is chosen at runtime).
> If one of those libraries wrote to /proc/self/attr/display, then the rest
> of dbus-daemon's main thread and all child processes would implicitly be
> getting the result of that - even if dbus-daemon itself had not yet been
> updated for stacked LSMs (in which case it cannot be expected to reverse
> their action between fork and exec, because it's an older codebase that
> doesn't yet know that "big" LSMs can be stacked).

Yes.

> So I think libselinux and libapparmor should be enhanced to use
> new kernel interfaces that get the label they want to get (either
> just that label, or all the labels), instead of being enhanced to
> write /proc/self/attr/display to change the meaning of old kernel
> interfaces. Otherwise they can break other code in their process or
> their subprocesses.

The AppArmor team is already moving away from using the /proc/self/attr
interfaces. /proc/self/attr/smack is already there, and the transition
begun. The SELinux developers seem firmly set in the position that there
is no reason they should ever change. In the long term I think we'll get
the conflict sorted out. It's hard to say what value of "long term"
we're looking at. 

>>> instead of repurposing /proc/<pid>/attr/current
>>> and SO_PEERSEC to have contents that vary according to ambient process
>>> state in their reader?
>> In addition, yes. Instead of? I don't think that we can have a
>> backward compatibility story that flies without it.
> Consider only SELinux and AppArmor for a moment (I know there are other
> "big" LSMs like Smack, but this same reasoning applies to any pair, with
> appropriate search-and-replace on their names).
>
> Neither SELinux nor AppArmor: there are no labels, nothing changed.
>
> AppArmor is the only "big" LSM in the stack (Ubuntu): previously,
> the label was the AppArmor label; now, if attr/display is not altered,
> the label is the one used by the first "big" LSM in the stack, which is
> AppArmor. Nothing changed.
>
> SELinux is the only "big" LSM in the stack (Red Hat): same as for AppArmor
> being the only "big" LSM in the stack, but with s/AppArmor/SELinux/.
>
> SELinux and AppArmor stacked: this is a situation that could not exist
> before, so distro/sysadmin action must have been necessary to make it
> happen. However much ambient process state is invented, I don't see any
> way to make both SELinux and AppArmor user-space work without modifications:
> at least one of them (the one that is second in the stack) has to use new
> kernel interfaces, or alter attr/display to change the meaning of the old
> kernel interfaces, or something similar, to get the second LSM's labels.
> So distro/sysadmin action in user-space is also going to be necessary here
> whatever happens - backward compatibility has already been broken, it's
> only a question of how intrusive the user-space changes are. Is it really
> so much worse if the distro/sysadmin action taken to update user-space
> has to take the form of using new kernel interfaces that always do the
> same thing, rather than changing the meaning of old kernel interfaces?

In addition to the big name distros/systems like Red Hat, Ubuntu and
Android there are a bunch of smaller players who don't have the
expertise and/or staffing and/or upstream clout to update system
services. Some of these are targets for stacked LSMs. They will be
delighted to get updated programs, but will muddle through with the
compatibility mechanisms if they have to.

>     smcv

Thank you again for your insights on this topic. My next round
should provide what you've suggested.
 



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-19 21:21                               ` Preferred subj= with multiple LSMs Paul Moore
@ 2019-07-22 20:50                                 ` James Morris
  2019-07-22 22:01                                   ` Casey Schaufler
  0 siblings, 1 reply; 39+ messages in thread
From: James Morris @ 2019-07-22 20:50 UTC (permalink / raw)
  To: Paul Moore
  Cc: Casey Schaufler, Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list

On Fri, 19 Jul 2019, Paul Moore wrote:

> > We've never had to think about having general rules on
> > what security modules do before, because with only one
> > active each could do whatever it wanted without fear of
> > conflict. If there is already a character that none of
> > the existing modules use, how would it be wrong to
> > reserve it?
> 
> "We've never had to think about having general rules on what security
> modules do before..."
> 
> We famously haven't imposed restrictions on the label format before
> now, and this seems like a pretty poor reason to start.

Agreed.


-- 
James Morris
<jmorris@namei.org>


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-22 20:50                                 ` James Morris
@ 2019-07-22 22:01                                   ` Casey Schaufler
  2019-07-22 22:30                                     ` Paul Moore
  0 siblings, 1 reply; 39+ messages in thread
From: Casey Schaufler @ 2019-07-22 22:01 UTC (permalink / raw)
  To: James Morris, Paul Moore
  Cc: Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list, casey

On 7/22/2019 1:50 PM, James Morris wrote:
> On Fri, 19 Jul 2019, Paul Moore wrote:
>
>>> We've never had to think about having general rules on
>>> what security modules do before, because with only one
>>> active each could do whatever it wanted without fear of
>>> conflict. If there is already a character that none of
>>> the existing modules use, how would it be wrong to
>>> reserve it?
>> "We've never had to think about having general rules on what security
>> modules do before..."
>>
>> We famously haven't imposed restrictions on the label format before
>> now, and this seems like a pretty poor reason to start.
> Agreed.

In a follow on thread

https://www.spinics.net/lists/linux-security-module/msg29996.html

we've been discussing the needs of dbus-daemon in a multiple LSM
environment. I suggest that if supporting dbus well is assisted by
making reasonable restrictions on what constitutes a valid LSM
"context" that we have a good reason. While there are ways to
present groups of arbitrary hunks of data, why would we want to?



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-22 22:01                                   ` Casey Schaufler
@ 2019-07-22 22:30                                     ` Paul Moore
  2019-07-23  0:11                                       ` Casey Schaufler
  2019-07-23 14:06                                       ` Simon McVittie
  0 siblings, 2 replies; 39+ messages in thread
From: Paul Moore @ 2019-07-22 22:30 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: James Morris, Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list

On Mon, Jul 22, 2019 at 6:01 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 7/22/2019 1:50 PM, James Morris wrote:
> > On Fri, 19 Jul 2019, Paul Moore wrote:
> >
> >>> We've never had to think about having general rules on
> >>> what security modules do before, because with only one
> >>> active each could do whatever it wanted without fear of
> >>> conflict. If there is already a character that none of
> >>> the existing modules use, how would it be wrong to
> >>> reserve it?
> >> "We've never had to think about having general rules on what security
> >> modules do before..."
> >>
> >> We famously haven't imposed restrictions on the label format before
> >> now, and this seems like a pretty poor reason to start.
> > Agreed.
>
> In a follow on thread
>
> https://www.spinics.net/lists/linux-security-module/msg29996.html
>
> we've been discussing the needs of dbus-daemon in a multiple LSM
> environment. I suggest that if supporting dbus well is assisted by
> making reasonable restrictions on what constitutes a valid LSM
> "context" that we have a good reason. While there are ways to
> present groups of arbitrary hunks of data, why would we want to?

I continue to believe that restrictions on the label format are a bad
idea, and I further believe that multiplexing the labels is going to
be a major problem that will haunt us for many, many years.  If we are
going to support multiple simultaneous LSMs I think we need to find a
way to represent those labels independently.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-22 22:30                                     ` Paul Moore
@ 2019-07-23  0:11                                       ` Casey Schaufler
  2019-07-23 14:06                                       ` Simon McVittie
  1 sibling, 0 replies; 39+ messages in thread
From: Casey Schaufler @ 2019-07-23  0:11 UTC (permalink / raw)
  To: Paul Moore
  Cc: James Morris, Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list, Simon McVittie, casey

On 7/22/2019 3:30 PM, Paul Moore wrote:
> On Mon, Jul 22, 2019 at 6:01 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>> On 7/22/2019 1:50 PM, James Morris wrote:
>>> On Fri, 19 Jul 2019, Paul Moore wrote:
>>>
>>>>> We've never had to think about having general rules on
>>>>> what security modules do before, because with only one
>>>>> active each could do whatever it wanted without fear of
>>>>> conflict. If there is already a character that none of
>>>>> the existing modules use, how would it be wrong to
>>>>> reserve it?
>>>> "We've never had to think about having general rules on what security
>>>> modules do before..."
>>>>
>>>> We famously haven't imposed restrictions on the label format before
>>>> now, and this seems like a pretty poor reason to start.
>>> Agreed.
>> In a follow on thread
>>
>> https://www.spinics.net/lists/linux-security-module/msg29996.html
>>
>> we've been discussing the needs of dbus-daemon in a multiple LSM
>> environment. I suggest that if supporting dbus well is assisted by
>> making reasonable restrictions on what constitutes a valid LSM
>> "context" that we have a good reason. While there are ways to
>> present groups of arbitrary hunks of data, why would we want to?
> I continue to believe that restrictions on the label format are a bad
> idea, and I further believe that multiplexing the labels is going to
> be a major problem that will haunt us for many, many years.  If we are
> going to support multiple simultaneous LSMs I think we need to find a
> way to represent those labels independently.

Let's review the bidding:

Audit wants to maintain backward compatibility while also getting
the information about multiple subject and object labels. The current
proposal:

	... subj=a:b:c:d \
	... obj=e:f:g:h obj_selinux=e:f:g:h obj_mumble=Crivens \
	... subj_selinux=a:b:c:d subj_mumble=Feegle \
	...

where obj_<lsm> and subj_<lsm> are only provided if there's more than
one active "display" LSM.
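
With this layout, picking out one module's label is ordinary field
lookup. A minimal sketch, assuming space-separated key=value fields whose
values are unquoted and contain no spaces (real audit records may quote
or hex-encode values, which this does not handle); audit_field() is a
hypothetical helper, not part of any audit library:

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the value of field `key` (for example "subj_selinux") from a
 * space-separated key=value record into `out`.
 * Returns the value length, or -1 if the field is absent or does not fit.
 */
int audit_field(const char *rec, const char *key, char *out, size_t outlen)
{
    size_t klen = strlen(key);
    const char *p = rec;

    if (klen == 0)
        return -1;
    while ((p = strstr(p, key)) != NULL) {
        /* Accept only a match at a field boundary followed by '='. */
        if ((p == rec || p[-1] == ' ') && p[klen] == '=') {
            const char *val = p + klen + 1;
            size_t vlen = strcspn(val, " ");

            if (vlen >= outlen)
                return -1;
            memcpy(out, val, vlen);
            out[vlen] = '\0';
            return (int)vlen;
        }
        p += klen;
    }
    return -1;
}
```

The boundary check is what keeps a search for "subj" from matching
inside "subj_selinux", and the caller never needs to know how any
module formats its label.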

Dbus wants an atomic fetch of the security attributes from a socket
and from a /proc entry. We don't want to break compatibility, so new
interfaces are provided:

	SO_PEERCONTEXT		- packet label in the "Hideous" format
	/proc/.../attr/context	- process label in the "Hideous" format

Legacy programs want the security attributes from a socket
and from a /proc entry. Since they don't know how to differentiate
which security module is active, these are controlled by the
"display", which defaults to the first module loaded that provides
a secid_to_secctx() hook. (not quite the definition, but close enough)

	SO_PEERSEC		- "display" LSM packet label in module native format
	/proc/.../attr/display	- set/get the "display" value
	/proc/.../attr/current	- "display" LSM process label in module native format
	/proc/.../attr/smack/current - Smack process label in module native format

A classic Android, Tizen, Fedora or Ubuntu system will continue to use these
interfaces and see no difference in behavior.

A system that really wants to use multiple "display"ing  modules will
have the same issues that dbus has, and will likely convert to the new,
"hideous" interfaces, especially if a liblsm (NOT libsecurity!) is
provided.



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-22 22:30                                     ` Paul Moore
  2019-07-23  0:11                                       ` Casey Schaufler
@ 2019-07-23 14:06                                       ` Simon McVittie
  2019-07-23 17:32                                         ` Casey Schaufler
  2019-07-23 21:46                                         ` James Morris
  1 sibling, 2 replies; 39+ messages in thread
From: Simon McVittie @ 2019-07-23 14:06 UTC (permalink / raw)
  To: Paul Moore
  Cc: Casey Schaufler, James Morris, Steve Grubb, Richard Guy Briggs,
	linux-audit, Linux Security Module list

On Mon, 22 Jul 2019 at 18:30:35 -0400, Paul Moore wrote:
> On Mon, Jul 22, 2019 at 6:01 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> > I suggest that if supporting dbus well is assisted by
> > making reasonable restrictions on what constitutes a valid LSM
> > "context" that we have a good reason.
> 
> I continue to believe that restrictions on the label format are a bad
> idea

Does this include the restriction "the label does not include \0",
which is an assumption that dbus is already relying on since I checked
it in the thread around
<https://marc.info/?l=linux-security-module&m=142323508321029&w=2>?
Or is that restriction so fundamental that it's considered OK?

(Other user-space tools like ls -Z and ps -Z also rely on that assumption
by printing security contexts with %s, as far as I know.)

dbus does not require a way to multiplex multiple LSMs' labels in a
printable text string, so from my point of view, multiplexed labels do
not necessarily have to be in what Casey calls the "Hideous" format,
or in any text format at all: anything with documented rules for parsing
(including the unescaping that readers are expected to apply, if there
is any) would be fine. Based on the assumption of no "\0", I previously
suggested a "\0"-delimited encoding similar to /proc/self/cmdline, which
would not need any escaping/unescaping:

    "apparmor\0" <apparmor label> "\0"
    "selinux\0" <SELinux label> "\0"
    ...
    "\0" (or this could be omitted since it's redundant with the length)

which would be fine (indeed, actually easier than the "Hideous" format)
from dbus' point of view.
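
A reader for that encoding could be sketched as below. It assumes the
blob is exactly the sequence of "name\0label\0" pairs shown above, with
an optional empty name as the final terminator; lsm_label_lookup() is a
hypothetical helper, and no kernel interface returning such a blob exists
at the time of this thread.

```c
#include <stddef.h>
#include <string.h>

/*
 * Find the label for LSM `name` in a blob of "name\0label\0" pairs.
 * Returns a pointer to the NUL-terminated label inside `blob`, or NULL
 * if the name is not present or the blob is truncated.
 */
const char *lsm_label_lookup(const char *blob, size_t len, const char *name)
{
    const char *p = blob;
    const char *end = blob + len;

    /* An empty name marks the optional final terminator. */
    while (p < end && *p != '\0') {
        const char *nterm = memchr(p, '\0', (size_t)(end - p));
        const char *label, *lterm;

        if (nterm == NULL)
            return NULL;        /* name is not terminated */
        label = nterm + 1;
        lterm = memchr(label, '\0', (size_t)(end - label));
        if (lterm == NULL)
            return NULL;        /* label is not terminated */
        if (strcmp(p, name) == 0)
            return label;
        p = lterm + 1;
    }
    return NULL;
}
```

No unescaping step is needed because neither names nor labels can
contain the delimiter, which is the property the "no \0" assumption buys.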

dbus does not strictly need reading security labels for sockets or
processes to be atomic, either: it would be OK if we can get the complete
list of LSM labels *somehow*, possibly in O(number of LSMs) rather than
O(1) syscalls (although I'd prefer O(1)).

However, the getsockopt() interface only lets the kernel return one thing
per socket option, and I assume the networking maintainers probably don't
want to have to add SO_PEERAPPARMOR, SO_PEERSELINUX... for each LSM -
so this part at least would probably be easier as a single blob in some
text or binary format.

    smcv

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Preferred subj= with multiple LSMs
  2019-07-23 14:06                                       ` Simon McVittie
@ 2019-07-23 17:32                                         ` Casey Schaufler
  2019-07-23 21:46                                         ` James Morris
  1 sibling, 0 replies; 39+ messages in thread
From: Casey Schaufler @ 2019-07-23 17:32 UTC (permalink / raw)
  To: Simon McVittie, Paul Moore
  Cc: James Morris, Steve Grubb, Richard Guy Briggs, linux-audit,
	Linux Security Module list, casey

On 7/23/2019 7:06 AM, Simon McVittie wrote:
> On Mon, 22 Jul 2019 at 18:30:35 -0400, Paul Moore wrote:
>> On Mon, Jul 22, 2019 at 6:01 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>> I suggest that if supporting dbus well is assisted by
>>> making reasonable restrictions on what constitutes a valid LSM
>>> "context" that we have a good reason.
>> I continue to believe that restrictions on the label format are a bad
>> idea
> Does this include the restriction "the label does not include \0",
> which is an assumption that dbus is already relying on since I checked
> it in the thread around
> <https://marc.info/?l=linux-security-module&m=142323508321029&w=2>?
> Or is that restriction so fundamental that it's considered OK?
>
> (Other user-space tools like ls -Z and ps -Z also rely on that assumption
> by printing security contexts with %s, as far as I know.)

The "-Z" options for ls and ps are unfortunately hard coded for SELinux.
For applications to be general in the presence of LSMs you are correct
that there need to be some assumptions.

> dbus does not require a way to multiplex multiple LSMs' labels in a
> printable text string, so from my point of view, multiplexed labels do
> not necessarily have to be in what Casey calls the "Hideous" format,
> or in any text format at all: anything with documented rules for parsing
> (including the unescaping that readers are expected to apply, if there
> is any) would be fine. Based on the assumption of no "\0", I previously
> suggested a "\0"-delimited encoding similar to /proc/self/cmdline, which
> would not need any escaping/unescaping:
>
>     "apparmor\0" <apparmor label> "\0"
>     "selinux\0" <SELinux label> "\0"
>     ...
>     "\0" (or this could be omitted since it's redundant with the length)
>
> which would be fine (indeed, actually easier than the "Hideous" format)
> from dbus' point of view.

I am an advocate of a single string due to the preponderance of
scripting language programming in today's world. It would be easy to provide
a library function to transform the "Hideous" format into the "cmdline"
format or, I'll admit, the other way round. I'm not so set in my opinion
that if it came down to "cmdline" or nothing I wouldn't cave in.
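
For illustration only, here is a minimal sketch of the "\0"-delimited
"cmdline" style encoding quoted above, assuming that neither LSM names
nor labels ever contain "\0". The function names encode_labels and
decode_labels are hypothetical and do not belong to any existing kernel
or library interface; the label values are the placeholders used
earlier in this thread.

```python
def encode_labels(labels):
    """Pack {lsm_name: label} into consecutive name\\0 label\\0 pairs.

    Hypothetical helper: mirrors the /proc/self/cmdline-like layout
    suggested in the thread, not an actual kernel format.
    """
    out = bytearray()
    for name, label in labels.items():
        for field in (name, label):
            raw = field.encode("utf-8", errors="surrogateescape")
            if b"\0" in raw:
                # The whole scheme rests on "\0" never appearing in a field.
                raise ValueError("fields must not contain NUL")
            out += raw + b"\0"
    return bytes(out)


def decode_labels(blob):
    """Unpack a blob produced by encode_labels into {lsm_name: label}.

    Accepts the optional final empty terminator, which is redundant
    with the blob length.
    """
    if not blob:
        return {}
    if not blob.endswith(b"\0"):
        raise ValueError("blob must end with NUL")
    fields = blob[:-1].split(b"\0")
    # An odd field count with an empty last field means the optional
    # terminator was present; drop it.
    if len(fields) % 2 == 1 and fields[-1] == b"":
        fields.pop()
    if len(fields) % 2 != 0:
        raise ValueError("expected name/label pairs")
    text = [f.decode("utf-8", errors="surrogateescape") for f in fields]
    return dict(zip(text[0::2], text[1::2]))


if __name__ == "__main__":
    blob = encode_labels({"apparmor": "a", "selinux": "x:y:z:s:c"})
    print(decode_labels(blob))
```

No escaping or unescaping is needed in either direction, which is the
property Simon points out; the cost is that the result is not a single
printable string.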

> dbus does not strictly need reading security labels for sockets or
> processes to be atomic, either: it would be OK if we can get the complete
> list of LSM labels *somehow*, possibly in O(number of LSMs) rather than
> O(1) syscalls (although I'd prefer O(1)).

Stephen Smalley did an excellent job of outlining the dangers of
using the proposed "display" mechanism with multiple calls to
get the complete attribute set. Adding a new interface that gets
them all at once addresses that set of problems.

> However, the getsockopt() interface only lets the kernel return one thing
> per socket option, and I assume the networking maintainers probably don't
> want to have to add SO_PEERAPPARMOR, SO_PEERSELINUX... for each LSM -

Or a getsockopt() option that takes an LSM name and returns the value
for that module. You could do any of those, but you still end up with O(n)
and a need to know in advance what security modules to look for.

> so this part at least would probably be easier as a single blob in some
> text or binary format.

For the long term I agree. I still have to deal with legacy services
and applications that won't be updated in the foreseeable future, which
is why the old interfaces can't be updated. New interfaces are required.
I'm open to discussion on details, including format.

>     smcv



* Re: Preferred subj= with multiple LSMs
  2019-07-23 14:06                                       ` Simon McVittie
  2019-07-23 17:32                                         ` Casey Schaufler
@ 2019-07-23 21:46                                         ` James Morris
  1 sibling, 0 replies; 39+ messages in thread
From: James Morris @ 2019-07-23 21:46 UTC (permalink / raw)
  To: Simon McVittie
  Cc: Paul Moore, Casey Schaufler, Steve Grubb, Richard Guy Briggs,
	linux-audit, Linux Security Module list

On Tue, 23 Jul 2019, Simon McVittie wrote:

> On Mon, 22 Jul 2019 at 18:30:35 -0400, Paul Moore wrote:
> > On Mon, Jul 22, 2019 at 6:01 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> > > I suggest that if supporting dbus well is assisted by
> > > making reasonable restrictions on what constitutes a valid LSM
> > > "context" that we have a good reason.
> > 
> > I continue to believe that restrictions on the label format are a bad
> > idea
> 
> Does this include the restriction "the label does not include \0",
> which is an assumption that dbus is already relying on since I checked
> it in the thread around
> <https://marc.info/?l=linux-security-module&m=142323508321029&w=2>?
> Or is that restriction so fundamental that it's considered OK?

Security labels are strings, so this is implied.


-- 
James Morris
<jmorris@namei.org>




Thread overview: 39+ messages
2019-07-12 16:33 Preferred subj= with multiple LSMs Casey Schaufler
     [not found] ` <c46932ec-e38e-ba15-7ceb-70e0fe0ef5dc@schaufler-ca.com>
2019-07-13 15:08 ` Steve Grubb
2019-07-15 19:04   ` Richard Guy Briggs
     [not found] ` <1979804.kRvuSoDnao@x2>
     [not found]   ` <2802ddee-b621-c2eb-9ff3-ea15c4f19d0c@schaufler-ca.com>
     [not found]     ` <3577098.oGDFHdoSSQ@x2>
2019-07-16 17:16       ` Casey Schaufler
     [not found]   ` <CAHC9VhSELVZN8feH56zsANqoHu16mPMD04Ww60W=r6tWs+8WnQ@mail.gmail.com>
2019-07-16 17:29     ` Casey Schaufler
2019-07-16 17:43       ` Paul Moore
2019-07-16 17:58         ` Casey Schaufler
2019-07-16 18:06         ` Steve Grubb
2019-07-16 18:41           ` Casey Schaufler
2019-07-16 21:25             ` Paul Moore
2019-07-16 21:46               ` Steve Grubb
2019-07-16 22:18                 ` Casey Schaufler
2019-07-16 23:13                   ` Paul Moore
2019-07-16 23:47                     ` Casey Schaufler
2019-07-17 12:14                       ` Paul Moore
2019-07-17 15:49                         ` Casey Schaufler
2019-07-17 16:23                           ` Paul Moore
2019-07-17 23:02                             ` Casey Schaufler
2019-07-18 13:10                               ` Simon McVittie
2019-07-18 16:13                                 ` Casey Schaufler
2019-07-19 12:15                                   ` Simon McVittie
2019-07-19 16:29                                     ` Casey Schaufler
2019-07-19 18:47                                       ` Simon McVittie
2019-07-19 20:02                                         ` Dbus and multiple LSMs (was Preferred subj= with multiple LSMs) Casey Schaufler
2019-07-22 11:36                                           ` Simon McVittie
2019-07-22 16:04                                             ` Casey Schaufler
2019-07-19 21:21                               ` Preferred subj= with multiple LSMs Paul Moore
2019-07-22 20:50                                 ` James Morris
2019-07-22 22:01                                   ` Casey Schaufler
2019-07-22 22:30                                     ` Paul Moore
2019-07-23  0:11                                       ` Casey Schaufler
2019-07-23 14:06                                       ` Simon McVittie
2019-07-23 17:32                                         ` Casey Schaufler
2019-07-23 21:46                                         ` James Morris
2019-07-16 23:09                 ` Paul Moore
2019-07-17  4:36                   ` James Morris
2019-07-17 12:23                     ` Paul Moore
2019-07-18 15:01               ` William Roberts
2019-07-18 18:48                 ` Casey Schaufler
