stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] IMA LSM based rule race condition issue on 4.19 LTS
@ 2022-12-09  7:00 Guozihua (Scott)
  2022-12-09  7:12 ` Greg KH
  2022-12-13 15:30 ` Mimi Zohar
  0 siblings, 2 replies; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-09  7:00 UTC (permalink / raw)
  To: Mimi Zohar, dmitry.kasatkin, Paul Moore, sds, eparis, Greg KH, sashal
  Cc: selinux, linux-integrity, stable

Hi community.

Previously our team reported a race condition in IMA relates to LSM 
based rules which would case IMA to match files that should be filtered 
out under normal condition. The issue was originally analyzed and fixed 
on mainstream. The patch and the discussion could be found here: 
https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/

After that, we did a regression test on 4.19 LTS and the same issue 
arises. Further analysis reveled that the issue is from a completely 
different cause.

The cause is that selinux_audit_rule_init() would set the rule (which is 
a second level pointer) to NULL immediately after called. The relevant 
codes are as shown:

security/selinux/ss/services.c:
> int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> {
>         struct selinux_state *state = &selinux_state;
>         struct policydb *policydb = &state->ss->policydb;
>         struct selinux_audit_rule *tmprule;
>         struct role_datum *roledatum;
>         struct type_datum *typedatum;
>         struct user_datum *userdatum;
>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
>         int rc = 0;
> 
>         *rule = NULL;
*rule is set to NULL here, which means the rule on IMA side is also NULL.
> 
>         if (!state->initialized)
>                 return -EOPNOTSUPP;
...
> out:
>         read_unlock(&state->ss->policy_rwlock);
> 
>         if (rc) {
>                 selinux_audit_rule_free(tmprule);
>                 tmprule = NULL;
>         }
> 
>         *rule = tmprule;
rule is updated at the end of the function.
> 
>         return rc;
> }

security/integrity/ima/ima_policy.c:
> static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
>                             const struct cred *cred, u32 secid,
>                             enum ima_hooks func, int mask)
> {...
> for (i = 0; i < MAX_LSM_RULES; i++) {
>                 int rc = 0;
>                 u32 osid;
>                 int retried = 0;
> 
>                 if (!rule->lsm[i].rule)
>                         continue;
Setting rule to NULL would lead to LSM based rule matching being skipped.
> retry:
>                 switch (i) {

To solve this issue, there are multiple approaches we might take and I 
would like some input from the community.

The first proposed solution would be to change 
selinux_audit_rule_init(). Remove the set to NULL bit and update the 
rule pointer with cmpxchg.

> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
> index a9f2bc8443bd..aa74b04ccaf7 100644
> --- a/security/selinux/ss/services.c
> +++ b/security/selinux/ss/services.c
> @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
>         struct type_datum *typedatum;
>         struct user_datum *userdatum;
>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
> +       struct selinux_audit_rule *orig = rule;
>         int rc = 0;
>  
> -       *rule = NULL;
> -
>         if (!state->initialized)
>                 return -EOPNOTSUPP;
>  
> @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
>                 tmprule = NULL;
>         }
>  
> -       *rule = tmprule;
> +       if (cmpxchg(rule, orig, tmprule) != orig)
> +               selinux_audit_rule_free(tmprule);
>  
>         return rc;
>  }

This solution would be an easy fix, but might influence other modules 
calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, 
only auditfilter and IMA it seems). And it might be worth returning an 
error code such as -EAGAIN.

Or, we can access rules via RCU, similar to what we do on 5.10. This 
could means more code change and testing.

Reported-by: Huaxin Lu <luhuaxin1@huawei.com>
-- 
Best
GUO Zihua

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-09  7:00 [RFC] IMA LSM based rule race condition issue on 4.19 LTS Guozihua (Scott)
@ 2022-12-09  7:12 ` Greg KH
  2022-12-09  7:53   ` Guozihua (Scott)
  2022-12-13 15:30 ` Mimi Zohar
  1 sibling, 1 reply; 30+ messages in thread
From: Greg KH @ 2022-12-09  7:12 UTC (permalink / raw)
  To: Guozihua (Scott)
  Cc: Mimi Zohar, dmitry.kasatkin, Paul Moore, sds, eparis, sashal,
	selinux, linux-integrity, stable

On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
> Hi community.
> 
> Previously our team reported a race condition in IMA relates to LSM based
> rules which would case IMA to match files that should be filtered out under
> normal condition. The issue was originally analyzed and fixed on mainstream.
> The patch and the discussion could be found here:
> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
> 
> After that, we did a regression test on 4.19 LTS and the same issue arises.
> Further analysis reveled that the issue is from a completely different
> cause.

What commit in the tree fixed this in newer kernels?  Why can't we just
backport that one to 4.19.y as well?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-09  7:12 ` Greg KH
@ 2022-12-09  7:53   ` Guozihua (Scott)
  2022-12-09  8:46     ` Greg KH
  0 siblings, 1 reply; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-09  7:53 UTC (permalink / raw)
  To: Greg KH
  Cc: Mimi Zohar, dmitry.kasatkin, Paul Moore, sds, eparis, sashal,
	selinux, linux-integrity, stable

On 2022/12/9 15:12, Greg KH wrote:
> On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
>> Hi community.
>>
>> Previously our team reported a race condition in IMA relates to LSM based
>> rules which would case IMA to match files that should be filtered out under
>> normal condition. The issue was originally analyzed and fixed on mainstream.
>> The patch and the discussion could be found here:
>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
>>
>> After that, we did a regression test on 4.19 LTS and the same issue arises.
>> Further analysis reveled that the issue is from a completely different
>> cause.
> 
> What commit in the tree fixed this in newer kernels?  Why can't we just
> backport that one to 4.19.y as well?
> 
> thanks,
> 
> greg k-h

Hi Greg,

The fix for mainline is now on linux-next, commit 	d57378d3aa4d ("ima: 
Simplify ima_lsm_copy_rule") and 	c7423dbdbc9ece ("ima: Handle -ESTALE 
returned by ima_filter_rule_match()"). However, these patches cannot be 
picked directly into 4.19.y due to code difference.

The commit which introduced the issue on mainline was believed to be 
b16942455193 ("ima: use the lsm policy update notifier"), which is not 
in 4.19.y. And the mainline patch is designed to handle the situation 
when IMA rules are accessed through RCU which has not been implemented 
on 4.19.y either.

-- 
Best
GUO Zihua


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-09  7:53   ` Guozihua (Scott)
@ 2022-12-09  8:46     ` Greg KH
  2022-12-09  8:59       ` Guozihua (Scott)
  0 siblings, 1 reply; 30+ messages in thread
From: Greg KH @ 2022-12-09  8:46 UTC (permalink / raw)
  To: Guozihua (Scott)
  Cc: Mimi Zohar, dmitry.kasatkin, Paul Moore, sds, eparis, sashal,
	selinux, linux-integrity, stable

On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote:
> On 2022/12/9 15:12, Greg KH wrote:
> > On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
> > > Hi community.
> > > 
> > > Previously our team reported a race condition in IMA relates to LSM based
> > > rules which would case IMA to match files that should be filtered out under
> > > normal condition. The issue was originally analyzed and fixed on mainstream.
> > > The patch and the discussion could be found here:
> > > https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
> > > 
> > > After that, we did a regression test on 4.19 LTS and the same issue arises.
> > > Further analysis reveled that the issue is from a completely different
> > > cause.
> > 
> > What commit in the tree fixed this in newer kernels?  Why can't we just
> > backport that one to 4.19.y as well?
> > 
> > thanks,
> > 
> > greg k-h
> 
> Hi Greg,
> 
> The fix for mainline is now on linux-next, commit 	d57378d3aa4d ("ima:
> Simplify ima_lsm_copy_rule") and 	c7423dbdbc9ece ("ima: Handle -ESTALE
> returned by ima_filter_rule_match()"). However, these patches cannot be
> picked directly into 4.19.y due to code difference.

Ok, so it's much more than just 4.19 that's an issue here.  And are
those commits tagged for stable inclusion?

> The commit which introduced the issue on mainline was believed to be
> b16942455193 ("ima: use the lsm policy update notifier"), which is not in
> 4.19.y. And the mainline patch is designed to handle the situation when IMA
> rules are accessed through RCU which has not been implemented on 4.19.y
> either.

Ok, then provide a series of backports to 4.19 and we will be glad to
review them.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-09  8:46     ` Greg KH
@ 2022-12-09  8:59       ` Guozihua (Scott)
  2022-12-09  9:00         ` Greg KH
  0 siblings, 1 reply; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-09  8:59 UTC (permalink / raw)
  To: Greg KH
  Cc: Mimi Zohar, dmitry.kasatkin, Paul Moore, sds, eparis, sashal,
	selinux, linux-integrity, stable

On 2022/12/9 16:46, Greg KH wrote:
> On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote:
>> On 2022/12/9 15:12, Greg KH wrote:
>>> On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
>>>> Hi community.
>>>>
>>>> Previously our team reported a race condition in IMA relates to LSM based
>>>> rules which would case IMA to match files that should be filtered out under
>>>> normal condition. The issue was originally analyzed and fixed on mainstream.
>>>> The patch and the discussion could be found here:
>>>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
>>>>
>>>> After that, we did a regression test on 4.19 LTS and the same issue arises.
>>>> Further analysis reveled that the issue is from a completely different
>>>> cause.
>>>
>>> What commit in the tree fixed this in newer kernels?  Why can't we just
>>> backport that one to 4.19.y as well?
>>>
>>> thanks,
>>>
>>> greg k-h
>>
>> Hi Greg,
>>
>> The fix for mainline is now on linux-next, commit 	d57378d3aa4d ("ima:
>> Simplify ima_lsm_copy_rule") and 	c7423dbdbc9ece ("ima: Handle -ESTALE
>> returned by ima_filter_rule_match()"). However, these patches cannot be
>> picked directly into 4.19.y due to code difference.
> 
> Ok, so it's much more than just 4.19 that's an issue here.  And are
> those commits tagged for stable inclusion?

Not actually, not on the commit itself.
> 
>> The commit which introduced the issue on mainline was believed to be
>> b16942455193 ("ima: use the lsm policy update notifier"), which is not in
>> 4.19.y. And the mainline patch is designed to handle the situation when IMA
>> rules are accessed through RCU which has not been implemented on 4.19.y
>> either.
> 
> Ok, then provide a series of backports to 4.19 and we will be glad to
> review them.
If we are backporting these commits to 4.19 then maybe we would have to 
start with the commit that makes rule access in IMA RCU protected. I'll 
have a look into whether it's easy to do.
> 
> thanks,
> 
> greg k-h

-- 
Best
GUO Zihua


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-09  8:59       ` Guozihua (Scott)
@ 2022-12-09  9:00         ` Greg KH
  2022-12-09  9:11           ` Guozihua (Scott)
  0 siblings, 1 reply; 30+ messages in thread
From: Greg KH @ 2022-12-09  9:00 UTC (permalink / raw)
  To: Guozihua (Scott)
  Cc: Mimi Zohar, dmitry.kasatkin, Paul Moore, sds, eparis, sashal,
	selinux, linux-integrity, stable

On Fri, Dec 09, 2022 at 04:59:17PM +0800, Guozihua (Scott) wrote:
> On 2022/12/9 16:46, Greg KH wrote:
> > On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote:
> > > On 2022/12/9 15:12, Greg KH wrote:
> > > > On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
> > > > > Hi community.
> > > > > 
> > > > > Previously our team reported a race condition in IMA relates to LSM based
> > > > > rules which would case IMA to match files that should be filtered out under
> > > > > normal condition. The issue was originally analyzed and fixed on mainstream.
> > > > > The patch and the discussion could be found here:
> > > > > https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
> > > > > 
> > > > > After that, we did a regression test on 4.19 LTS and the same issue arises.
> > > > > Further analysis reveled that the issue is from a completely different
> > > > > cause.
> > > > 
> > > > What commit in the tree fixed this in newer kernels?  Why can't we just
> > > > backport that one to 4.19.y as well?
> > > > 
> > > > thanks,
> > > > 
> > > > greg k-h
> > > 
> > > Hi Greg,
> > > 
> > > The fix for mainline is now on linux-next, commit 	d57378d3aa4d ("ima:
> > > Simplify ima_lsm_copy_rule") and 	c7423dbdbc9ece ("ima: Handle -ESTALE
> > > returned by ima_filter_rule_match()"). However, these patches cannot be
> > > picked directly into 4.19.y due to code difference.
> > 
> > Ok, so it's much more than just 4.19 that's an issue here.  And are
> > those commits tagged for stable inclusion?
> 
> Not actually, not on the commit itself.

That's not good.  When they hit Linus's tree, please submit backports to
the stable mailing list so that they can be picked up.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-09  9:00         ` Greg KH
@ 2022-12-09  9:11           ` Guozihua (Scott)
  2022-12-09  9:22             ` Greg KH
  0 siblings, 1 reply; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-09  9:11 UTC (permalink / raw)
  To: Greg KH
  Cc: Mimi Zohar, dmitry.kasatkin, Paul Moore, sds, eparis, sashal,
	selinux, linux-integrity, stable

On 2022/12/9 17:00, Greg KH wrote:
> On Fri, Dec 09, 2022 at 04:59:17PM +0800, Guozihua (Scott) wrote:
>> On 2022/12/9 16:46, Greg KH wrote:
>>> On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote:
>>>> On 2022/12/9 15:12, Greg KH wrote:
>>>>> On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
>>>>>> Hi community.
>>>>>>
>>>>>> Previously our team reported a race condition in IMA relates to LSM based
>>>>>> rules which would case IMA to match files that should be filtered out under
>>>>>> normal condition. The issue was originally analyzed and fixed on mainstream.
>>>>>> The patch and the discussion could be found here:
>>>>>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
>>>>>>
>>>>>> After that, we did a regression test on 4.19 LTS and the same issue arises.
>>>>>> Further analysis reveled that the issue is from a completely different
>>>>>> cause.
>>>>>
>>>>> What commit in the tree fixed this in newer kernels?  Why can't we just
>>>>> backport that one to 4.19.y as well?
>>>>>
>>>>> thanks,
>>>>>
>>>>> greg k-h
>>>>
>>>> Hi Greg,
>>>>
>>>> The fix for mainline is now on linux-next, commit 	d57378d3aa4d ("ima:
>>>> Simplify ima_lsm_copy_rule") and 	c7423dbdbc9ece ("ima: Handle -ESTALE
>>>> returned by ima_filter_rule_match()"). However, these patches cannot be
>>>> picked directly into 4.19.y due to code difference.
>>>
>>> Ok, so it's much more than just 4.19 that's an issue here.  And are
>>> those commits tagged for stable inclusion?
>>
>> Not actually, not on the commit itself.
> 
> That's not good.  When they hit Linus's tree, please submit backports to
> the stable mailing list so that they can be picked up.
Thing is these commits cannot be simply backported to 4.19.y. Preceding
patches are missing. How do we do backporting in this situation? Do we
first backport the preceding patches? Or maybe we develop another
solution for 4.19.y?
> 
> thanks,
> 
> greg k-h

-- 
Best
GUO Zihua


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-09  9:11           ` Guozihua (Scott)
@ 2022-12-09  9:22             ` Greg KH
  2022-12-09  9:32               ` Guozihua (Scott)
  0 siblings, 1 reply; 30+ messages in thread
From: Greg KH @ 2022-12-09  9:22 UTC (permalink / raw)
  To: Guozihua (Scott)
  Cc: Mimi Zohar, dmitry.kasatkin, Paul Moore, sds, eparis, sashal,
	selinux, linux-integrity, stable

On Fri, Dec 09, 2022 at 05:11:40PM +0800, Guozihua (Scott) wrote:
> On 2022/12/9 17:00, Greg KH wrote:
> > On Fri, Dec 09, 2022 at 04:59:17PM +0800, Guozihua (Scott) wrote:
> >> On 2022/12/9 16:46, Greg KH wrote:
> >>> On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote:
> >>>> On 2022/12/9 15:12, Greg KH wrote:
> >>>>> On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
> >>>>>> Hi community.
> >>>>>>
> >>>>>> Previously our team reported a race condition in IMA relates to LSM based
> >>>>>> rules which would case IMA to match files that should be filtered out under
> >>>>>> normal condition. The issue was originally analyzed and fixed on mainstream.
> >>>>>> The patch and the discussion could be found here:
> >>>>>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
> >>>>>>
> >>>>>> After that, we did a regression test on 4.19 LTS and the same issue arises.
> >>>>>> Further analysis reveled that the issue is from a completely different
> >>>>>> cause.
> >>>>>
> >>>>> What commit in the tree fixed this in newer kernels?  Why can't we just
> >>>>> backport that one to 4.19.y as well?
> >>>>>
> >>>>> thanks,
> >>>>>
> >>>>> greg k-h
> >>>>
> >>>> Hi Greg,
> >>>>
> >>>> The fix for mainline is now on linux-next, commit 	d57378d3aa4d ("ima:
> >>>> Simplify ima_lsm_copy_rule") and 	c7423dbdbc9ece ("ima: Handle -ESTALE
> >>>> returned by ima_filter_rule_match()"). However, these patches cannot be
> >>>> picked directly into 4.19.y due to code difference.
> >>>
> >>> Ok, so it's much more than just 4.19 that's an issue here.  And are
> >>> those commits tagged for stable inclusion?
> >>
> >> Not actually, not on the commit itself.
> > 
> > That's not good.  When they hit Linus's tree, please submit backports to
> > the stable mailing list so that they can be picked up.
> Thing is these commits cannot be simply backported to 4.19.y. Preceding
> patches are missing. How do we do backporting in this situation? Do we
> first backport the preceding patches? Or maybe we develop another
> solution for 4.19.y?

First they need to go to newer kernel trees, and then worry about 4.19.
We never want anyone to upgrade to a newer kernel and have a regression.

Also, we can't do anything until they hit Linus's tree, as per the
stable kernel rules.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-09  9:22             ` Greg KH
@ 2022-12-09  9:32               ` Guozihua (Scott)
  2022-12-09  9:38                 ` Guozihua (Scott)
  0 siblings, 1 reply; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-09  9:32 UTC (permalink / raw)
  To: Greg KH
  Cc: Mimi Zohar, dmitry.kasatkin, Paul Moore, sds, eparis, sashal,
	selinux, linux-integrity, stable

On 2022/12/9 17:22, Greg KH wrote:
> On Fri, Dec 09, 2022 at 05:11:40PM +0800, Guozihua (Scott) wrote:
>> On 2022/12/9 17:00, Greg KH wrote:
>>> On Fri, Dec 09, 2022 at 04:59:17PM +0800, Guozihua (Scott) wrote:
>>>> On 2022/12/9 16:46, Greg KH wrote:
>>>>> On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote:
>>>>>> On 2022/12/9 15:12, Greg KH wrote:
>>>>>>> On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
>>>>>>>> Hi community.
>>>>>>>>
>>>>>>>> Previously our team reported a race condition in IMA relates to LSM based
>>>>>>>> rules which would case IMA to match files that should be filtered out under
>>>>>>>> normal condition. The issue was originally analyzed and fixed on mainstream.
>>>>>>>> The patch and the discussion could be found here:
>>>>>>>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
>>>>>>>>
>>>>>>>> After that, we did a regression test on 4.19 LTS and the same issue arises.
>>>>>>>> Further analysis reveled that the issue is from a completely different
>>>>>>>> cause.
>>>>>>>
>>>>>>> What commit in the tree fixed this in newer kernels?  Why can't we just
>>>>>>> backport that one to 4.19.y as well?
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> greg k-h
>>>>>>
>>>>>> Hi Greg,
>>>>>>
>>>>>> The fix for mainline is now on linux-next, commit 	d57378d3aa4d ("ima:
>>>>>> Simplify ima_lsm_copy_rule") and 	c7423dbdbc9ece ("ima: Handle -ESTALE
>>>>>> returned by ima_filter_rule_match()"). However, these patches cannot be
>>>>>> picked directly into 4.19.y due to code difference.
>>>>>
>>>>> Ok, so it's much more than just 4.19 that's an issue here.  And are
>>>>> those commits tagged for stable inclusion?
>>>>
>>>> Not actually, not on the commit itself.
>>>
>>> That's not good.  When they hit Linus's tree, please submit backports to
>>> the stable mailing list so that they can be picked up.
>> Thing is these commits cannot be simply backported to 4.19.y. Preceding
>> patches are missing. How do we do backporting in this situation? Do we
>> first backport the preceding patches? Or maybe we develop another
>> solution for 4.19.y?
> 
> First they need to go to newer kernel trees, and then worry about 4.19.
> We never want anyone to upgrade to a newer kernel and have a regression.
> 
> Also, we can't do anything until they hit Linus's tree, as per the
> stable kernel rules.
Alright. We'll wait for these patches to be in Linus' tree. But should
we stick to a backport from mainstream or we form a different solution
for LTS?

-- 
Best
GUO Zihua


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-09  9:32               ` Guozihua (Scott)
@ 2022-12-09  9:38                 ` Guozihua (Scott)
  2022-12-09 10:27                   ` Greg KH
  0 siblings, 1 reply; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-09  9:38 UTC (permalink / raw)
  To: Greg KH
  Cc: Mimi Zohar, dmitry.kasatkin, Paul Moore, sds, eparis, sashal,
	selinux, linux-integrity, stable

On 2022/12/9 17:32, Guozihua (Scott) wrote:
> On 2022/12/9 17:22, Greg KH wrote:
>> On Fri, Dec 09, 2022 at 05:11:40PM +0800, Guozihua (Scott) wrote:
>>> On 2022/12/9 17:00, Greg KH wrote:
>>>> On Fri, Dec 09, 2022 at 04:59:17PM +0800, Guozihua (Scott) wrote:
>>>>> On 2022/12/9 16:46, Greg KH wrote:
>>>>>> On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote:
>>>>>>> On 2022/12/9 15:12, Greg KH wrote:
>>>>>>>> On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
>>>>>>>>> Hi community.
>>>>>>>>>
>>>>>>>>> Previously our team reported a race condition in IMA relates to LSM based
>>>>>>>>> rules which would case IMA to match files that should be filtered out under
>>>>>>>>> normal condition. The issue was originally analyzed and fixed on mainstream.
>>>>>>>>> The patch and the discussion could be found here:
>>>>>>>>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
>>>>>>>>>
>>>>>>>>> After that, we did a regression test on 4.19 LTS and the same issue arises.
>>>>>>>>> Further analysis reveled that the issue is from a completely different
>>>>>>>>> cause.
>>>>>>>>
>>>>>>>> What commit in the tree fixed this in newer kernels?  Why can't we just
>>>>>>>> backport that one to 4.19.y as well?
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> greg k-h
>>>>>>>
>>>>>>> Hi Greg,
>>>>>>>
>>>>>>> The fix for mainline is now on linux-next, commit 	d57378d3aa4d ("ima:
>>>>>>> Simplify ima_lsm_copy_rule") and 	c7423dbdbc9ece ("ima: Handle -ESTALE
>>>>>>> returned by ima_filter_rule_match()"). However, these patches cannot be
>>>>>>> picked directly into 4.19.y due to code difference.
>>>>>>
>>>>>> Ok, so it's much more than just 4.19 that's an issue here.  And are
>>>>>> those commits tagged for stable inclusion?
>>>>>
>>>>> Not actually, not on the commit itself.
>>>>
>>>> That's not good.  When they hit Linus's tree, please submit backports to
>>>> the stable mailing list so that they can be picked up.
>>> Thing is these commits cannot be simply backported to 4.19.y. Preceding
>>> patches are missing. How do we do backporting in this situation? Do we
>>> first backport the preceding patches? Or maybe we develop another
>>> solution for 4.19.y?
>>
>> First they need to go to newer kernel trees, and then worry about 4.19.
>> We never want anyone to upgrade to a newer kernel and have a regression.
>>
>> Also, we can't do anything until they hit Linus's tree, as per the
>> stable kernel rules.
> Alright. We'll wait for these patches to be in Linus' tree. But should
> we stick to a backport from mainstream or we form a different solution
> for LTS?
> 
BTW, I have a look into it and if we are backporting mainstream's
solution, we would also needs to backport b16942455193 ("ima: use the
lsm policy update notifier")
-- 
Best
GUO Zihua


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-09  9:38                 ` Guozihua (Scott)
@ 2022-12-09 10:27                   ` Greg KH
  2022-12-12  2:39                     ` Guozihua (Scott)
  0 siblings, 1 reply; 30+ messages in thread
From: Greg KH @ 2022-12-09 10:27 UTC (permalink / raw)
  To: Guozihua (Scott)
  Cc: Mimi Zohar, dmitry.kasatkin, Paul Moore, sds, eparis, sashal,
	selinux, linux-integrity, stable

On Fri, Dec 09, 2022 at 05:38:00PM +0800, Guozihua (Scott) wrote:
> On 2022/12/9 17:32, Guozihua (Scott) wrote:
> > On 2022/12/9 17:22, Greg KH wrote:
> >> On Fri, Dec 09, 2022 at 05:11:40PM +0800, Guozihua (Scott) wrote:
> >>> On 2022/12/9 17:00, Greg KH wrote:
> >>>> On Fri, Dec 09, 2022 at 04:59:17PM +0800, Guozihua (Scott) wrote:
> >>>>> On 2022/12/9 16:46, Greg KH wrote:
> >>>>>> On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote:
> >>>>>>> On 2022/12/9 15:12, Greg KH wrote:
> >>>>>>>> On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
> >>>>>>>>> Hi community.
> >>>>>>>>>
> >>>>>>>>> Previously our team reported a race condition in IMA relates to LSM based
> >>>>>>>>> rules which would case IMA to match files that should be filtered out under
> >>>>>>>>> normal condition. The issue was originally analyzed and fixed on mainstream.
> >>>>>>>>> The patch and the discussion could be found here:
> >>>>>>>>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
> >>>>>>>>>
> >>>>>>>>> After that, we did a regression test on 4.19 LTS and the same issue arises.
> >>>>>>>>> Further analysis reveled that the issue is from a completely different
> >>>>>>>>> cause.
> >>>>>>>>
> >>>>>>>> What commit in the tree fixed this in newer kernels?  Why can't we just
> >>>>>>>> backport that one to 4.19.y as well?
> >>>>>>>>
> >>>>>>>> thanks,
> >>>>>>>>
> >>>>>>>> greg k-h
> >>>>>>>
> >>>>>>> Hi Greg,
> >>>>>>>
> >>>>>>> The fix for mainline is now on linux-next, commit 	d57378d3aa4d ("ima:
> >>>>>>> Simplify ima_lsm_copy_rule") and 	c7423dbdbc9ece ("ima: Handle -ESTALE
> >>>>>>> returned by ima_filter_rule_match()"). However, these patches cannot be
> >>>>>>> picked directly into 4.19.y due to code difference.
> >>>>>>
> >>>>>> Ok, so it's much more than just 4.19 that's an issue here.  And are
> >>>>>> those commits tagged for stable inclusion?
> >>>>>
> >>>>> Not actually, not on the commit itself.
> >>>>
> >>>> That's not good.  When they hit Linus's tree, please submit backports to
> >>>> the stable mailing list so that they can be picked up.
> >>> Thing is these commits cannot be simply backported to 4.19.y. Preceding
> >>> patches are missing. How do we do backporting in this situation? Do we
> >>> first backport the preceding patches? Or maybe we develop another
> >>> solution for 4.19.y?
> >>
> >> First they need to go to newer kernel trees, and then worry about 4.19.
> >> We never want anyone to upgrade to a newer kernel and have a regression.
> >>
> >> Also, we can't do anything until they hit Linus's tree, as per the
> >> stable kernel rules.
> > Alright. We'll wait for these patches to be in Linus' tree. But should
> > we stick to a backport from mainstream or we form a different solution
> > for LTS?

We always want to have a normal backport of what is in Linus's tree if
at all possible.  Whenever we diverge from that, we almost always get it
wrong and have to fix it up again later.

> BTW, I have a look into it and if we are backporting mainstream's
> solution, we would also needs to backport b16942455193 ("ima: use the
> lsm policy update notifier")

That's fine, please just send a patch series to the stable list when
needed.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-09 10:27                   ` Greg KH
@ 2022-12-12  2:39                     ` Guozihua (Scott)
  0 siblings, 0 replies; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-12  2:39 UTC (permalink / raw)
  To: Greg KH
  Cc: Mimi Zohar, dmitry.kasatkin, Paul Moore, sds, eparis, sashal,
	selinux, linux-integrity, stable

On 2022/12/9 18:27, Greg KH wrote:
> On Fri, Dec 09, 2022 at 05:38:00PM +0800, Guozihua (Scott) wrote:
>> On 2022/12/9 17:32, Guozihua (Scott) wrote:
>>> On 2022/12/9 17:22, Greg KH wrote:
>>>> On Fri, Dec 09, 2022 at 05:11:40PM +0800, Guozihua (Scott) wrote:
>>>>> On 2022/12/9 17:00, Greg KH wrote:
>>>>>> On Fri, Dec 09, 2022 at 04:59:17PM +0800, Guozihua (Scott) wrote:
>>>>>>> On 2022/12/9 16:46, Greg KH wrote:
>>>>>>>> On Fri, Dec 09, 2022 at 03:53:25PM +0800, Guozihua (Scott) wrote:
>>>>>>>>> On 2022/12/9 15:12, Greg KH wrote:
>>>>>>>>>> On Fri, Dec 09, 2022 at 03:00:35PM +0800, Guozihua (Scott) wrote:
>>>>>>>>>>> Hi community.
>>>>>>>>>>>
>>>>>>>>>>> Previously our team reported a race condition in IMA relates to LSM based
>>>>>>>>>>> rules which would case IMA to match files that should be filtered out under
>>>>>>>>>>> normal condition. The issue was originally analyzed and fixed on mainstream.
>>>>>>>>>>> The patch and the discussion could be found here:
>>>>>>>>>>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
>>>>>>>>>>>
>>>>>>>>>>> After that, we did a regression test on 4.19 LTS and the same issue arises.
>>>>>>>>>>> Further analysis reveled that the issue is from a completely different
>>>>>>>>>>> cause.
>>>>>>>>>>
>>>>>>>>>> What commit in the tree fixed this in newer kernels?  Why can't we just
>>>>>>>>>> backport that one to 4.19.y as well?
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> greg k-h
>>>>>>>>>
>>>>>>>>> Hi Greg,
>>>>>>>>>
>>>>>>>>> The fix for mainline is now on linux-next, commit 	d57378d3aa4d ("ima:
>>>>>>>>> Simplify ima_lsm_copy_rule") and 	c7423dbdbc9ece ("ima: Handle -ESTALE
>>>>>>>>> returned by ima_filter_rule_match()"). However, these patches cannot be
>>>>>>>>> picked directly into 4.19.y due to code difference.
>>>>>>>>
>>>>>>>> Ok, so it's much more than just 4.19 that's an issue here.  And are
>>>>>>>> those commits tagged for stable inclusion?
>>>>>>>
>>>>>>> Not actually, not on the commit itself.
>>>>>>
>>>>>> That's not good.  When they hit Linus's tree, please submit backports to
>>>>>> the stable mailing list so that they can be picked up.
>>>>> Thing is these commits cannot be simply backported to 4.19.y. Preceding
>>>>> patches are missing. How do we do backporting in this situation? Do we
>>>>> first backport the preceding patches? Or maybe we develop another
>>>>> solution for 4.19.y?
>>>>
>>>> First they need to go to newer kernel trees, and then worry about 4.19.
>>>> We never want anyone to upgrade to a newer kernel and have a regression.
>>>>
>>>> Also, we can't do anything until they hit Linus's tree, as per the
>>>> stable kernel rules.
>>> Alright. We'll wait for these patches to be in Linus' tree. But should
>>> we stick to a backport from mainstream or we form a different solution
>>> for LTS?
> 
> We always want to have a normal backport of what is in Linus's tree if
> at all possible.  Whenever we diverge from that, we almost always get it
> wrong and have to fix it up again later.
> 
>> BTW, I have a look into it and if we are backporting mainstream's
>> solution, we would also needs to backport b16942455193 ("ima: use the
>> lsm policy update notifier")
> 
> That's fine, please just send a patch series to the stable list when
> needed.
> 
> thanks,
> 
> greg k-h

Thanks Greg.

Any thought from Mimi?

-- 
Best
GUO Zihua


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-09  7:00 [RFC] IMA LSM based rule race condition issue on 4.19 LTS Guozihua (Scott)
  2022-12-09  7:12 ` Greg KH
@ 2022-12-13 15:30 ` Mimi Zohar
  2022-12-14  1:33   ` Guozihua (Scott)
  1 sibling, 1 reply; 30+ messages in thread
From: Mimi Zohar @ 2022-12-13 15:30 UTC (permalink / raw)
  To: Guozihua (Scott),
	dmitry.kasatkin, Paul Moore, sds, eparis, Greg KH, sashal
  Cc: selinux, linux-integrity, stable

On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote:
> Hi community.
> 
> Previously our team reported a race condition in IMA relates to LSM 
> based rules which would case IMA to match files that should be filtered 
> out under normal condition. The issue was originally analyzed and fixed 
> on mainstream. The patch and the discussion could be found here: 
> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
> 
> After that, we did a regression test on 4.19 LTS and the same issue 
> arises. Further analysis reveled that the issue is from a completely 
> different cause.
> 
> The cause is that selinux_audit_rule_init() would set the rule (which is 
> a second level pointer) to NULL immediately after called. The relevant 
> codes are as shown:
> 
> security/selinux/ss/services.c:
> > int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> > {
> >         struct selinux_state *state = &selinux_state;
> >         struct policydb *policydb = &state->ss->policydb;
> >         struct selinux_audit_rule *tmprule;
> >         struct role_datum *roledatum;
> >         struct type_datum *typedatum;
> >         struct user_datum *userdatum;
> >         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
> >         int rc = 0;
> > 
> >         *rule = NULL;
> *rule is set to NULL here, which means the rule on IMA side is also NULL.
> > 
> >         if (!state->initialized)
> >                 return -EOPNOTSUPP;
> ...
> > out:
> >         read_unlock(&state->ss->policy_rwlock);
> > 
> >         if (rc) {
> >                 selinux_audit_rule_free(tmprule);
> >                 tmprule = NULL;
> >         }
> > 
> >         *rule = tmprule;
> rule is updated at the end of the function.
> > 
> >         return rc;
> > }
> 
> security/integrity/ima/ima_policy.c:
> > static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
> >                             const struct cred *cred, u32 secid,
> >                             enum ima_hooks func, int mask)
> > {...
> > for (i = 0; i < MAX_LSM_RULES; i++) {
> >                 int rc = 0;
> >                 u32 osid;
> >                 int retried = 0;
> > 
> >                 if (!rule->lsm[i].rule)
> >                         continue;
> Setting rule to NULL would lead to LSM based rule matching being skipped.
> > retry:
> >                 switch (i) {
> 
> To solve this issue, there are multiple approaches we might take and I 
> would like some input from the community.
> 
> The first proposed solution would be to change 
> selinux_audit_rule_init(). Remove the set to NULL bit and update the 
> rule pointer with cmpxchg.
> 
> > diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
> > index a9f2bc8443bd..aa74b04ccaf7 100644
> > --- a/security/selinux/ss/services.c
> > +++ b/security/selinux/ss/services.c
> > @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> >         struct type_datum *typedatum;
> >         struct user_datum *userdatum;
> >         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
> > +       struct selinux_audit_rule *orig = rule;
> >         int rc = 0;
> >  
> > -       *rule = NULL;
> > -
> >         if (!state->initialized)
> >                 return -EOPNOTSUPP;
> >  
> > @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> >                 tmprule = NULL;
> >         }
> >  
> > -       *rule = tmprule;
> > +       if (cmpxchg(rule, orig, tmprule) != orig)
> > +               selinux_audit_rule_free(tmprule);
> >  
> >         return rc;
> >  }
> 
> This solution would be an easy fix, but might influence other modules 
> calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, 
> only auditfilter and IMA it seems). And it might be worth returning an 
> error code such as -EAGAIN.
> 
> Or, we can access rules via RCU, similar to what we do on 5.10. This 
> could means more code change and testing.

In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as
needed.  IMA waits for selinux_audit_rule_init() to complete and
shouldn't see NULL, unless there is an SELinux failure.  Before
"fixing" the problem, what exactly is the problem?

thanks,

Mimi


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-13 15:30 ` Mimi Zohar
@ 2022-12-14  1:33   ` Guozihua (Scott)
  2022-12-14 12:19     ` Mimi Zohar
  0 siblings, 1 reply; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-14  1:33 UTC (permalink / raw)
  To: Mimi Zohar, dmitry.kasatkin, Paul Moore, sds, eparis, Greg KH, sashal
  Cc: selinux, linux-integrity, stable

On 2022/12/13 23:30, Mimi Zohar wrote:
> On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote:
>> Hi community.
>>
>> Previously our team reported a race condition in IMA relates to LSM 
>> based rules which would case IMA to match files that should be filtered 
>> out under normal condition. The issue was originally analyzed and fixed 
>> on mainstream. The patch and the discussion could be found here: 
>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
>>
>> After that, we did a regression test on 4.19 LTS and the same issue 
>> arises. Further analysis reveled that the issue is from a completely 
>> different cause.
>>
>> The cause is that selinux_audit_rule_init() would set the rule (which is 
>> a second level pointer) to NULL immediately after called. The relevant 
>> codes are as shown:
>>
>> security/selinux/ss/services.c:
>>> int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
>>> {
>>>         struct selinux_state *state = &selinux_state;
>>>         struct policydb *policydb = &state->ss->policydb;
>>>         struct selinux_audit_rule *tmprule;
>>>         struct role_datum *roledatum;
>>>         struct type_datum *typedatum;
>>>         struct user_datum *userdatum;
>>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
>>>         int rc = 0;
>>>
>>>         *rule = NULL;
>> *rule is set to NULL here, which means the rule on IMA side is also NULL.
>>>
>>>         if (!state->initialized)
>>>                 return -EOPNOTSUPP;
>> ...
>>> out:
>>>         read_unlock(&state->ss->policy_rwlock);
>>>
>>>         if (rc) {
>>>                 selinux_audit_rule_free(tmprule);
>>>                 tmprule = NULL;
>>>         }
>>>
>>>         *rule = tmprule;
>> rule is updated at the end of the function.
>>>
>>>         return rc;
>>> }
>>
>> security/integrity/ima/ima_policy.c:
>>> static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
>>>                             const struct cred *cred, u32 secid,
>>>                             enum ima_hooks func, int mask)
>>> {...
>>> for (i = 0; i < MAX_LSM_RULES; i++) {
>>>                 int rc = 0;
>>>                 u32 osid;
>>>                 int retried = 0;
>>>
>>>                 if (!rule->lsm[i].rule)
>>>                         continue;
>> Setting rule to NULL would lead to LSM based rule matching being skipped.
>>> retry:
>>>                 switch (i) {
>>
>> To solve this issue, there are multiple approaches we might take and I 
>> would like some input from the community.
>>
>> The first proposed solution would be to change 
>> selinux_audit_rule_init(). Remove the set to NULL bit and update the 
>> rule pointer with cmpxchg.
>>
>>> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
>>> index a9f2bc8443bd..aa74b04ccaf7 100644
>>> --- a/security/selinux/ss/services.c
>>> +++ b/security/selinux/ss/services.c
>>> @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
>>>         struct type_datum *typedatum;
>>>         struct user_datum *userdatum;
>>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
>>> +       struct selinux_audit_rule *orig = rule;
>>>         int rc = 0;
>>>  
>>> -       *rule = NULL;
>>> -
>>>         if (!state->initialized)
>>>                 return -EOPNOTSUPP;
>>>  
>>> @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
>>>                 tmprule = NULL;
>>>         }
>>>  
>>> -       *rule = tmprule;
>>> +       if (cmpxchg(rule, orig, tmprule) != orig)
>>> +               selinux_audit_rule_free(tmprule);
>>>  
>>>         return rc;
>>>  }
>>
>> This solution would be an easy fix, but might influence other modules 
>> calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, 
>> only auditfilter and IMA it seems). And it might be worth returning an 
>> error code such as -EAGAIN.
>>
>> Or, we can access rules via RCU, similar to what we do on 5.10. This 
>> could means more code change and testing.
> 
> In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as
> needed.  IMA waits for selinux_audit_rule_init() to complete and
> shouldn't see NULL, unless there is an SELinux failure.  Before
> "fixing" the problem, what exactly is the problem?

IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL
LSM based rules in one go without using RCU, which would still allow
other cores to access the rule being updated. And that's the issue.

An example scenario would be:
	CPU1			|	CPU2
opened a file and starts	|
updating LSM based rules.	|
				| opened a file and starts
				| matching rules.
				|
set a LSM based rule to NULL.	| access the same LSM based rule and
 				| see that it's NULL.

In this situation, CPU 2 would recognize this rule as not LSM based and
ignore the LSM part of the rule while matching.

-- 
Best
GUO Zihua


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-14  1:33   ` Guozihua (Scott)
@ 2022-12-14 12:19     ` Mimi Zohar
  2022-12-15  8:51       ` Guozihua (Scott)
  0 siblings, 1 reply; 30+ messages in thread
From: Mimi Zohar @ 2022-12-14 12:19 UTC (permalink / raw)
  To: Guozihua (Scott),
	dmitry.kasatkin, Paul Moore, sds, eparis, Greg KH, sashal
  Cc: selinux, linux-integrity, stable

On Wed, 2022-12-14 at 09:33 +0800, Guozihua (Scott) wrote:
> On 2022/12/13 23:30, Mimi Zohar wrote:
> > On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote:
> >> Hi community.
> >>
> >> Previously our team reported a race condition in IMA relates to LSM 
> >> based rules which would case IMA to match files that should be filtered 
> >> out under normal condition. The issue was originally analyzed and fixed 
> >> on mainstream. The patch and the discussion could be found here: 
> >> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
> >>
> >> After that, we did a regression test on 4.19 LTS and the same issue 
> >> arises. Further analysis reveled that the issue is from a completely 
> >> different cause.
> >>
> >> The cause is that selinux_audit_rule_init() would set the rule (which is 
> >> a second level pointer) to NULL immediately after called. The relevant 
> >> codes are as shown:
> >>
> >> security/selinux/ss/services.c:
> >>> int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> >>> {
> >>>         struct selinux_state *state = &selinux_state;
> >>>         struct policydb *policydb = &state->ss->policydb;
> >>>         struct selinux_audit_rule *tmprule;
> >>>         struct role_datum *roledatum;
> >>>         struct type_datum *typedatum;
> >>>         struct user_datum *userdatum;
> >>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
> >>>         int rc = 0;
> >>>
> >>>         *rule = NULL;
> >> *rule is set to NULL here, which means the rule on IMA side is also NULL.
> >>>
> >>>         if (!state->initialized)
> >>>                 return -EOPNOTSUPP;
> >> ...
> >>> out:
> >>>         read_unlock(&state->ss->policy_rwlock);
> >>>
> >>>         if (rc) {
> >>>                 selinux_audit_rule_free(tmprule);
> >>>                 tmprule = NULL;
> >>>         }
> >>>
> >>>         *rule = tmprule;
> >> rule is updated at the end of the function.
> >>>
> >>>         return rc;
> >>> }
> >>
> >> security/integrity/ima/ima_policy.c:
> >>> static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
> >>>                             const struct cred *cred, u32 secid,
> >>>                             enum ima_hooks func, int mask)
> >>> {...
> >>> for (i = 0; i < MAX_LSM_RULES; i++) {
> >>>                 int rc = 0;
> >>>                 u32 osid;
> >>>                 int retried = 0;
> >>>
> >>>                 if (!rule->lsm[i].rule)
> >>>                         continue;
> >> Setting rule to NULL would lead to LSM based rule matching being skipped.
> >>> retry:
> >>>                 switch (i) {
> >>
> >> To solve this issue, there are multiple approaches we might take and I 
> >> would like some input from the community.
> >>
> >> The first proposed solution would be to change 
> >> selinux_audit_rule_init(). Remove the set to NULL bit and update the 
> >> rule pointer with cmpxchg.
> >>
> >>> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
> >>> index a9f2bc8443bd..aa74b04ccaf7 100644
> >>> --- a/security/selinux/ss/services.c
> >>> +++ b/security/selinux/ss/services.c
> >>> @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> >>>         struct type_datum *typedatum;
> >>>         struct user_datum *userdatum;
> >>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
> >>> +       struct selinux_audit_rule *orig = rule;
> >>>         int rc = 0;
> >>>  
> >>> -       *rule = NULL;
> >>> -
> >>>         if (!state->initialized)
> >>>                 return -EOPNOTSUPP;
> >>>  
> >>> @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> >>>                 tmprule = NULL;
> >>>         }
> >>>  
> >>> -       *rule = tmprule;
> >>> +       if (cmpxchg(rule, orig, tmprule) != orig)
> >>> +               selinux_audit_rule_free(tmprule);
> >>>  
> >>>         return rc;
> >>>  }
> >>
> >> This solution would be an easy fix, but might influence other modules 
> >> calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, 
> >> only auditfilter and IMA it seems). And it might be worth returning an 
> >> error code such as -EAGAIN.
> >>
> >> Or, we can access rules via RCU, similar to what we do on 5.10. This 
> >> could means more code change and testing.
> > 
> > In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as
> > needed.  IMA waits for selinux_audit_rule_init() to complete and
> > shouldn't see NULL, unless there is an SELinux failure.  Before
> > "fixing" the problem, what exactly is the problem?
> 
> IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL
> LSM based rules in one go without using RCU, which would still allow
> other cores to access the rule being updated. And that's the issue.
> 
> An example scenario would be:
> 	CPU1			|	CPU2
> opened a file and starts	|
> updating LSM based rules.	|
> 				| opened a file and starts
> 				| matching rules.
> 				|
> set a LSM based rule to NULL.	| access the same LSM based rule and
>  				| see that it's NULL.
> 
> In this situation, CPU 2 would recognize this rule as not LSM based and
> ignore the LSM part of the rule while matching.

Would picking up just ima_lsm_update_rule(), without changing to the
lsm policy update notifier, from upstream and calling it from
ima_lsm_update_rules() resolve the RCU locking issue?  Or are there
other issues?

thanks,

Mimi



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-14 12:19     ` Mimi Zohar
@ 2022-12-15  8:51       ` Guozihua (Scott)
  2022-12-15 10:49         ` Mimi Zohar
  0 siblings, 1 reply; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-15  8:51 UTC (permalink / raw)
  To: Mimi Zohar, dmitry.kasatkin, Paul Moore, sds, eparis, Greg KH, sashal
  Cc: selinux, linux-integrity, stable

On 2022/12/14 20:19, Mimi Zohar wrote:
> On Wed, 2022-12-14 at 09:33 +0800, Guozihua (Scott) wrote:
>> On 2022/12/13 23:30, Mimi Zohar wrote:
>>> On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote:
>>>> Hi community.
>>>>
>>>> Previously our team reported a race condition in IMA relates to LSM 
>>>> based rules which would case IMA to match files that should be filtered 
>>>> out under normal condition. The issue was originally analyzed and fixed 
>>>> on mainstream. The patch and the discussion could be found here: 
>>>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
>>>>
>>>> After that, we did a regression test on 4.19 LTS and the same issue 
>>>> arises. Further analysis reveled that the issue is from a completely 
>>>> different cause.
>>>>
>>>> The cause is that selinux_audit_rule_init() would set the rule (which is 
>>>> a second level pointer) to NULL immediately after called. The relevant 
>>>> codes are as shown:
>>>>
>>>> security/selinux/ss/services.c:
>>>>> int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
>>>>> {
>>>>>         struct selinux_state *state = &selinux_state;
>>>>>         struct policydb *policydb = &state->ss->policydb;
>>>>>         struct selinux_audit_rule *tmprule;
>>>>>         struct role_datum *roledatum;
>>>>>         struct type_datum *typedatum;
>>>>>         struct user_datum *userdatum;
>>>>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
>>>>>         int rc = 0;
>>>>>
>>>>>         *rule = NULL;
>>>> *rule is set to NULL here, which means the rule on IMA side is also NULL.
>>>>>
>>>>>         if (!state->initialized)
>>>>>                 return -EOPNOTSUPP;
>>>> ...
>>>>> out:
>>>>>         read_unlock(&state->ss->policy_rwlock);
>>>>>
>>>>>         if (rc) {
>>>>>                 selinux_audit_rule_free(tmprule);
>>>>>                 tmprule = NULL;
>>>>>         }
>>>>>
>>>>>         *rule = tmprule;
>>>> rule is updated at the end of the function.
>>>>>
>>>>>         return rc;
>>>>> }
>>>>
>>>> security/integrity/ima/ima_policy.c:
>>>>> static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
>>>>>                             const struct cred *cred, u32 secid,
>>>>>                             enum ima_hooks func, int mask)
>>>>> {...
>>>>> for (i = 0; i < MAX_LSM_RULES; i++) {
>>>>>                 int rc = 0;
>>>>>                 u32 osid;
>>>>>                 int retried = 0;
>>>>>
>>>>>                 if (!rule->lsm[i].rule)
>>>>>                         continue;
>>>> Setting rule to NULL would lead to LSM based rule matching being skipped.
>>>>> retry:
>>>>>                 switch (i) {
>>>>
>>>> To solve this issue, there are multiple approaches we might take and I 
>>>> would like some input from the community.
>>>>
>>>> The first proposed solution would be to change 
>>>> selinux_audit_rule_init(). Remove the set to NULL bit and update the 
>>>> rule pointer with cmpxchg.
>>>>
>>>>> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
>>>>> index a9f2bc8443bd..aa74b04ccaf7 100644
>>>>> --- a/security/selinux/ss/services.c
>>>>> +++ b/security/selinux/ss/services.c
>>>>> @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
>>>>>         struct type_datum *typedatum;
>>>>>         struct user_datum *userdatum;
>>>>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
>>>>> +       struct selinux_audit_rule *orig = rule;
>>>>>         int rc = 0;
>>>>>  
>>>>> -       *rule = NULL;
>>>>> -
>>>>>         if (!state->initialized)
>>>>>                 return -EOPNOTSUPP;
>>>>>  
>>>>> @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
>>>>>                 tmprule = NULL;
>>>>>         }
>>>>>  
>>>>> -       *rule = tmprule;
>>>>> +       if (cmpxchg(rule, orig, tmprule) != orig)
>>>>> +               selinux_audit_rule_free(tmprule);
>>>>>  
>>>>>         return rc;
>>>>>  }
>>>>
>>>> This solution would be an easy fix, but might influence other modules 
>>>> calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, 
>>>> only auditfilter and IMA it seems). And it might be worth returning an 
>>>> error code such as -EAGAIN.
>>>>
>>>> Or, we can access rules via RCU, similar to what we do on 5.10. This 
>>>> could means more code change and testing.
>>>
>>> In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as
>>> needed.  IMA waits for selinux_audit_rule_init() to complete and
>>> shouldn't see NULL, unless there is an SELinux failure.  Before
>>> "fixing" the problem, what exactly is the problem?
>>
>> IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL
>> LSM based rules in one go without using RCU, which would still allow
>> other cores to access the rule being updated. And that's the issue.
>>
>> An example scenario would be:
>> 	CPU1			|	CPU2
>> opened a file and starts	|
>> updating LSM based rules.	|
>> 				| opened a file and starts
>> 				| matching rules.
>> 				|
>> set a LSM based rule to NULL.	| access the same LSM based rule and
>>  				| see that it's NULL.
>>
>> In this situation, CPU 2 would recognize this rule as not LSM based and
>> ignore the LSM part of the rule while matching.
> 
> Would picking up just ima_lsm_update_rule(), without changing to the
> lsm policy update notifier, from upstream and calling it from
> ima_lsm_update_rules() resolve the RCU locking issue?  Or are there
> other issues?

Hi Mimi,

That should resolve the issue just fine. However, that'd mean having a
customized ima_lsm_update_rules as well as a customized
ima_lsm_update_rule functions on 4.19.y, potentially decrease the
maintainability. The customization of the two functions mentioned above
would be to ripoff the RCU part so that we can leave the other rule-list
access untouched.

-- 
Best
GUO Zihua


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-15  8:51       ` Guozihua (Scott)
@ 2022-12-15 10:49         ` Mimi Zohar
  2022-12-15 13:15           ` Guozihua (Scott)
  0 siblings, 1 reply; 30+ messages in thread
From: Mimi Zohar @ 2022-12-15 10:49 UTC (permalink / raw)
  To: Guozihua (Scott),
	dmitry.kasatkin, Paul Moore, sds, eparis, Greg KH, sashal
  Cc: selinux, linux-integrity, stable

On Thu, 2022-12-15 at 16:51 +0800, Guozihua (Scott) wrote:
> On 2022/12/14 20:19, Mimi Zohar wrote:
> > On Wed, 2022-12-14 at 09:33 +0800, Guozihua (Scott) wrote:
> >> On 2022/12/13 23:30, Mimi Zohar wrote:
> >>> On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote:
> >>>> Hi community.
> >>>>
> >>>> Previously our team reported a race condition in IMA relates to LSM 
> >>>> based rules which would case IMA to match files that should be filtered 
> >>>> out under normal condition. The issue was originally analyzed and fixed 
> >>>> on mainstream. The patch and the discussion could be found here: 
> >>>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
> >>>>
> >>>> After that, we did a regression test on 4.19 LTS and the same issue 
> >>>> arises. Further analysis reveled that the issue is from a completely 
> >>>> different cause.
> >>>>
> >>>> The cause is that selinux_audit_rule_init() would set the rule (which is 
> >>>> a second level pointer) to NULL immediately after called. The relevant 
> >>>> codes are as shown:
> >>>>
> >>>> security/selinux/ss/services.c:
> >>>>> int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> >>>>> {
> >>>>>         struct selinux_state *state = &selinux_state;
> >>>>>         struct policydb *policydb = &state->ss->policydb;
> >>>>>         struct selinux_audit_rule *tmprule;
> >>>>>         struct role_datum *roledatum;
> >>>>>         struct type_datum *typedatum;
> >>>>>         struct user_datum *userdatum;
> >>>>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
> >>>>>         int rc = 0;
> >>>>>
> >>>>>         *rule = NULL;
> >>>> *rule is set to NULL here, which means the rule on IMA side is also NULL.
> >>>>>
> >>>>>         if (!state->initialized)
> >>>>>                 return -EOPNOTSUPP;
> >>>> ...
> >>>>> out:
> >>>>>         read_unlock(&state->ss->policy_rwlock);
> >>>>>
> >>>>>         if (rc) {
> >>>>>                 selinux_audit_rule_free(tmprule);
> >>>>>                 tmprule = NULL;
> >>>>>         }
> >>>>>
> >>>>>         *rule = tmprule;
> >>>> rule is updated at the end of the function.
> >>>>>
> >>>>>         return rc;
> >>>>> }
> >>>>
> >>>> security/integrity/ima/ima_policy.c:
> >>>>> static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
> >>>>>                             const struct cred *cred, u32 secid,
> >>>>>                             enum ima_hooks func, int mask)
> >>>>> {...
> >>>>> for (i = 0; i < MAX_LSM_RULES; i++) {
> >>>>>                 int rc = 0;
> >>>>>                 u32 osid;
> >>>>>                 int retried = 0;
> >>>>>
> >>>>>                 if (!rule->lsm[i].rule)
> >>>>>                         continue;
> >>>> Setting rule to NULL would lead to LSM based rule matching being skipped.
> >>>>> retry:
> >>>>>                 switch (i) {
> >>>>
> >>>> To solve this issue, there are multiple approaches we might take and I 
> >>>> would like some input from the community.
> >>>>
> >>>> The first proposed solution would be to change 
> >>>> selinux_audit_rule_init(). Remove the set to NULL bit and update the 
> >>>> rule pointer with cmpxchg.
> >>>>
> >>>>> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
> >>>>> index a9f2bc8443bd..aa74b04ccaf7 100644
> >>>>> --- a/security/selinux/ss/services.c
> >>>>> +++ b/security/selinux/ss/services.c
> >>>>> @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> >>>>>         struct type_datum *typedatum;
> >>>>>         struct user_datum *userdatum;
> >>>>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
> >>>>> +       struct selinux_audit_rule *orig = rule;
> >>>>>         int rc = 0;
> >>>>>  
> >>>>> -       *rule = NULL;
> >>>>> -
> >>>>>         if (!state->initialized)
> >>>>>                 return -EOPNOTSUPP;
> >>>>>  
> >>>>> @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> >>>>>                 tmprule = NULL;
> >>>>>         }
> >>>>>  
> >>>>> -       *rule = tmprule;
> >>>>> +       if (cmpxchg(rule, orig, tmprule) != orig)
> >>>>> +               selinux_audit_rule_free(tmprule);
> >>>>>  
> >>>>>         return rc;
> >>>>>  }
> >>>>
> >>>> This solution would be an easy fix, but might influence other modules 
> >>>> calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, 
> >>>> only auditfilter and IMA it seems). And it might be worth returning an 
> >>>> error code such as -EAGAIN.
> >>>>
> >>>> Or, we can access rules via RCU, similar to what we do on 5.10. This 
> >>>> could means more code change and testing.
> >>>
> >>> In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as
> >>> needed.  IMA waits for selinux_audit_rule_init() to complete and
> >>> shouldn't see NULL, unless there is an SELinux failure.  Before
> >>> "fixing" the problem, what exactly is the problem?
> >>
> >> IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL
> >> LSM based rules in one go without using RCU, which would still allow
> >> other cores to access the rule being updated. And that's the issue.
> >>
> >> An example scenario would be:
> >> 	CPU1			|	CPU2
> >> opened a file and starts	|
> >> updating LSM based rules.	|
> >> 				| opened a file and starts
> >> 				| matching rules.
> >> 				|
> >> set a LSM based rule to NULL.	| access the same LSM based rule and
> >>  				| see that it's NULL.
> >>
> >> In this situation, CPU 2 would recognize this rule as not LSM based and
> >> ignore the LSM part of the rule while matching.
> > 
> > Would picking up just ima_lsm_update_rule(), without changing to the
> > lsm policy update notifier, from upstream and calling it from
> > ima_lsm_update_rules() resolve the RCU locking issue?  Or are there
> > other issues?
> 
> Hi Mimi,
> 
> That should resolve the issue just fine. However, that'd mean having a
> customized ima_lsm_update_rules as well as a customized
> ima_lsm_update_rule functions on 4.19.y, potentially decrease the
> maintainability. The customization of the two functions mentioned above
> would be to ripoff the RCU part so that we can leave the other rule-list
> access untouched.

Hi Scott,

Neither do we want to backport new functionality.  Perhaps it is only a
subset of commit b16942455193 ("ima: use the lsm policy update
notifier").

Although your suggested solution of using "cmpxchg" isn't needed in
recent kernels,  it wouldn't hurt either, right?  Assuming that Paul
would be willing to accept it, that could be another option.

-- 
thanks,

Mimi


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-15 10:49         ` Mimi Zohar
@ 2022-12-15 13:15           ` Guozihua (Scott)
  2022-12-15 14:30             ` Mimi Zohar
  0 siblings, 1 reply; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-15 13:15 UTC (permalink / raw)
  To: Mimi Zohar, dmitry.kasatkin, Paul Moore, sds, eparis, Greg KH, sashal
  Cc: selinux, linux-integrity, stable

On 2022/12/15 18:49, Mimi Zohar wrote:
> On Thu, 2022-12-15 at 16:51 +0800, Guozihua (Scott) wrote:
>> On 2022/12/14 20:19, Mimi Zohar wrote:
>>> On Wed, 2022-12-14 at 09:33 +0800, Guozihua (Scott) wrote:
>>>> On 2022/12/13 23:30, Mimi Zohar wrote:
>>>>> On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote:
>>>>>> Hi community.
>>>>>>
>>>>>> Previously our team reported a race condition in IMA relates to LSM 
>>>>>> based rules which would case IMA to match files that should be filtered 
>>>>>> out under normal condition. The issue was originally analyzed and fixed 
>>>>>> on mainstream. The patch and the discussion could be found here: 
>>>>>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
>>>>>>
>>>>>> After that, we did a regression test on 4.19 LTS and the same issue 
>>>>>> arises. Further analysis reveled that the issue is from a completely 
>>>>>> different cause.
>>>>>>
>>>>>> The cause is that selinux_audit_rule_init() would set the rule (which is 
>>>>>> a second level pointer) to NULL immediately after called. The relevant 
>>>>>> codes are as shown:
>>>>>>
>>>>>> security/selinux/ss/services.c:
>>>>>>> int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
>>>>>>> {
>>>>>>>         struct selinux_state *state = &selinux_state;
>>>>>>>         struct policydb *policydb = &state->ss->policydb;
>>>>>>>         struct selinux_audit_rule *tmprule;
>>>>>>>         struct role_datum *roledatum;
>>>>>>>         struct type_datum *typedatum;
>>>>>>>         struct user_datum *userdatum;
>>>>>>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
>>>>>>>         int rc = 0;
>>>>>>>
>>>>>>>         *rule = NULL;
>>>>>> *rule is set to NULL here, which means the rule on IMA side is also NULL.
>>>>>>>
>>>>>>>         if (!state->initialized)
>>>>>>>                 return -EOPNOTSUPP;
>>>>>> ...
>>>>>>> out:
>>>>>>>         read_unlock(&state->ss->policy_rwlock);
>>>>>>>
>>>>>>>         if (rc) {
>>>>>>>                 selinux_audit_rule_free(tmprule);
>>>>>>>                 tmprule = NULL;
>>>>>>>         }
>>>>>>>
>>>>>>>         *rule = tmprule;
>>>>>> rule is updated at the end of the function.
>>>>>>>
>>>>>>>         return rc;
>>>>>>> }
>>>>>>
>>>>>> security/integrity/ima/ima_policy.c:
>>>>>>> static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
>>>>>>>                             const struct cred *cred, u32 secid,
>>>>>>>                             enum ima_hooks func, int mask)
>>>>>>> {...
>>>>>>> for (i = 0; i < MAX_LSM_RULES; i++) {
>>>>>>>                 int rc = 0;
>>>>>>>                 u32 osid;
>>>>>>>                 int retried = 0;
>>>>>>>
>>>>>>>                 if (!rule->lsm[i].rule)
>>>>>>>                         continue;
>>>>>> Setting rule to NULL would lead to LSM based rule matching being skipped.
>>>>>>> retry:
>>>>>>>                 switch (i) {
>>>>>>
>>>>>> To solve this issue, there are multiple approaches we might take and I 
>>>>>> would like some input from the community.
>>>>>>
>>>>>> The first proposed solution would be to change 
>>>>>> selinux_audit_rule_init(). Remove the set to NULL bit and update the 
>>>>>> rule pointer with cmpxchg.
>>>>>>
>>>>>>> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
>>>>>>> index a9f2bc8443bd..aa74b04ccaf7 100644
>>>>>>> --- a/security/selinux/ss/services.c
>>>>>>> +++ b/security/selinux/ss/services.c
>>>>>>> @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
>>>>>>>         struct type_datum *typedatum;
>>>>>>>         struct user_datum *userdatum;
>>>>>>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
>>>>>>> +       struct selinux_audit_rule *orig = rule;
>>>>>>>         int rc = 0;
>>>>>>>  
>>>>>>> -       *rule = NULL;
>>>>>>> -
>>>>>>>         if (!state->initialized)
>>>>>>>                 return -EOPNOTSUPP;
>>>>>>>  
>>>>>>> @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
>>>>>>>                 tmprule = NULL;
>>>>>>>         }
>>>>>>>  
>>>>>>> -       *rule = tmprule;
>>>>>>> +       if (cmpxchg(rule, orig, tmprule) != orig)
>>>>>>> +               selinux_audit_rule_free(tmprule);
>>>>>>>  
>>>>>>>         return rc;
>>>>>>>  }
>>>>>>
>>>>>> This solution would be an easy fix, but might influence other modules 
>>>>>> calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, 
>>>>>> only auditfilter and IMA it seems). And it might be worth returning an 
>>>>>> error code such as -EAGAIN.
>>>>>>
>>>>>> Or, we can access rules via RCU, similar to what we do on 5.10. This 
>>>>>> could means more code change and testing.
>>>>>
>>>>> In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as
>>>>> needed.  IMA waits for selinux_audit_rule_init() to complete and
>>>>> shouldn't see NULL, unless there is an SELinux failure.  Before
>>>>> "fixing" the problem, what exactly is the problem?
>>>>
>>>> IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL
>>>> LSM based rules in one go without using RCU, which would still allow
>>>> other cores to access the rule being updated. And that's the issue.
>>>>
>>>> An example scenario would be:
>>>> 	CPU1			|	CPU2
>>>> opened a file and starts	|
>>>> updating LSM based rules.	|
>>>> 				| opened a file and starts
>>>> 				| matching rules.
>>>> 				|
>>>> set a LSM based rule to NULL.	| access the same LSM based rule and
>>>>  				| see that it's NULL.
>>>>
>>>> In this situation, CPU 2 would recognize this rule as not LSM based and
>>>> ignore the LSM part of the rule while matching.
>>>
>>> Would picking up just ima_lsm_update_rule(), without changing to the
>>> lsm policy update notifier, from upstream and calling it from
>>> ima_lsm_update_rules() resolve the RCU locking issue?  Or are there
>>> other issues?
>>
>> Hi Mimi,
>>
>> That should resolve the issue just fine. However, that'd mean having a
>> customized ima_lsm_update_rules as well as a customized
>> ima_lsm_update_rule functions on 4.19.y, potentially decrease the
>> maintainability. The customization of the two functions mentioned above
>> would be to ripoff the RCU part so that we can leave the other rule-list
>> access untouched.
> 
> Hi Scott,
> 
> Neither do we want to backport new functionality.  Perhaps it is only a
> subset of commit b16942455193 ("ima: use the lsm policy update
> notifier").
Yes we are able to backport part of this commit to get a mainline-like
behavior which would solve the issue at hand as well. However from a
maintenance standpoint it might be better to form a different commit and
not re-use the commit message from mainline commit.
> 
> Although your suggested solution of using "cmpxchg" isn't needed in
> recent kernels,  it wouldn't hurt either, right?  Assuming that Paul
> would be willing to accept it, that could be another option.
The cmpxchg part is merely to avoid multiple updates on the same rule
item. However I am not so sure why SELinux would set the rule to NULL.
My guess is that SELinux is trying to stop others from using an
invalidated rule?

Would Paul be able to provide some insight? as well as some Comment on
the proposed solution?

-- 
Best
GUO Zihua


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-15 13:15           ` Guozihua (Scott)
@ 2022-12-15 14:30             ` Mimi Zohar
  2022-12-15 21:04               ` Paul Moore
  0 siblings, 1 reply; 30+ messages in thread
From: Mimi Zohar @ 2022-12-15 14:30 UTC (permalink / raw)
  To: Guozihua (Scott),
	dmitry.kasatkin, Paul Moore, sds, eparis, Greg KH, sashal
  Cc: selinux, linux-integrity, stable

On Thu, 2022-12-15 at 21:15 +0800, Guozihua (Scott) wrote:
> On 2022/12/15 18:49, Mimi Zohar wrote:
> > On Thu, 2022-12-15 at 16:51 +0800, Guozihua (Scott) wrote:
> >> On 2022/12/14 20:19, Mimi Zohar wrote:
> >>> On Wed, 2022-12-14 at 09:33 +0800, Guozihua (Scott) wrote:
> >>>> On 2022/12/13 23:30, Mimi Zohar wrote:
> >>>>> On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote:
> >>>>>> Hi community.
> >>>>>>
> >>>>>> Previously our team reported a race condition in IMA relates to LSM 
> >>>>>> based rules which would case IMA to match files that should be filtered 
> >>>>>> out under normal condition. The issue was originally analyzed and fixed 
> >>>>>> on mainstream. The patch and the discussion could be found here: 
> >>>>>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
> >>>>>>
> >>>>>> After that, we did a regression test on 4.19 LTS and the same issue 
> >>>>>> arises. Further analysis reveled that the issue is from a completely 
> >>>>>> different cause.
> >>>>>>
> >>>>>> The cause is that selinux_audit_rule_init() would set the rule (which is 
> >>>>>> a second level pointer) to NULL immediately after called. The relevant 
> >>>>>> codes are as shown:
> >>>>>>
> >>>>>> security/selinux/ss/services.c:
> >>>>>>> int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> >>>>>>> {
> >>>>>>>         struct selinux_state *state = &selinux_state;
> >>>>>>>         struct policydb *policydb = &state->ss->policydb;
> >>>>>>>         struct selinux_audit_rule *tmprule;
> >>>>>>>         struct role_datum *roledatum;
> >>>>>>>         struct type_datum *typedatum;
> >>>>>>>         struct user_datum *userdatum;
> >>>>>>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
> >>>>>>>         int rc = 0;
> >>>>>>>
> >>>>>>>         *rule = NULL;
> >>>>>> *rule is set to NULL here, which means the rule on IMA side is also NULL.
> >>>>>>>
> >>>>>>>         if (!state->initialized)
> >>>>>>>                 return -EOPNOTSUPP;
> >>>>>> ...
> >>>>>>> out:
> >>>>>>>         read_unlock(&state->ss->policy_rwlock);
> >>>>>>>
> >>>>>>>         if (rc) {
> >>>>>>>                 selinux_audit_rule_free(tmprule);
> >>>>>>>                 tmprule = NULL;
> >>>>>>>         }
> >>>>>>>
> >>>>>>>         *rule = tmprule;
> >>>>>> rule is updated at the end of the function.
> >>>>>>>
> >>>>>>>         return rc;
> >>>>>>> }
> >>>>>>
> >>>>>> security/integrity/ima/ima_policy.c:
> >>>>>>> static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
> >>>>>>>                             const struct cred *cred, u32 secid,
> >>>>>>>                             enum ima_hooks func, int mask)
> >>>>>>> {...
> >>>>>>> for (i = 0; i < MAX_LSM_RULES; i++) {
> >>>>>>>                 int rc = 0;
> >>>>>>>                 u32 osid;
> >>>>>>>                 int retried = 0;
> >>>>>>>
> >>>>>>>                 if (!rule->lsm[i].rule)
> >>>>>>>                         continue;
> >>>>>> Setting rule to NULL would lead to LSM based rule matching being skipped.
> >>>>>>> retry:
> >>>>>>>                 switch (i) {
> >>>>>>
> >>>>>> To solve this issue, there are multiple approaches we might take and I 
> >>>>>> would like some input from the community.
> >>>>>>
> >>>>>> The first proposed solution would be to change 
> >>>>>> selinux_audit_rule_init(). Remove the set to NULL bit and update the 
> >>>>>> rule pointer with cmpxchg.
> >>>>>>
> >>>>>>> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
> >>>>>>> index a9f2bc8443bd..aa74b04ccaf7 100644
> >>>>>>> --- a/security/selinux/ss/services.c
> >>>>>>> +++ b/security/selinux/ss/services.c
> >>>>>>> @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> >>>>>>>         struct type_datum *typedatum;
> >>>>>>>         struct user_datum *userdatum;
> >>>>>>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
> >>>>>>> +       struct selinux_audit_rule *orig = rule;
> >>>>>>>         int rc = 0;
> >>>>>>>  
> >>>>>>> -       *rule = NULL;
> >>>>>>> -
> >>>>>>>         if (!state->initialized)
> >>>>>>>                 return -EOPNOTSUPP;
> >>>>>>>  
> >>>>>>> @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> >>>>>>>                 tmprule = NULL;
> >>>>>>>         }
> >>>>>>>  
> >>>>>>> -       *rule = tmprule;
> >>>>>>> +       if (cmpxchg(rule, orig, tmprule) != orig)
> >>>>>>> +               selinux_audit_rule_free(tmprule);
> >>>>>>>  
> >>>>>>>         return rc;
> >>>>>>>  }
> >>>>>>
> >>>>>> This solution would be an easy fix, but might influence other modules 
> >>>>>> calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS, 
> >>>>>> only auditfilter and IMA it seems). And it might be worth returning an 
> >>>>>> error code such as -EAGAIN.
> >>>>>>
> >>>>>> Or, we can access rules via RCU, similar to what we do on 5.10. This 
> >>>>>> could means more code change and testing.
> >>>>>
> >>>>> In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as
> >>>>> needed.  IMA waits for selinux_audit_rule_init() to complete and
> >>>>> shouldn't see NULL, unless there is an SELinux failure.  Before
> >>>>> "fixing" the problem, what exactly is the problem?
> >>>>
> >>>> IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL
> >>>> LSM based rules in one go without using RCU, which would still allow
> >>>> other cores to access the rule being updated. And that's the issue.
> >>>>
> >>>> An example scenario would be:
> >>>> 	CPU1			|	CPU2
> >>>> opened a file and starts	|
> >>>> updating LSM based rules.	|
> >>>> 				| opened a file and starts
> >>>> 				| matching rules.
> >>>> 				|
> >>>> set a LSM based rule to NULL.	| access the same LSM based rule and
> >>>>  				| see that it's NULL.
> >>>>
> >>>> In this situation, CPU 2 would recognize this rule as not LSM based and
> >>>> ignore the LSM part of the rule while matching.
> >>>
> >>> Would picking up just ima_lsm_update_rule(), without changing to the
> >>> lsm policy update notifier, from upstream and calling it from
> >>> ima_lsm_update_rules() resolve the RCU locking issue?  Or are there
> >>> other issues?
> >>
> >> Hi Mimi,
> >>
> >> That should resolve the issue just fine. However, that'd mean having a
> >> customized ima_lsm_update_rules as well as a customized
> >> ima_lsm_update_rule functions on 4.19.y, potentially decrease the
> >> maintainability. The customization of the two functions mentioned above
> >> would be to ripoff the RCU part so that we can leave the other rule-list
> >> access untouched.
> > 
> > Hi Scott,
> > 
> > Neither do we want to backport new functionality.  Perhaps it is only a
> > subset of commit b16942455193 ("ima: use the lsm policy update
> > notifier").
> Yes we are able to backport part of this commit to get a mainline-like
> behavior which would solve the issue at hand as well. However from a
> maintenance standpoint it might be better to form a different commit and
> not re-use the commit message from mainline commit.

I assume that is fine, but cherry-pick the original commit with the -x
option, so there is a correlation to the upstream commit.  The patch
description would mention that the patch is a partial backport.

thanks,

Mimi

> > 
> > Although your suggested solution of using "cmpxchg" isn't needed in
> > recent kernels,  it wouldn't hurt either, right?  Assuming that Paul
> > would be willing to accept it, that could be another option.
> The cmpxchg part is merely to avoid multiple updates on the same rule
> item. However I am not so sure why SELinux would set the rule to NULL.
> My guess is that SELinux is trying to stop others from using an
> invalidated rule?
> 
> Would Paul be able to provide some insight? as well as some Comment on
> the proposed solution?



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-15 14:30             ` Mimi Zohar
@ 2022-12-15 21:04               ` Paul Moore
  2022-12-16  2:36                 ` Guozihua (Scott)
  0 siblings, 1 reply; 30+ messages in thread
From: Paul Moore @ 2022-12-15 21:04 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Guozihua (Scott),
	dmitry.kasatkin, sds, eparis, Greg KH, sashal, selinux,
	linux-integrity, stable

On Thu, Dec 15, 2022 at 9:30 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
>
> On Thu, 2022-12-15 at 21:15 +0800, Guozihua (Scott) wrote:
> > On 2022/12/15 18:49, Mimi Zohar wrote:
> > > On Thu, 2022-12-15 at 16:51 +0800, Guozihua (Scott) wrote:
> > >> On 2022/12/14 20:19, Mimi Zohar wrote:
> > >>> On Wed, 2022-12-14 at 09:33 +0800, Guozihua (Scott) wrote:
> > >>>> On 2022/12/13 23:30, Mimi Zohar wrote:
> > >>>>> On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote:
> > >>>>>> Hi community.
> > >>>>>>
> > >>>>>> Previously our team reported a race condition in IMA relates to LSM
> > >>>>>> based rules which would case IMA to match files that should be filtered
> > >>>>>> out under normal condition. The issue was originally analyzed and fixed
> > >>>>>> on mainstream. The patch and the discussion could be found here:
> > >>>>>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
> > >>>>>>
> > >>>>>> After that, we did a regression test on 4.19 LTS and the same issue
> > >>>>>> arises. Further analysis reveled that the issue is from a completely
> > >>>>>> different cause.
> > >>>>>>
> > >>>>>> The cause is that selinux_audit_rule_init() would set the rule (which is
> > >>>>>> a second level pointer) to NULL immediately after called. The relevant
> > >>>>>> codes are as shown:
> > >>>>>>
> > >>>>>> security/selinux/ss/services.c:
> > >>>>>>> int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> > >>>>>>> {
> > >>>>>>>         struct selinux_state *state = &selinux_state;
> > >>>>>>>         struct policydb *policydb = &state->ss->policydb;
> > >>>>>>>         struct selinux_audit_rule *tmprule;
> > >>>>>>>         struct role_datum *roledatum;
> > >>>>>>>         struct type_datum *typedatum;
> > >>>>>>>         struct user_datum *userdatum;
> > >>>>>>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
> > >>>>>>>         int rc = 0;
> > >>>>>>>
> > >>>>>>>         *rule = NULL;
> > >>>>>> *rule is set to NULL here, which means the rule on IMA side is also NULL.
> > >>>>>>>
> > >>>>>>>         if (!state->initialized)
> > >>>>>>>                 return -EOPNOTSUPP;
> > >>>>>> ...
> > >>>>>>> out:
> > >>>>>>>         read_unlock(&state->ss->policy_rwlock);
> > >>>>>>>
> > >>>>>>>         if (rc) {
> > >>>>>>>                 selinux_audit_rule_free(tmprule);
> > >>>>>>>                 tmprule = NULL;
> > >>>>>>>         }
> > >>>>>>>
> > >>>>>>>         *rule = tmprule;
> > >>>>>> rule is updated at the end of the function.
> > >>>>>>>
> > >>>>>>>         return rc;
> > >>>>>>> }
> > >>>>>>
> > >>>>>> security/integrity/ima/ima_policy.c:
> > >>>>>>> static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
> > >>>>>>>                             const struct cred *cred, u32 secid,
> > >>>>>>>                             enum ima_hooks func, int mask)
> > >>>>>>> {...
> > >>>>>>> for (i = 0; i < MAX_LSM_RULES; i++) {
> > >>>>>>>                 int rc = 0;
> > >>>>>>>                 u32 osid;
> > >>>>>>>                 int retried = 0;
> > >>>>>>>
> > >>>>>>>                 if (!rule->lsm[i].rule)
> > >>>>>>>                         continue;
> > >>>>>> Setting rule to NULL would lead to LSM based rule matching being skipped.
> > >>>>>>> retry:
> > >>>>>>>                 switch (i) {
> > >>>>>>
> > >>>>>> To solve this issue, there are multiple approaches we might take and I
> > >>>>>> would like some input from the community.
> > >>>>>>
> > >>>>>> The first proposed solution would be to change
> > >>>>>> selinux_audit_rule_init(). Remove the set to NULL bit and update the
> > >>>>>> rule pointer with cmpxchg.
> > >>>>>>
> > >>>>>>> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
> > >>>>>>> index a9f2bc8443bd..aa74b04ccaf7 100644
> > >>>>>>> --- a/security/selinux/ss/services.c
> > >>>>>>> +++ b/security/selinux/ss/services.c
> > >>>>>>> @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> > >>>>>>>         struct type_datum *typedatum;
> > >>>>>>>         struct user_datum *userdatum;
> > >>>>>>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
> > >>>>>>> +       struct selinux_audit_rule *orig = rule;
> > >>>>>>>         int rc = 0;
> > >>>>>>>
> > >>>>>>> -       *rule = NULL;
> > >>>>>>> -
> > >>>>>>>         if (!state->initialized)
> > >>>>>>>                 return -EOPNOTSUPP;
> > >>>>>>>
> > >>>>>>> @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> > >>>>>>>                 tmprule = NULL;
> > >>>>>>>         }
> > >>>>>>>
> > >>>>>>> -       *rule = tmprule;
> > >>>>>>> +       if (cmpxchg(rule, orig, tmprule) != orig)
> > >>>>>>> +               selinux_audit_rule_free(tmprule);
> > >>>>>>>
> > >>>>>>>         return rc;
> > >>>>>>>  }
> > >>>>>>
> > >>>>>> This solution would be an easy fix, but might influence other modules
> > >>>>>> calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS,
> > >>>>>> only auditfilter and IMA it seems). And it might be worth returning an
> > >>>>>> error code such as -EAGAIN.
> > >>>>>>
> > >>>>>> Or, we can access rules via RCU, similar to what we do on 5.10. This
> > >>>>>> could means more code change and testing.
> > >>>>>
> > >>>>> In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as
> > >>>>> needed.  IMA waits for selinux_audit_rule_init() to complete and
> > >>>>> shouldn't see NULL, unless there is an SELinux failure.  Before
> > >>>>> "fixing" the problem, what exactly is the problem?
> > >>>>
> > >>>> IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL
> > >>>> LSM based rules in one go without using RCU, which would still allow
> > >>>> other cores to access the rule being updated. And that's the issue.
> > >>>>
> > >>>> An example scenario would be:
> > >>>>  CPU1                    |       CPU2
> > >>>> opened a file and starts |
> > >>>> updating LSM based rules.        |
> > >>>>                          | opened a file and starts
> > >>>>                          | matching rules.
> > >>>>                          |
> > >>>> set a LSM based rule to NULL.    | access the same LSM based rule and
> > >>>>                                  | see that it's NULL.
> > >>>>
> > >>>> In this situation, CPU 2 would recognize this rule as not LSM based and
> > >>>> ignore the LSM part of the rule while matching.
> > >>>
> > >>> Would picking up just ima_lsm_update_rule(), without changing to the
> > >>> lsm policy update notifier, from upstream and calling it from
> > >>> ima_lsm_update_rules() resolve the RCU locking issue?  Or are there
> > >>> other issues?
> > >>
> > >> Hi Mimi,
> > >>
> > >> That should resolve the issue just fine. However, that'd mean having a
> > >> customized ima_lsm_update_rules as well as a customized
> > >> ima_lsm_update_rule functions on 4.19.y, potentially decrease the
> > >> maintainability. The customization of the two functions mentioned above
> > >> would be to ripoff the RCU part so that we can leave the other rule-list
> > >> access untouched.
> > >
> > > Hi Scott,
> > >
> > > Neither do we want to backport new functionality.  Perhaps it is only a
> > > subset of commit b16942455193 ("ima: use the lsm policy update
> > > notifier").
> > Yes we are able to backport part of this commit to get a mainline-like
> > behavior which would solve the issue at hand as well. However from a
> > maintenance standpoint it might be better to form a different commit and
> > not re-use the commit message from mainline commit.
>
> I assume that is fine, but cherry-pick the original commit with the -x
> option, so there is a correlation to the upstream commit.  The patch
> description would mention that the patch is a partial backport.

FWIW, if the changes in the backport are significant I tend to use the
following approach as it captures both the original commit as well as
the details on what changes were made and why.

>>>
ima: use the lsm policy update notifier

Really good explanation of what changes were necessary from the
original patch, including why they were necessary in the first place.

   commit b169424551930a9325f700f502802f4d515194e5
   Author: Janne Karhunen <janne.karhunen@gmail.com>
   Date:   Fri Jun 14 15:20:15 2019 +0300

   ima: use the lsm policy update notifier

   Don't do lazy policy updates while running the rule matching,
   run the updates as they happen.

   Depends on commit f242064c5df3 ("LSM: switch to blocking policy update
                                   notifiers")

   Signed-off-by: Janne Karhunen <janne.karhunen@gmail.com>
   Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
>>>

> > > Although your suggested solution of using "cmpxchg" isn't needed in
> > > recent kernels,  it wouldn't hurt either, right?  Assuming that Paul
> > > would be willing to accept it, that could be another option.
> > The cmpxchg part is merely to avoid multiple updates on the same rule
> > item. However I am not so sure why SELinux would set the rule to NULL.
> > My guess is that SELinux is trying to stop others from using an
> > invalidated rule?
> >
> > Would Paul be able to provide some insight? as well as some Comment on
> > the proposed solution?

I'm not comfortable with what might happen with a cmpxchg assignment
when multiple threads are in a related RCU critical section; I'm
assuming they would see the new value immediately (it is atomic,
right?), which I imagine could cause some consistency problems.
However, if someone who understands the intersection of cmpxchg/RCU
better than I do can assure me this isn't a problem we can consider
it.

How bad is the backport really?  Perhaps it is worth doing it to see
what it looks like?

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-15 21:04               ` Paul Moore
@ 2022-12-16  2:36                 ` Guozihua (Scott)
  2022-12-16  3:04                   ` Paul Moore
  0 siblings, 1 reply; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-16  2:36 UTC (permalink / raw)
  To: Paul Moore, Mimi Zohar
  Cc: dmitry.kasatkin, sds, eparis, Greg KH, sashal, selinux,
	linux-integrity, stable

On 2022/12/16 5:04, Paul Moore wrote:
> On Thu, Dec 15, 2022 at 9:30 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
>>
>> On Thu, 2022-12-15 at 21:15 +0800, Guozihua (Scott) wrote:
>>> On 2022/12/15 18:49, Mimi Zohar wrote:
>>>> On Thu, 2022-12-15 at 16:51 +0800, Guozihua (Scott) wrote:
>>>>> On 2022/12/14 20:19, Mimi Zohar wrote:
>>>>>> On Wed, 2022-12-14 at 09:33 +0800, Guozihua (Scott) wrote:
>>>>>>> On 2022/12/13 23:30, Mimi Zohar wrote:
>>>>>>>> On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote:
>>>>>>>>> Hi community.
>>>>>>>>>
>>>>>>>>> Previously our team reported a race condition in IMA relates to LSM
>>>>>>>>> based rules which would case IMA to match files that should be filtered
>>>>>>>>> out under normal condition. The issue was originally analyzed and fixed
>>>>>>>>> on mainstream. The patch and the discussion could be found here:
>>>>>>>>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@huawei.com/
>>>>>>>>>
>>>>>>>>> After that, we did a regression test on 4.19 LTS and the same issue
>>>>>>>>> arises. Further analysis reveled that the issue is from a completely
>>>>>>>>> different cause.
>>>>>>>>>
>>>>>>>>> The cause is that selinux_audit_rule_init() would set the rule (which is
>>>>>>>>> a second level pointer) to NULL immediately after called. The relevant
>>>>>>>>> codes are as shown:
>>>>>>>>>
>>>>>>>>> security/selinux/ss/services.c:
>>>>>>>>>> int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
>>>>>>>>>> {
>>>>>>>>>>         struct selinux_state *state = &selinux_state;
>>>>>>>>>>         struct policydb *policydb = &state->ss->policydb;
>>>>>>>>>>         struct selinux_audit_rule *tmprule;
>>>>>>>>>>         struct role_datum *roledatum;
>>>>>>>>>>         struct type_datum *typedatum;
>>>>>>>>>>         struct user_datum *userdatum;
>>>>>>>>>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
>>>>>>>>>>         int rc = 0;
>>>>>>>>>>
>>>>>>>>>>         *rule = NULL;
>>>>>>>>> *rule is set to NULL here, which means the rule on IMA side is also NULL.
>>>>>>>>>>
>>>>>>>>>>         if (!state->initialized)
>>>>>>>>>>                 return -EOPNOTSUPP;
>>>>>>>>> ...
>>>>>>>>>> out:
>>>>>>>>>>         read_unlock(&state->ss->policy_rwlock);
>>>>>>>>>>
>>>>>>>>>>         if (rc) {
>>>>>>>>>>                 selinux_audit_rule_free(tmprule);
>>>>>>>>>>                 tmprule = NULL;
>>>>>>>>>>         }
>>>>>>>>>>
>>>>>>>>>>         *rule = tmprule;
>>>>>>>>> rule is updated at the end of the function.
>>>>>>>>>>
>>>>>>>>>>         return rc;
>>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> security/integrity/ima/ima_policy.c:
>>>>>>>>>> static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
>>>>>>>>>>                             const struct cred *cred, u32 secid,
>>>>>>>>>>                             enum ima_hooks func, int mask)
>>>>>>>>>> {...
>>>>>>>>>> for (i = 0; i < MAX_LSM_RULES; i++) {
>>>>>>>>>>                 int rc = 0;
>>>>>>>>>>                 u32 osid;
>>>>>>>>>>                 int retried = 0;
>>>>>>>>>>
>>>>>>>>>>                 if (!rule->lsm[i].rule)
>>>>>>>>>>                         continue;
>>>>>>>>> Setting rule to NULL would lead to LSM based rule matching being skipped.
>>>>>>>>>> retry:
>>>>>>>>>>                 switch (i) {
>>>>>>>>>
>>>>>>>>> To solve this issue, there are multiple approaches we might take and I
>>>>>>>>> would like some input from the community.
>>>>>>>>>
>>>>>>>>> The first proposed solution would be to change
>>>>>>>>> selinux_audit_rule_init(). Remove the set to NULL bit and update the
>>>>>>>>> rule pointer with cmpxchg.
>>>>>>>>>
>>>>>>>>>> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
>>>>>>>>>> index a9f2bc8443bd..aa74b04ccaf7 100644
>>>>>>>>>> --- a/security/selinux/ss/services.c
>>>>>>>>>> +++ b/security/selinux/ss/services.c
>>>>>>>>>> @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
>>>>>>>>>>         struct type_datum *typedatum;
>>>>>>>>>>         struct user_datum *userdatum;
>>>>>>>>>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
>>>>>>>>>> +       struct selinux_audit_rule *orig = rule;
>>>>>>>>>>         int rc = 0;
>>>>>>>>>>
>>>>>>>>>> -       *rule = NULL;
>>>>>>>>>> -
>>>>>>>>>>         if (!state->initialized)
>>>>>>>>>>                 return -EOPNOTSUPP;
>>>>>>>>>>
>>>>>>>>>> @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
>>>>>>>>>>                 tmprule = NULL;
>>>>>>>>>>         }
>>>>>>>>>>
>>>>>>>>>> -       *rule = tmprule;
>>>>>>>>>> +       if (cmpxchg(rule, orig, tmprule) != orig)
>>>>>>>>>> +               selinux_audit_rule_free(tmprule);
>>>>>>>>>>
>>>>>>>>>>         return rc;
>>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>> This solution would be an easy fix, but might influence other modules
>>>>>>>>> calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS,
>>>>>>>>> only auditfilter and IMA it seems). And it might be worth returning an
>>>>>>>>> error code such as -EAGAIN.
>>>>>>>>>
>>>>>>>>> Or, we can access rules via RCU, similar to what we do on 5.10. This
>>>>>>>>> could means more code change and testing.
>>>>>>>>
>>>>>>>> In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as
>>>>>>>> needed.  IMA waits for selinux_audit_rule_init() to complete and
>>>>>>>> shouldn't see NULL, unless there is an SELinux failure.  Before
>>>>>>>> "fixing" the problem, what exactly is the problem?
>>>>>>>
>>>>>>> IMA runs on multiple cores. On 4.19 kernel, IMA do a lazy update on ALL
>>>>>>> LSM based rules in one go without using RCU, which would still allow
>>>>>>> other cores to access the rule being updated. And that's the issue.
>>>>>>>
>>>>>>> An example scenario would be:
>>>>>>>  CPU1                    |       CPU2
>>>>>>> opened a file and starts |
>>>>>>> updating LSM based rules.        |
>>>>>>>                          | opened a file and starts
>>>>>>>                          | matching rules.
>>>>>>>                          |
>>>>>>> set a LSM based rule to NULL.    | access the same LSM based rule and
>>>>>>>                                  | see that it's NULL.
>>>>>>>
>>>>>>> In this situation, CPU 2 would recognize this rule as not LSM based and
>>>>>>> ignore the LSM part of the rule while matching.
>>>>>>
>>>>>> Would picking up just ima_lsm_update_rule(), without changing to the
>>>>>> lsm policy update notifier, from upstream and calling it from
>>>>>> ima_lsm_update_rules() resolve the RCU locking issue?  Or are there
>>>>>> other issues?
>>>>>
>>>>> Hi Mimi,
>>>>>
>>>>> That should resolve the issue just fine. However, that'd mean having a
>>>>> customized ima_lsm_update_rules as well as a customized
>>>>> ima_lsm_update_rule functions on 4.19.y, potentially decrease the
>>>>> maintainability. The customization of the two functions mentioned above
>>>>> would be to ripoff the RCU part so that we can leave the other rule-list
>>>>> access untouched.
>>>>
>>>> Hi Scott,
>>>>
>>>> Neither do we want to backport new functionality.  Perhaps it is only a
>>>> subset of commit b16942455193 ("ima: use the lsm policy update
>>>> notifier").
>>> Yes we are able to backport part of this commit to get a mainline-like
>>> behavior which would solve the issue at hand as well. However from a
>>> maintenance standpoint it might be better to form a different commit and
>>> not re-use the commit message from mainline commit.
>>
>> I assume that is fine, but cherry-pick the original commit with the -x
>> option, so there is a correlation to the upstream commit.  The patch
>> description would mention that the patch is a partial backport.
> 
> FWIW, if the changes in the backport are significant I tend to use the
> following approach as it captures both the original commit as well as
> the details on what changes were made and why.
> 
>>>>
> ima: use the lsm policy update notifier
> 
> Really good explanation of what changes were necessary from the
> original patch, including why they were necessary in the first place.
> 
>    commit b169424551930a9325f700f502802f4d515194e5
>    Author: Janne Karhunen <janne.karhunen@gmail.com>
>    Date:   Fri Jun 14 15:20:15 2019 +0300
> 
>    ima: use the lsm policy update notifier
> 
>    Don't do lazy policy updates while running the rule matching,
>    run the updates as they happen.
> 
>    Depends on commit f242064c5df3 ("LSM: switch to blocking policy update
>                                    notifiers")
> 
>    Signed-off-by: Janne Karhunen <janne.karhunen@gmail.com>
>    Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
>>>>
Thanks for the suggestion Mimi and Paul.
> 
>>>> Although your suggested solution of using "cmpxchg" isn't needed in
>>>> recent kernels,  it wouldn't hurt either, right?  Assuming that Paul
>>>> would be willing to accept it, that could be another option.
>>> The cmpxchg part is merely to avoid multiple updates on the same rule
>>> item. However I am not so sure why SELinux would set the rule to NULL.
>>> My guess is that SELinux is trying to stop others from using an
>>> invalidated rule?
>>>
>>> Would Paul be able to provide some insight? as well as some Comment on
>>> the proposed solution?
> 
> I'm not comfortable with what might happen with a cmpxchg assignment
> when multiple threads are in a related RCU critical section; I'm
> assuming they would see the new value immediately (it is atomic,
> right?), which I imagine could cause some consistency problems.
> However, if someone who understands the intersection of cmpxchg/RCU
> better than I do can assure me this isn't a problem we can consider
> it.
> 
> How bad is the backport really?  Perhaps it is worth doing it to see
> what it looks like?
> 
It might not be that bad, I'll try to post a version next Monday.

-- 
Best
GUO Zihua


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-16  2:36                 ` Guozihua (Scott)
@ 2022-12-16  3:04                   ` Paul Moore
  2022-12-19  7:10                     ` Guozihua (Scott)
  2023-01-06  1:05                     ` Mimi Zohar
  0 siblings, 2 replies; 30+ messages in thread
From: Paul Moore @ 2022-12-16  3:04 UTC (permalink / raw)
  To: Guozihua (Scott)
  Cc: Mimi Zohar, dmitry.kasatkin, sds, eparis, Greg KH, sashal,
	selinux, linux-integrity, stable

On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) <guozihua@huawei.com> wrote:
> On 2022/12/16 5:04, Paul Moore wrote:

...

> > How bad is the backport really?  Perhaps it is worth doing it to see
> > what it looks like?
> >
> It might not be that bad, I'll try to post a version next Monday.

Thanks for giving it a shot.

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-16  3:04                   ` Paul Moore
@ 2022-12-19  7:10                     ` Guozihua (Scott)
  2022-12-19 13:11                       ` Mimi Zohar
  2023-01-06  1:05                     ` Mimi Zohar
  1 sibling, 1 reply; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-19  7:10 UTC (permalink / raw)
  To: Paul Moore
  Cc: Mimi Zohar, dmitry.kasatkin, sds, eparis, Greg KH, sashal,
	selinux, linux-integrity, stable

On 2022/12/16 11:04, Paul Moore wrote:
> On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) <guozihua@huawei.com> wrote:
>> On 2022/12/16 5:04, Paul Moore wrote:
> 
> ...
> 
>>> How bad is the backport really?  Perhaps it is worth doing it to see
>>> what it looks like?
>>>
>> It might not be that bad, I'll try to post a version next Monday.
> 
> Thanks for giving it a shot.
> 
When I am trying a partial backport of b16942455193 ("ima: use the lsm
policy update notifier"), I took a closer look into it and if we rip off
the RCU and the notifier part, there would be a potential UAF issue when
multiple processes are calling ima_lsm_update_rule() and
ima_match_rules() at the same time. ima_lsm_update_rule() would free the
old rule if the new rule is successfully copied and initialized, leading
to ima_match_rules() accessing a freed rule.

To reserve the mainline solution, we would have to either introduce RCU
for rule access, which would work better with notifier mechanism or the
same rule would be updated multiple times, or we would have to introduce
a lock for LSM based rule update.

-- 
Best
GUO Zihua


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-19  7:10                     ` Guozihua (Scott)
@ 2022-12-19 13:11                       ` Mimi Zohar
  2022-12-20  1:11                         ` Guozihua (Scott)
  0 siblings, 1 reply; 30+ messages in thread
From: Mimi Zohar @ 2022-12-19 13:11 UTC (permalink / raw)
  To: Guozihua (Scott), Paul Moore
  Cc: dmitry.kasatkin, sds, eparis, Greg KH, sashal, selinux,
	linux-integrity, stable

On Mon, 2022-12-19 at 15:10 +0800, Guozihua (Scott) wrote:
> On 2022/12/16 11:04, Paul Moore wrote:
> > On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) <guozihua@huawei.com> wrote:
> >> On 2022/12/16 5:04, Paul Moore wrote:
> > 
> > ...
> > 
> >>> How bad is the backport really?  Perhaps it is worth doing it to see
> >>> what it looks like?
> >>>
> >> It might not be that bad, I'll try to post a version next Monday.
> > 
> > Thanks for giving it a shot.
> > 
> When I am trying a partial backport of b16942455193 ("ima: use the lsm
> policy update notifier"), I took a closer look into it and if we rip off
> the RCU and the notifier part, there would be a potential UAF issue when
> multiple processes are calling ima_lsm_update_rule() and
> ima_match_rules() at the same time. ima_lsm_update_rule() would free the
> old rule if the new rule is successfully copied and initialized, leading
> to ima_match_rules() accessing a freed rule.
> 
> To reserve the mainline solution, we would have to either introduce RCU
> for rule access, which would work better with notifier mechanism or the
> same rule would be updated multiple times, or we would have to introduce
> a lock for LSM based rule update.

Even with the RCU changes, the rules will be updated multiple times. 
With your "ima: Handle -ESTALE returned by ima_filter_rule_match()"
patch, upstream makes a single local copy of the rule to avoid updating
it multiple times.  Without the notifier, it's updating all the rules.

Perhaps an atomic variable to detect if the rules are already being
updated would suffice.  If the atomic variable is set, make a single
local copy of the rule.

-- 
thanks,

Mimi



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-19 13:11                       ` Mimi Zohar
@ 2022-12-20  1:11                         ` Guozihua (Scott)
  2022-12-21 10:51                           ` Guozihua (Scott)
  0 siblings, 1 reply; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-20  1:11 UTC (permalink / raw)
  To: Mimi Zohar, Paul Moore
  Cc: dmitry.kasatkin, sds, eparis, Greg KH, sashal, selinux,
	linux-integrity, stable

On 2022/12/19 21:11, Mimi Zohar wrote:
> On Mon, 2022-12-19 at 15:10 +0800, Guozihua (Scott) wrote:
>> On 2022/12/16 11:04, Paul Moore wrote:
>>> On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) <guozihua@huawei.com> wrote:
>>>> On 2022/12/16 5:04, Paul Moore wrote:
>>>
>>> ...
>>>
>>>>> How bad is the backport really?  Perhaps it is worth doing it to see
>>>>> what it looks like?
>>>>>
>>>> It might not be that bad, I'll try to post a version next Monday.
>>>
>>> Thanks for giving it a shot.
>>>
>> When I am trying a partial backport of b16942455193 ("ima: use the lsm
>> policy update notifier"), I took a closer look into it and if we rip off
>> the RCU and the notifier part, there would be a potential UAF issue when
>> multiple processes are calling ima_lsm_update_rule() and
>> ima_match_rules() at the same time. ima_lsm_update_rule() would free the
>> old rule if the new rule is successfully copied and initialized, leading
>> to ima_match_rules() accessing a freed rule.
>>
>> To reserve the mainline solution, we would have to either introduce RCU
>> for rule access, which would work better with notifier mechanism or the
>> same rule would be updated multiple times, or we would have to introduce
>> a lock for LSM based rule update.
> 
> Even with the RCU changes, the rules will be updated multiple times. 
> With your "ima: Handle -ESTALE returned by ima_filter_rule_match()"
> patch, upstream makes a single local copy of the rule to avoid updating
> it multiple times.  Without the notifier, it's updating all the rules.
That's true. However, in the mainline solution, we are only making a
local copy of the rule. In 4.19, because of the lazy update mechanism,
we are replacing the rule on the rule list multiple times and is trying
to free the original rule.
> 
> Perhaps an atomic variable to detect if the rules are already being
> updated would suffice.  If the atomic variable is set, make a single
> local copy of the rule.
That should do it. I'll send a patch set soon.

-- 
Best
GUO Zihua


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-20  1:11                         ` Guozihua (Scott)
@ 2022-12-21 10:51                           ` Guozihua (Scott)
  2022-12-23  8:04                             ` Guozihua (Scott)
  0 siblings, 1 reply; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-21 10:51 UTC (permalink / raw)
  To: Mimi Zohar, Paul Moore
  Cc: dmitry.kasatkin, sds, eparis, Greg KH, sashal, selinux,
	linux-integrity, stable, luhuaxin

On 2022/12/20 9:11, Guozihua (Scott) wrote:
> On 2022/12/19 21:11, Mimi Zohar wrote:
>> On Mon, 2022-12-19 at 15:10 +0800, Guozihua (Scott) wrote:
>>> On 2022/12/16 11:04, Paul Moore wrote:
>>>> On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) <guozihua@huawei.com> wrote:
>>>>> On 2022/12/16 5:04, Paul Moore wrote:
>>>>
>>>> ...
>>>>
>>>>>> How bad is the backport really?  Perhaps it is worth doing it to see
>>>>>> what it looks like?
>>>>>>
>>>>> It might not be that bad, I'll try to post a version next Monday.
>>>>
>>>> Thanks for giving it a shot.
>>>>
>>> When I am trying a partial backport of b16942455193 ("ima: use the lsm
>>> policy update notifier"), I took a closer look into it and if we rip off
>>> the RCU and the notifier part, there would be a potential UAF issue when
>>> multiple processes are calling ima_lsm_update_rule() and
>>> ima_match_rules() at the same time. ima_lsm_update_rule() would free the
>>> old rule if the new rule is successfully copied and initialized, leading
>>> to ima_match_rules() accessing a freed rule.
>>>
>>> To reserve the mainline solution, we would have to either introduce RCU
>>> for rule access, which would work better with notifier mechanism or the
>>> same rule would be updated multiple times, or we would have to introduce
>>> a lock for LSM based rule update.
>>
>> Even with the RCU changes, the rules will be updated multiple times. 
>> With your "ima: Handle -ESTALE returned by ima_filter_rule_match()"
>> patch, upstream makes a single local copy of the rule to avoid updating
>> it multiple times.  Without the notifier, it's updating all the rules.
> That's true. However, in the mainline solution, we are only making a
> local copy of the rule. In 4.19, because of the lazy update mechanism,
> we are replacing the rule on the rule list multiple times and is trying
> to free the original rule.
>>
>> Perhaps an atomic variable to detect if the rules are already being
>> updated would suffice.  If the atomic variable is set, make a single
>> local copy of the rule.
> That should do it. I'll send a patch set soon.
> 
Including Huaxin Lu in the loop. Sorry for forgotten about it for quite
some time.

I tried the backported solution, it seems that it's causing RCU stall.
It seems on 4.19.y IMA is already accessing rules through RCU. Still
debugging it.

The backported patches are as below

> diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
> index 0819b7600649..20349ef6383b 100644
> --- a/security/integrity/ima/ima_policy.c
> +++ b/security/integrity/ima/ima_policy.c
> @@ -353,6 +353,8 @@ static void ima_lsm_update_rules(void)
>         }
>  }
>  
> +static bool rule_updating = false;
> +
>  /**
>   * ima_match_rules - determine whether an inode matches the measure rule.
>   * @rule: a pointer to a rule
> @@ -369,6 +371,9 @@ static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
>                             enum ima_hooks func, int mask)
>  {
>         int i;
> +       bool result = false;
> +       struct ima_rule_entry *lsm_rule = rule;
> +       bool rule_reinitialized = false;
>  
>         if ((rule->flags & IMA_FUNC) &&
>             (rule->func != func && func != POST_SETATTR))
> @@ -408,7 +413,7 @@ static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
>                 u32 osid;
>                 int retried = 0;
>  
> -               if (!rule->lsm[i].rule)
> +               if (!lsm_rule->lsm[i].rule)
>                         continue;
>  retry:
>                 switch (i) {
> @@ -417,31 +422,49 @@ static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
>                 case LSM_OBJ_TYPE:
>                         security_inode_getsecid(inode, &osid);
>                         rc = security_filter_rule_match(osid,
> -                                                       rule->lsm[i].type,
> +                                                       lsm_rule->lsm[i].type,
>                                                         Audit_equal,
> -                                                       rule->lsm[i].rule,
> +                                                       lsm_rule->lsm[i].rule,
>                                                         NULL);
>                         break;
>                 case LSM_SUBJ_USER:
>                 case LSM_SUBJ_ROLE:
>                 case LSM_SUBJ_TYPE:
>                         rc = security_filter_rule_match(secid,
> -                                                       rule->lsm[i].type,
> +                                                       lsm_rule->lsm[i].type,
>                                                         Audit_equal,
> -                                                       rule->lsm[i].rule,
> +                                                       lsm_rule->lsm[i].rule,
>                                                         NULL);
>                 default:
>                         break;
>                 }>                 if ((rc < 0) && (!retried)) {
>                         retried = 1;
> -                       ima_lsm_update_rules();
> -                       goto retry;
> +                       if (READ_ONCE(rule_updating)) {
> +                               lsm_rule = ima_lsm_copy_rule(rule);
> +                               if (lsm_rule) {
> +                                       rule_reinitialized = true;
> +                                       goto retry;
> +                               }
> +                       } else {
> +                               WRITE_ONCE(rule_updating, true);
> +                               ima_lsm_update_rules();
> +                               WRITE_ONCE(rule_updating, false);
> +                               goto retry;
> +                       }
> +               }
> +               if (!rc) {
> +                       result = false;
> +                       goto out;
>                 }
> -               if (!rc)
> -                       return false;
>         }
> -       return true;
> +       result = true;
> +
> +out:
> +       if (rule_reinitialized) {
> +               ima_lsm_free_rule(lsm_rule);
> +       }
> +       return result;
>  }
> diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
> index b2dadff3626b..0819b7600649 100644
> --- a/security/integrity/ima/ima_policy.c
> +++ b/security/integrity/ima/ima_policy.c
> @@ -256,27 +256,99 @@ static void ima_free_rule(struct ima_rule_entry *entry)
>         kfree(entry);
>  }
>  
> +static void ima_lsm_free_rule(struct ima_rule_entry *entry)
> +{
> +       int i;
> +
> +       for (i = 0; i < MAX_LSM_RULES; i++) {
> +               kfree(entry->lsm[i].rule);
> +               kfree(entry->lsm[i].args_p);
> +       }
> +       kfree(entry);
> +}
> +
> +static struct ima_rule_entry *ima_lsm_copy_rule(struct ima_rule_entry *entry)
> +{
> +       struct ima_rule_entry *nentry;
> +       int i, result;
> +
> +       nentry = kmalloc(sizeof(*nentry), GFP_KERNEL);
> +       if (!nentry)
> +               return NULL;
> +
> +       /*
> +       * Immutable elements are copied over as pointers and data; only
> +       * lsm rules can change
> +       */
> +       memcpy(nentry, entry, sizeof(*nentry));
> +       memset(nentry->lsm, 0, FIELD_SIZEOF(struct ima_rule_entry, lsm));
> +
> +       for (i = 0; i < MAX_LSM_RULES; i++) {
> +               if (!entry->lsm[i].rule)
> +                       continue;
> +
> +               nentry->lsm[i].type = entry->lsm[i].type;
> +               nentry->lsm[i].args_p = kstrdup(entry->lsm[i].args_p,
> +                                              GFP_KERNEL);
> +               if (!nentry->lsm[i].args_p)
> +                       goto out_err;
> +
> +               result = security_filter_rule_init(nentry->lsm[i].type,
> +                                                 Audit_equal,
> +                                                 nentry->lsm[i].args_p,
> +                                                 &nentry->lsm[i].rule);
> +               if (result == -EINVAL)
> +                       pr_warn("ima: rule for LSM \'%d\' is undefined\n",
> +                               entry->lsm[i].type);
> +       }
> +       return nentry;
> +
> +out_err:
> +       ima_lsm_free_rule(nentry);
> +       return NULL;
> +}
> +
> +static int ima_lsm_update_rule(struct ima_rule_entry *entry)
> +{
> +       struct ima_rule_entry *nentry;> +
> +       nentry = ima_lsm_copy_rule(entry);
> +       if (!nentry)
> +               return -ENOMEM;
> +
> +       list_replace_rcu(&entry->list, &nentry->list);
> +       synchronize_rcu();
> +       ima_lsm_free_rule(entry);
> +
> +       return 0;
> +}
> +
>  /*
>   * The LSM policy can be reloaded, leaving the IMA LSM based rules referring
>   * to the old, stale LSM policy.  Update the IMA LSM based rules to reflect
> - * the reloaded LSM policy.  We assume the rules still exist; and BUG_ON() if
> - * they don't.
> + * the reloaded LSM policy.
>   */
>  static void ima_lsm_update_rules(void)
>  {
> -       struct ima_rule_entry *entry;
> -       int result;
> -       int i;
> +       struct ima_rule_entry *entry, *e;
> +       int i, result, needs_update;
>  
> -       list_for_each_entry(entry, &ima_policy_rules, list) {
> +       list_for_each_entry_safe(entry, e, &ima_policy_rules, list) {
> +               needs_update = 0;
>                 for (i = 0; i < MAX_LSM_RULES; i++) {
> -                       if (!entry->lsm[i].rule)
> -                               continue;> -                       result =
security_filter_rule_init(entry->lsm[i].type,
> -                                                          Audit_equal,
> -                                                          entry->lsm[i].args_p,
> -                                                          &entry->lsm[i].rule);
> -                       BUG_ON(!entry->lsm[i].rule);
> +                       if (entry->lsm[i].rule) {
> +                               needs_update = 1;
> +                               break;
> +                       }
> +               }
> +               if (!needs_update)
> +                       continue;
> +
> +               result = ima_lsm_update_rule(entry);
> +               if (result) {
> +                       pr_err("ima: lsm rule update error %d\n",
> +                               result);
> +                       return;
>                 }
>         }
>  }

-- 
Best
GUO Zihua


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-21 10:51                           ` Guozihua (Scott)
@ 2022-12-23  8:04                             ` Guozihua (Scott)
  2022-12-24  3:41                               ` Guozihua (Scott)
  0 siblings, 1 reply; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-23  8:04 UTC (permalink / raw)
  To: Mimi Zohar, Paul Moore
  Cc: dmitry.kasatkin, sds, eparis, Greg KH, sashal, selinux,
	linux-integrity, stable, luhuaxin

On 2022/12/21 18:51, Guozihua (Scott) wrote:
> On 2022/12/20 9:11, Guozihua (Scott) wrote:
>> On 2022/12/19 21:11, Mimi Zohar wrote:
>>> On Mon, 2022-12-19 at 15:10 +0800, Guozihua (Scott) wrote:
>>>> On 2022/12/16 11:04, Paul Moore wrote:
>>>>> On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) <guozihua@huawei.com> wrote:
>>>>>> On 2022/12/16 5:04, Paul Moore wrote:
>>>>>
>>>>> ...
>>>>>
>>>>>>> How bad is the backport really?  Perhaps it is worth doing it to see
>>>>>>> what it looks like?
>>>>>>>
>>>>>> It might not be that bad, I'll try to post a version next Monday.
>>>>>
>>>>> Thanks for giving it a shot.
>>>>>
>>>> When I am trying a partial backport of b16942455193 ("ima: use the lsm
>>>> policy update notifier"), I took a closer look into it and if we rip off
>>>> the RCU and the notifier part, there would be a potential UAF issue when
>>>> multiple processes are calling ima_lsm_update_rule() and
>>>> ima_match_rules() at the same time. ima_lsm_update_rule() would free the
>>>> old rule if the new rule is successfully copied and initialized, leading
>>>> to ima_match_rules() accessing a freed rule.
>>>>
>>>> To reserve the mainline solution, we would have to either introduce RCU
>>>> for rule access, which would work better with notifier mechanism or the
>>>> same rule would be updated multiple times, or we would have to introduce
>>>> a lock for LSM based rule update.
>>>
>>> Even with the RCU changes, the rules will be updated multiple times. 
>>> With your "ima: Handle -ESTALE returned by ima_filter_rule_match()"
>>> patch, upstream makes a single local copy of the rule to avoid updating
>>> it multiple times.  Without the notifier, it's updating all the rules.
>> That's true. However, in the mainline solution, we are only making a
>> local copy of the rule. In 4.19, because of the lazy update mechanism,
>> we are replacing the rule on the rule list multiple times and is trying
>> to free the original rule.
>>>
>>> Perhaps an atomic variable to detect if the rules are already being
>>> updated would suffice.  If the atomic variable is set, make a single
>>> local copy of the rule.
>> That should do it. I'll send a patch set soon.
>>
> Including Huaxin Lu in the loop. Sorry for forgotten about it for quite
> some time.
> 
> I tried the backported solution, it seems that it's causing RCU stall.
> It seems on 4.19.y IMA is already accessing rules through RCU. Still
> debugging it.
It seems that after the backport, a NULL pointer deference pops out.
I'll have to look into it.

-- 
Best
GUO Zihua


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-23  8:04                             ` Guozihua (Scott)
@ 2022-12-24  3:41                               ` Guozihua (Scott)
  2022-12-24  7:47                                 ` Guozihua (Scott)
  0 siblings, 1 reply; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-24  3:41 UTC (permalink / raw)
  To: Mimi Zohar, Paul Moore
  Cc: dmitry.kasatkin, sds, eparis, Greg KH, sashal, selinux,
	linux-integrity, stable, luhuaxin

On 2022/12/23 16:04, Guozihua (Scott) wrote:
> On 2022/12/21 18:51, Guozihua (Scott) wrote:
>> On 2022/12/20 9:11, Guozihua (Scott) wrote:
>>> On 2022/12/19 21:11, Mimi Zohar wrote:
>>>> On Mon, 2022-12-19 at 15:10 +0800, Guozihua (Scott) wrote:
>>>>> On 2022/12/16 11:04, Paul Moore wrote:
>>>>>> On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) <guozihua@huawei.com> wrote:
>>>>>>> On 2022/12/16 5:04, Paul Moore wrote:
>>>>>>
>>>>>> ...
>>>>>>
>>>>>>>> How bad is the backport really?  Perhaps it is worth doing it to see
>>>>>>>> what it looks like?
>>>>>>>>
>>>>>>> It might not be that bad, I'll try to post a version next Monday.
>>>>>>
>>>>>> Thanks for giving it a shot.
>>>>>>
>>>>> When I am trying a partial backport of b16942455193 ("ima: use the lsm
>>>>> policy update notifier"), I took a closer look into it and if we rip off
>>>>> the RCU and the notifier part, there would be a potential UAF issue when
>>>>> multiple processes are calling ima_lsm_update_rule() and
>>>>> ima_match_rules() at the same time. ima_lsm_update_rule() would free the
>>>>> old rule if the new rule is successfully copied and initialized, leading
>>>>> to ima_match_rules() accessing a freed rule.
>>>>>
>>>>> To reserve the mainline solution, we would have to either introduce RCU
>>>>> for rule access, which would work better with notifier mechanism or the
>>>>> same rule would be updated multiple times, or we would have to introduce
>>>>> a lock for LSM based rule update.
>>>>
>>>> Even with the RCU changes, the rules will be updated multiple times. 
>>>> With your "ima: Handle -ESTALE returned by ima_filter_rule_match()"
>>>> patch, upstream makes a single local copy of the rule to avoid updating
>>>> it multiple times.  Without the notifier, it's updating all the rules.
>>> That's true. However, in the mainline solution, we are only making a
>>> local copy of the rule. In 4.19, because of the lazy update mechanism,
>>> we are replacing the rule on the rule list multiple times and is trying
>>> to free the original rule.
>>>>
>>>> Perhaps an atomic variable to detect if the rules are already being
>>>> updated would suffice.  If the atomic variable is set, make a single
>>>> local copy of the rule.
>>> That should do it. I'll send a patch set soon.
>>>
>> Including Huaxin Lu in the loop. Sorry for forgotten about it for quite
>> some time.
>>
>> I tried the backported solution, it seems that it's causing RCU stall.
>> It seems on 4.19.y IMA is already accessing rules through RCU. Still
>> debugging it.
> It seems that after the backport, a NULL pointer deference pops out.
> I'll have to look into it.
> 
It seems that any other means except from a full RCU or locking won't be
able to prevent race condition between policy update and rule match. Any
other suggestions?

-- 
Best
GUO Zihua


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-24  3:41                               ` Guozihua (Scott)
@ 2022-12-24  7:47                                 ` Guozihua (Scott)
  0 siblings, 0 replies; 30+ messages in thread
From: Guozihua (Scott) @ 2022-12-24  7:47 UTC (permalink / raw)
  To: Mimi Zohar, Paul Moore
  Cc: dmitry.kasatkin, sds, eparis, Greg KH, sashal, selinux,
	linux-integrity, stable, luhuaxin

On 2022/12/24 11:41, Guozihua (Scott) wrote:
> On 2022/12/23 16:04, Guozihua (Scott) wrote:
>> On 2022/12/21 18:51, Guozihua (Scott) wrote:
>>> On 2022/12/20 9:11, Guozihua (Scott) wrote:
>>>> On 2022/12/19 21:11, Mimi Zohar wrote:
>>>>> On Mon, 2022-12-19 at 15:10 +0800, Guozihua (Scott) wrote:
>>>>>> On 2022/12/16 11:04, Paul Moore wrote:
>>>>>>> On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) <guozihua@huawei.com> wrote:
>>>>>>>> On 2022/12/16 5:04, Paul Moore wrote:
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>>>> How bad is the backport really?  Perhaps it is worth doing it to see
>>>>>>>>> what it looks like?
>>>>>>>>>
>>>>>>>> It might not be that bad, I'll try to post a version next Monday.
>>>>>>>
>>>>>>> Thanks for giving it a shot.
>>>>>>>
>>>>>> When I am trying a partial backport of b16942455193 ("ima: use the lsm
>>>>>> policy update notifier"), I took a closer look into it and if we rip off
>>>>>> the RCU and the notifier part, there would be a potential UAF issue when
>>>>>> multiple processes are calling ima_lsm_update_rule() and
>>>>>> ima_match_rules() at the same time. ima_lsm_update_rule() would free the
>>>>>> old rule if the new rule is successfully copied and initialized, leading
>>>>>> to ima_match_rules() accessing a freed rule.
>>>>>>
>>>>>> To reserve the mainline solution, we would have to either introduce RCU
>>>>>> for rule access, which would work better with notifier mechanism or the
>>>>>> same rule would be updated multiple times, or we would have to introduce
>>>>>> a lock for LSM based rule update.
>>>>>
>>>>> Even with the RCU changes, the rules will be updated multiple times. 
>>>>> With your "ima: Handle -ESTALE returned by ima_filter_rule_match()"
>>>>> patch, upstream makes a single local copy of the rule to avoid updating
>>>>> it multiple times.  Without the notifier, it's updating all the rules.
>>>> That's true. However, in the mainline solution, we are only making a
>>>> local copy of the rule. In 4.19, because of the lazy update mechanism,
>>>> we are replacing the rule on the rule list multiple times and is trying
>>>> to free the original rule.
>>>>>
>>>>> Perhaps an atomic variable to detect if the rules are already being
>>>>> updated would suffice.  If the atomic variable is set, make a single
>>>>> local copy of the rule.
>>>> That should do it. I'll send a patch set soon.
>>>>
>>> Including Huaxin Lu in the loop. Sorry for forgotten about it for quite
>>> some time.
>>>
>>> I tried the backported solution, it seems that it's causing RCU stall.
>>> It seems on 4.19.y IMA is already accessing rules through RCU. Still
>>> debugging it.
>> It seems that after the backport, a NULL pointer deference pops out.
>> I'll have to look into it.
>>
> It seems that any other means except from a full RCU or locking won't be
> able to prevent race condition between policy update and rule match. Any
> other suggestions?
Correction: full RCU won't be enough as RCU won't work well without
notifier. My suggestion would be to backport the notifier mechanism.

-- 
Best
GUO Zihua


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS
  2022-12-16  3:04                   ` Paul Moore
  2022-12-19  7:10                     ` Guozihua (Scott)
@ 2023-01-06  1:05                     ` Mimi Zohar
  1 sibling, 0 replies; 30+ messages in thread
From: Mimi Zohar @ 2023-01-06  1:05 UTC (permalink / raw)
  To: Paul Moore, Guozihua (Scott)
  Cc: dmitry.kasatkin, sds, eparis, Greg KH, sashal, selinux,
	linux-integrity, stable

On Thu, 2022-12-15 at 22:04 -0500, Paul Moore wrote:
> On Thu, Dec 15, 2022 at 9:36 PM Guozihua (Scott) <guozihua@huawei.com> wrote:
> > On 2022/12/16 5:04, Paul Moore wrote:
> 
> ...
> 
> > > How bad is the backport really?  Perhaps it is worth doing it to see
> > > what it looks like?
> > >
> > It might not be that bad, I'll try to post a version next Monday.
> 
> Thanks for giving it a shot.

FYI, in the end backporting the atomic to blocking LSM notifier change
was the best solution.  Other than one minor correction, v6 of the
"ima: Fix IMA mishandling of LSM based rule during" looks good.

-- 
thanks,

Mimi


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2023-01-06  1:06 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-09  7:00 [RFC] IMA LSM based rule race condition issue on 4.19 LTS Guozihua (Scott)
2022-12-09  7:12 ` Greg KH
2022-12-09  7:53   ` Guozihua (Scott)
2022-12-09  8:46     ` Greg KH
2022-12-09  8:59       ` Guozihua (Scott)
2022-12-09  9:00         ` Greg KH
2022-12-09  9:11           ` Guozihua (Scott)
2022-12-09  9:22             ` Greg KH
2022-12-09  9:32               ` Guozihua (Scott)
2022-12-09  9:38                 ` Guozihua (Scott)
2022-12-09 10:27                   ` Greg KH
2022-12-12  2:39                     ` Guozihua (Scott)
2022-12-13 15:30 ` Mimi Zohar
2022-12-14  1:33   ` Guozihua (Scott)
2022-12-14 12:19     ` Mimi Zohar
2022-12-15  8:51       ` Guozihua (Scott)
2022-12-15 10:49         ` Mimi Zohar
2022-12-15 13:15           ` Guozihua (Scott)
2022-12-15 14:30             ` Mimi Zohar
2022-12-15 21:04               ` Paul Moore
2022-12-16  2:36                 ` Guozihua (Scott)
2022-12-16  3:04                   ` Paul Moore
2022-12-19  7:10                     ` Guozihua (Scott)
2022-12-19 13:11                       ` Mimi Zohar
2022-12-20  1:11                         ` Guozihua (Scott)
2022-12-21 10:51                           ` Guozihua (Scott)
2022-12-23  8:04                             ` Guozihua (Scott)
2022-12-24  3:41                               ` Guozihua (Scott)
2022-12-24  7:47                                 ` Guozihua (Scott)
2023-01-06  1:05                     ` Mimi Zohar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).