[PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications

selinux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
@ 2019-01-27  8:10 Leon Romanovsky
  2019-01-28 16:37 ` Paul Moore
  0 siblings, 1 reply; 19+ messages in thread
From: Leon Romanovsky @ 2019-01-27  8:10 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Daniel Jurgens, RDMA mailing list, selinux, paul, ddutile,
	Leon Romanovsky

From: Daniel Jurgens <danielj@mellanox.com>

When creating many MAD agents in a short period of time receive packet
processing can be delayed long enough to cause timeouts while new agents
are being added to the atomic notifier chain with IRQs disabled.
Notifier chain registration and unregstration is an O(n) operation. With
large numbers of MAD agents being created and destroyed simultaneously
the CPUs spend too much time with interrupts disabled.

After this change previously granted access for MAD agents will not be
revoked if there is a relevant security policy change. This behavior is
already the case for most things controlled by a security policy.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/security.c | 34 ++++--------------------------
 include/rdma/ib_mad.h              |  3 ---
 2 files changed, 4 insertions(+), 33 deletions(-)

diff --git a/drivers/infiniband/core/security.c b/drivers/infiniband/core/security.c
index 1efadbccf394..73598acb518a 100644
--- a/drivers/infiniband/core/security.c
+++ b/drivers/infiniband/core/security.c
@@ -676,21 +676,6 @@ static int ib_security_pkey_access(struct ib_device *dev,
 	return security_ib_pkey_access(sec, subnet_prefix, pkey);
 }

-static int ib_mad_agent_security_change(struct notifier_block *nb,
-					unsigned long event,
-					void *data)
-{
-	struct ib_mad_agent *ag = container_of(nb, struct ib_mad_agent, lsm_nb);
-
-	if (event != LSM_POLICY_CHANGE)
-		return NOTIFY_DONE;
-
-	ag->smp_allowed = !security_ib_endport_manage_subnet(
-		ag->security, dev_name(&ag->device->dev), ag->port_num);
-
-	return NOTIFY_OK;
-}
-
 int ib_mad_agent_security_setup(struct ib_mad_agent *agent,
 				enum ib_qp_type qp_type)
 {
@@ -710,16 +695,9 @@ int ib_mad_agent_security_setup(struct ib_mad_agent *agent,
 						dev_name(&agent->device->dev),
 						agent->port_num);
 	if (ret)
-		return ret;
+		security_ib_free_security(agent->security);

-	agent->lsm_nb.notifier_call = ib_mad_agent_security_change;
-	ret = register_lsm_notifier(&agent->lsm_nb);
-	if (ret)
-		return ret;
-
-	agent->smp_allowed = true;
-	agent->lsm_nb_reg = true;
-	return 0;
+	return ret;
 }

 void ib_mad_agent_security_cleanup(struct ib_mad_agent *agent)
@@ -728,8 +706,6 @@ void ib_mad_agent_security_cleanup(struct ib_mad_agent *agent)
 		return;

 	security_ib_free_security(agent->security);
-	if (agent->lsm_nb_reg)
-		unregister_lsm_notifier(&agent->lsm_nb);
 }

 int ib_mad_enforce_security(struct ib_mad_agent_private *map, u16 pkey_index)
@@ -737,11 +713,9 @@ int ib_mad_enforce_security(struct ib_mad_agent_private *map, u16 pkey_index)
 	if (!rdma_protocol_ib(map->agent.device, map->agent.port_num))
 		return 0;

-	if (map->agent.qp->qp_type == IB_QPT_SMI) {
-		if (!map->agent.smp_allowed)
-			return -EACCES;
+	/* SMI agent enforcement is done during agent creation */
+	if (map->agent.qp->qp_type == IB_QPT_SMI)
 		return 0;
-	}

 	return ib_security_pkey_access(map->agent.device,
 				       map->agent.port_num,
diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h
index fdef558e3a2d..12543e09c3ed 100644
--- a/include/rdma/ib_mad.h
+++ b/include/rdma/ib_mad.h
@@ -619,9 +619,6 @@ struct ib_mad_agent {
 	u8			port_num;
 	u8			rmpp_version;
 	void			*security;
-	bool			smp_allowed;
-	bool			lsm_nb_reg;
-	struct notifier_block   lsm_nb;
 };

 /**
--
2.19.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-01-27  8:10 [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications Leon Romanovsky
@ 2019-01-28 16:37 ` Paul Moore
  2019-01-28 16:57   ` Daniel Jurgens
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Moore @ 2019-01-28 16:37 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Jason Gunthorpe, Daniel Jurgens, RDMA mailing list,
	selinux, ddutile, Leon Romanovsky

On Sun, Jan 27, 2019 at 3:10 AM Leon Romanovsky <leon@kernel.org> wrote:
> From: Daniel Jurgens <danielj@mellanox.com>
>
> When creating many MAD agents in a short period of time receive packet
> processing can be delayed long enough to cause timeouts while new agents
> are being added to the atomic notifier chain with IRQs disabled.
> Notifier chain registration and unregstration is an O(n) operation. With
> large numbers of MAD agents being created and destroyed simultaneously
> the CPUs spend too much time with interrupts disabled.
>
> After this change previously granted access for MAD agents will not be
> revoked if there is a relevant security policy change. This behavior is
> already the case for most things controlled by a security policy.
>
> Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  drivers/infiniband/core/security.c | 34 ++++--------------------------
>  include/rdma/ib_mad.h              |  3 ---
>  2 files changed, 4 insertions(+), 33 deletions(-)

Perhaps predictably, I'm not very excited about this change.  Have you
looked closer into the slowdown to see where the cycles are being
spent?  I'm wondering if the issue is that a large number of notifiers
are being registered with the same priority causing the while loop in
notifier_chain_register() to take a significant amount of time.

> diff --git a/drivers/infiniband/core/security.c b/drivers/infiniband/core/security.c
> index 1efadbccf394..73598acb518a 100644
> --- a/drivers/infiniband/core/security.c
> +++ b/drivers/infiniband/core/security.c
> @@ -676,21 +676,6 @@ static int ib_security_pkey_access(struct ib_device *dev,
>         return security_ib_pkey_access(sec, subnet_prefix, pkey);
>  }
>
> -static int ib_mad_agent_security_change(struct notifier_block *nb,
> -                                       unsigned long event,
> -                                       void *data)
> -{
> -       struct ib_mad_agent *ag = container_of(nb, struct ib_mad_agent, lsm_nb);
> -
> -       if (event != LSM_POLICY_CHANGE)
> -               return NOTIFY_DONE;
> -
> -       ag->smp_allowed = !security_ib_endport_manage_subnet(
> -               ag->security, dev_name(&ag->device->dev), ag->port_num);
> -
> -       return NOTIFY_OK;
> -}
> -
>  int ib_mad_agent_security_setup(struct ib_mad_agent *agent,
>                                 enum ib_qp_type qp_type)
>  {
> @@ -710,16 +695,9 @@ int ib_mad_agent_security_setup(struct ib_mad_agent *agent,
>                                                 dev_name(&agent->device->dev),
>                                                 agent->port_num);
>         if (ret)
> -               return ret;
> +               security_ib_free_security(agent->security);
>
> -       agent->lsm_nb.notifier_call = ib_mad_agent_security_change;
> -       ret = register_lsm_notifier(&agent->lsm_nb);
> -       if (ret)
> -               return ret;
> -
> -       agent->smp_allowed = true;
> -       agent->lsm_nb_reg = true;
> -       return 0;
> +       return ret;
>  }
>
>  void ib_mad_agent_security_cleanup(struct ib_mad_agent *agent)
> @@ -728,8 +706,6 @@ void ib_mad_agent_security_cleanup(struct ib_mad_agent *agent)
>                 return;
>
>         security_ib_free_security(agent->security);
> -       if (agent->lsm_nb_reg)
> -               unregister_lsm_notifier(&agent->lsm_nb);
>  }
>
>  int ib_mad_enforce_security(struct ib_mad_agent_private *map, u16 pkey_index)
> @@ -737,11 +713,9 @@ int ib_mad_enforce_security(struct ib_mad_agent_private *map, u16 pkey_index)
>         if (!rdma_protocol_ib(map->agent.device, map->agent.port_num))
>                 return 0;
>
> -       if (map->agent.qp->qp_type == IB_QPT_SMI) {
> -               if (!map->agent.smp_allowed)
> -                       return -EACCES;
> +       /* SMI agent enforcement is done during agent creation */
> +       if (map->agent.qp->qp_type == IB_QPT_SMI)
>                 return 0;
> -       }
>
>         return ib_security_pkey_access(map->agent.device,
>                                        map->agent.port_num,
> diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h
> index fdef558e3a2d..12543e09c3ed 100644
> --- a/include/rdma/ib_mad.h
> +++ b/include/rdma/ib_mad.h
> @@ -619,9 +619,6 @@ struct ib_mad_agent {
>         u8                      port_num;
>         u8                      rmpp_version;
>         void                    *security;
> -       bool                    smp_allowed;
> -       bool                    lsm_nb_reg;
> -       struct notifier_block   lsm_nb;
>  };
>
>  /**
> --
> 2.19.1

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-01-28 16:37 ` Paul Moore
@ 2019-01-28 16:57   ` Daniel Jurgens
  2019-01-28 23:03     ` Paul Moore
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Jurgens @ 2019-01-28 16:57 UTC (permalink / raw)
  To: Paul Moore, Leon Romanovsky
  Cc: Doug Ledford, Jason Gunthorpe, RDMA mailing list, selinux,
	ddutile, Leon Romanovsky

On 1/28/2019 10:37 AM, Paul Moore wrote:
> On Sun, Jan 27, 2019 at 3:10 AM Leon Romanovsky <leon@kernel.org> wrote:
>> From: Daniel Jurgens <danielj@mellanox.com>
>>
>> ---
>>  drivers/infiniband/core/security.c | 34 ++++--------------------------
>>  include/rdma/ib_mad.h              |  3 ---
>>  2 files changed, 4 insertions(+), 33 deletions(-)
> Perhaps predictably, I'm not very excited about this change.  Have you
> looked closer into the slowdown to see where the cycles are being
> spent?  I'm wondering if the issue is that a large number of notifiers
> are being registered with the same priority causing the while loop in
> notifier_chain_register() to take a significant amount of time.

That's what's happening, each MAD agent is registering it's own notifier. The bug reporter was creating hundreds or thousands of  short lived MAD agents. With IRQs disabled too long it resulted in timeouts.

When I initially added the notifier mechanism I thought it was you that said it wasn't really needed, since access wasn't generally revoked in these types of scenarios. Given that I didn't think this would be especially controversial. It was nice to have, unfortunately it causes problems even for users that don't enable SELinux.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-01-28 16:57   ` Daniel Jurgens
@ 2019-01-28 23:03     ` Paul Moore
  2019-01-29  2:57       ` Don Dutile
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Moore @ 2019-01-28 23:03 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	RDMA mailing list, selinux, ddutile, Leon Romanovsky

On Mon, Jan 28, 2019 at 11:57 AM Daniel Jurgens <danielj@mellanox.com> wrote:
> On 1/28/2019 10:37 AM, Paul Moore wrote:
> > On Sun, Jan 27, 2019 at 3:10 AM Leon Romanovsky <leon@kernel.org> wrote:
> >> From: Daniel Jurgens <danielj@mellanox.com>
> >>
> >> ---
> >>  drivers/infiniband/core/security.c | 34 ++++--------------------------
> >>  include/rdma/ib_mad.h              |  3 ---
> >>  2 files changed, 4 insertions(+), 33 deletions(-)
> > Perhaps predictably, I'm not very excited about this change.  Have you
> > looked closer into the slowdown to see where the cycles are being
> > spent?  I'm wondering if the issue is that a large number of notifiers
> > are being registered with the same priority causing the while loop in
> > notifier_chain_register() to take a significant amount of time.
>
> That's what's happening, each MAD agent is registering it's own notifier. The bug reporter was creating hundreds or thousands of  short lived MAD agents. With IRQs disabled too long it resulted in timeouts.
>
> When I initially added the notifier mechanism I thought it was you that said it wasn't really needed, since access wasn't generally revoked in these types of scenarios. Given that I didn't think this would be especially controversial. It was nice to have, unfortunately it causes problems even for users that don't enable SELinux.

Revoking permission is difficult, and in some cases likely impossible,
but that doesn't mean we shouldn't try to make it possible when we
can.  I'd like to see if we can sort this out before we give up and
rip it out.

We might be able to modify notifier_chain_register() such that if the
notifier being registered has the same priority as the current node it
is inserted before the current node.  It looks like all of the IB
notifiers have the same priority so that should speed up registration
significantly in this case, unfortunately that doesn't help
unregistering.  I think we would need to move to convert
notifier_block to use list_head if we want to handle removal in a
timely manner.

However, there is also a concern that delivering notifications to
hundreds of thousands of registered notifiers is going to be
problematic.  None of the above is going to fix that.

I'm trying to quickly understand the MAD agent lifecycle, and it looks
like you have your own register/unregister routines, with locking, so
is it reasonable to assume that it would be possible to iterate over
the MAD agents in the IB code?  I wonder if it would be possible to
group MAD agents (per-port grouping, does that make sense?) such that
several agents would share a single LSM notifier registration?

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-01-28 23:03     ` Paul Moore
@ 2019-01-29  2:57       ` Don Dutile
  2019-01-29 17:30         ` Paul Moore
  0 siblings, 1 reply; 19+ messages in thread
From: Don Dutile @ 2019-01-29  2:57 UTC (permalink / raw)
  To: Paul Moore, Daniel Jurgens
  Cc: Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	RDMA mailing list, selinux, Leon Romanovsky

On 01/28/2019 06:03 PM, Paul Moore wrote:
> On Mon, Jan 28, 2019 at 11:57 AM Daniel Jurgens <danielj@mellanox.com> wrote:
>> On 1/28/2019 10:37 AM, Paul Moore wrote:
>>> On Sun, Jan 27, 2019 at 3:10 AM Leon Romanovsky <leon@kernel.org> wrote:
>>>> From: Daniel Jurgens <danielj@mellanox.com>
>>>>
>>>> ---
>>>>   drivers/infiniband/core/security.c | 34 ++++--------------------------
>>>>   include/rdma/ib_mad.h              |  3 ---
>>>>   2 files changed, 4 insertions(+), 33 deletions(-)
>>> Perhaps predictably, I'm not very excited about this change.  Have you
>>> looked closer into the slowdown to see where the cycles are being
>>> spent?  I'm wondering if the issue is that a large number of notifiers
>>> are being registered with the same priority causing the while loop in
>>> notifier_chain_register() to take a significant amount of time.
>>
>> That's what's happening, each MAD agent is registering it's own notifier. The bug reporter was creating hundreds or thousands of  short lived MAD agents. With IRQs disabled too long it resulted in timeouts.
>>
>> When I initially added the notifier mechanism I thought it was you that said it wasn't really needed, since access wasn't generally revoked in these types of scenarios. Given that I didn't think this would be especially controversial. It was nice to have, unfortunately it causes problems even for users that don't enable SELinux.
> 
> Revoking permission is difficult, and in some cases likely impossible,
> but that doesn't mean we shouldn't try to make it possible when we
> can.  I'd like to see if we can sort this out before we give up and
> rip it out.
> 
Agree that it'd be nice to see it work, but since few (no other?) subsys registered
with selinux/security is doing this, can we 'release' the IB mad agents from this
'imprisonment' and figure out a general solution that resolves this issue, and
then it can be implemented not only for ib mad packet access, but for other
subsys using selinux/security, and wanting the same revocation support.

> We might be able to modify notifier_chain_register() such that if the
> notifier being registered has the same priority as the current node it
> is inserted before the current node.  It looks like all of the IB
> notifiers have the same priority so that should speed up registration
> significantly in this case, unfortunately that doesn't help
> unregistering.  I think we would need to move to convert
> notifier_block to use list_head if we want to handle removal in a
> timely manner.
> 
> However, there is also a concern that delivering notifications to
> hundreds of thousands of registered notifiers is going to be
> problematic.  None of the above is going to fix that.
> 
sooo....

> I'm trying to quickly understand the MAD agent lifecycle, and it looks
> like you have your own register/unregister routines, with locking, so
> is it reasonable to assume that it would be possible to iterate over
> the MAD agents in the IB code?  I wonder if it would be possible to
> group MAD agents (per-port grouping, does that make sense?) such that
> several agents would share a single LSM notifier registration?
> 
one can have numerous MAD agents.  it isn't just one.
Think: you can have any number of apps scanning the fabric for various stats
        to draw node/graph pics, gather per-port packet counts, gather bw numbers,
        etc.  For IB, scanning the fabric involves 'mad pkts'.
        In addition to scanning, one could do 'less then nice things' to switches,
        not just read stats but set parameters... like routing table entries.
        and thus, the security level addition.
       One is highly unlikely to grant access to an app to do mad pkt generation/use,
       then change your mind (semi-?)randomly to enable/disable it.... generally always enabled,
       or always disabled.

Let's take this case as a catalyst to resolve a previously unknown perf issue,
and not hold it up to a higher functional requirement then other apps that want to be secured.

thanks...

-dd


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-01-29  2:57       ` Don Dutile
@ 2019-01-29 17:30         ` Paul Moore
  2019-01-29 17:48           ` Daniel Jurgens
  2019-02-01 14:21           ` Don Dutile
  0 siblings, 2 replies; 19+ messages in thread
From: Paul Moore @ 2019-01-29 17:30 UTC (permalink / raw)
  To: Don Dutile
  Cc: Daniel Jurgens, Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	RDMA mailing list, selinux, Leon Romanovsky

On Mon, Jan 28, 2019 at 9:57 PM Don Dutile <ddutile@redhat.com> wrote:
> On 01/28/2019 06:03 PM, Paul Moore wrote:
> > On Mon, Jan 28, 2019 at 11:57 AM Daniel Jurgens <danielj@mellanox.com> wrote:
> >> On 1/28/2019 10:37 AM, Paul Moore wrote:
> >>> On Sun, Jan 27, 2019 at 3:10 AM Leon Romanovsky <leon@kernel.org> wrote:
> >>>> From: Daniel Jurgens <danielj@mellanox.com>
> >>>>
> >>>> ---
> >>>>   drivers/infiniband/core/security.c | 34 ++++--------------------------
> >>>>   include/rdma/ib_mad.h              |  3 ---
> >>>>   2 files changed, 4 insertions(+), 33 deletions(-)
> >>> Perhaps predictably, I'm not very excited about this change.  Have you
> >>> looked closer into the slowdown to see where the cycles are being
> >>> spent?  I'm wondering if the issue is that a large number of notifiers
> >>> are being registered with the same priority causing the while loop in
> >>> notifier_chain_register() to take a significant amount of time.
> >>
> >> That's what's happening, each MAD agent is registering it's own notifier. The bug reporter was creating hundreds or thousands of  short lived MAD agents. With IRQs disabled too long it resulted in timeouts.
> >>
> >> When I initially added the notifier mechanism I thought it was you that said it wasn't really needed, since access wasn't generally revoked in these types of scenarios. Given that I didn't think this would be especially controversial. It was nice to have, unfortunately it causes problems even for users that don't enable SELinux.
> >
> > Revoking permission is difficult, and in some cases likely impossible,
> > but that doesn't mean we shouldn't try to make it possible when we
> > can.  I'd like to see if we can sort this out before we give up and
> > rip it out.
>
> Agree that it'd be nice to see it work, but since few (no other?) subsys registered
> with selinux/security is doing this, can we 'release' the IB mad agents from this
> 'imprisonment' and figure out a general solution that resolves this issue, and
> then it can be implemented not only for ib mad packet access, but for other
> subsys using selinux/security, and wanting the same revocation support.

If you scroll up, ever so slightly to my last reply (it's just a few
lines, I'll wait), you'll see where I said "I'd like to see if we can
sort this out before we give up and rip it out."

I don't recall at the moment why the notifier approach was taken with
the IB hooks back when Daniel wrote the code, and I'm definitely not
sure what other subsystems you are using for your comparisons here,
but I can tell you that I am beyond sick and tired of hearing people
immediately jump to "rip out this security code" because of some
performance glitch.  I care about performance too, okay?  We've got a
bunch of clever people here, let's spend some time and see if we can
improve what we've got before we make some knee jerk reaction and just
rip out the access controls.

> > I'm trying to quickly understand the MAD agent lifecycle, and it looks
> > like you have your own register/unregister routines, with locking, so
> > is it reasonable to assume that it would be possible to iterate over
> > the MAD agents in the IB code?  I wonder if it would be possible to
> > group MAD agents (per-port grouping, does that make sense?) such that
> > several agents would share a single LSM notifier registration?
> >
> one can have numerous MAD agents.  it isn't just one.

Clearly.  Daniel already mentioned hundreds of thousands of MAD agents.

> Think: you can have any number of apps scanning the fabric for various stats
>         to draw node/graph pics, gather per-port packet counts, gather bw numbers,
>         etc.  For IB, scanning the fabric involves 'mad pkts'.
>         In addition to scanning, one could do 'less then nice things' to switches,
>         not just read stats but set parameters... like routing table entries.
>         and thus, the security level addition.
>        One is highly unlikely to grant access to an app to do mad pkt generation/use,
>        then change your mind (semi-?)randomly to enable/disable it.... generally always enabled,
>        or always disabled.
>
> Let's take this case as a catalyst to resolve a previously unknown perf issue,
> and not hold it up to a higher functional requirement then other apps that want to be secured.

See my comment above.  I don't even know what you are talking about
regarding "higher functional arguments", what other subsystems?
Seriously Don, language like this causes an extreme allergic reaction.

Anyway, let's move on to something a bit more constructive, shall we?

From what I've gathered from this thread the issue stems from the
ib_mad_agent_security_setup() where we register the
ib_mad_agent_security_change() callback with the LSM notification
mechanism.  Looking at the ib_mad_agent_security_change() function,
all it seems to do is perform a LSM permission check
(security_ib_endport_manage_subnet()) and store the result in
ib_mad_agent->smp_allowed; quickly grep'ing through the code, the only
place where I see that value used is in ib_mad_enforce_security().

Looking at ib_mad_enforce_security(), it seems to have access to the
necessary MAD agent information, via ib_mad_agent_private->agent, to
perform the LSM permission check directly in this function.  Could we
not get rid of the MAD agent LSM notification callback and simply do
the LSM access check directly in ib_mad_enforce_security()?  What am I
missing?

I do see that the SELinux implementation of the
security_ib_endport_manage_subnet() LSM hook doesn't seem to cache the
IB endport labels.  If that proves to be a performance issue we can
create a cache for that (like we do for the pkeys and a few other
objects).

I barely understand IB, so it is very possible there is a fundamental
flaw in the logic above, but let's try to work together and figure
something out.  The above may not be it, but I'm not ready to throw in
the towel.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-01-29 17:30         ` Paul Moore
@ 2019-01-29 17:48           ` Daniel Jurgens
  2019-01-29 18:02             ` Daniel Jurgens
  2019-02-01 14:21           ` Don Dutile
  1 sibling, 1 reply; 19+ messages in thread
From: Daniel Jurgens @ 2019-01-29 17:48 UTC (permalink / raw)
  To: Paul Moore, Don Dutile
  Cc: Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	RDMA mailing list, selinux, Leon Romanovsky


On 1/29/2019 11:30 AM, Paul Moore wrote:
> On Mon, Jan 28, 2019 at 9:57 PM Don Dutile <ddutile@redhat.com> wrote:
>> On 01/28/2019 06:03 PM, Paul Moore wrote:
>>> On Mon, Jan 28, 2019 at 11:57 AM Daniel Jurgens <danielj@mellanox.com> wrote:
>>>> On 1/28/2019 10:37 AM, Paul Moore wrote:
>>>>> On Sun, Jan 27, 2019 at 3:10 AM Leon Romanovsky <leon@kernel.org> wrote:
>>>>>> From: Daniel Jurgens <danielj@mellanox.com>
>>>>>>
>>>>>> ---
>>>>>>   drivers/infiniband/core/security.c | 34 ++++--------------------------
>>>>>>   include/rdma/ib_mad.h              |  3 ---
>>>>>>   2 files changed, 4 insertions(+), 33 deletions(-)
>>>>>
>>> I'm trying to quickly understand the MAD agent lifecycle, and it looks
>>> like you have your own register/unregister routines, with locking, so
>>> is it reasonable to assume that it would be possible to iterate over
>>> the MAD agents in the IB code?  I wonder if it would be possible to
>>> group MAD agents (per-port grouping, does that make sense?) such that
>>> several agents would share a single LSM notifier registration?
>>>
>> one can have numerous MAD agents.  it isn't just one.
> Clearly.  Daniel already mentioned hundreds of thousands of MAD agents.
Just to be clear, it was hundreds to thousands, not hundreds of thousands.
>
>> Think: you can have any number of apps scanning the fabric for various stats
>>         to draw node/graph pics, gather per-port packet counts, gather bw numbers,
>>         etc.  For IB, scanning the fabric involves 'mad pkts'.
>>         In addition to scanning, one could do 'less then nice things' to switches,
>>         not just read stats but set parameters... like routing table entries.
>>         and thus, the security level addition.
>>        One is highly unlikely to grant access to an app to do mad pkt generation/use,
>>        then change your mind (semi-?)randomly to enable/disable it.... generally always enabled,
>>        or always disabled.
>>
>> Let's take this case as a catalyst to resolve a previously unknown perf issue,
>> and not hold it up to a higher functional requirement then other apps that want to be secured.
> See my comment above.  I don't even know what you are talking about
> regarding "higher functional arguments", what other subsystems?
> Seriously Don, language like this causes an extreme allergic reaction.
>
> Anyway, let's move on to something a bit more constructive, shall we?
>
> From what I've gathered from this thread the issue stems from the
> ib_mad_agent_security_setup() where we register the
> ib_mad_agent_security_change() callback with the LSM notification
> mechanism.  Looking at the ib_mad_agent_security_change() function,
> all it seems to do is perform a LSM permission check
> (security_ib_endport_manage_subnet()) and store the result in
> ib_mad_agent->smp_allowed; quickly grep'ing through the code, the only
> place where I see that value used is in ib_mad_enforce_security().
>
> Looking at ib_mad_enforce_security(), it seems to have access to the
> necessary MAD agent information, via ib_mad_agent_private->agent, to
> perform the LSM permission check directly in this function.  Could we
> not get rid of the MAD agent LSM notification callback and simply do
> the LSM access check directly in ib_mad_enforce_security()?  What am I
> missing?
>
> I do see that the SELinux implementation of the
> security_ib_endport_manage_subnet() LSM hook doesn't seem to cache the
> IB endport labels.  If that proves to be a performance issue we can
> create a cache for that (like we do for the pkeys and a few other
> objects).
>
> I barely understand IB, so it is very possible there is a fundamental
> flaw in the logic above, but let's try to work together and figure
> something out.  The above may not be it, but I'm not ready to throw in
> the towel.
I think that's a reasonable idea, also holding agents lists in the IB layer and only registering for one notifier could work as well (it's already registered for the PKeys). Jason had suggested that internally, but the problem is I wasn't really able to reproduce the reported issue, so we're kind of flying blind either way in terms of knowing if either approach addresses the perf problem.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-01-29 17:48           ` Daniel Jurgens
@ 2019-01-29 18:02             ` Daniel Jurgens
  2019-01-29 20:51               ` Paul Moore
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Jurgens @ 2019-01-29 18:02 UTC (permalink / raw)
  To: Paul Moore, Don Dutile
  Cc: Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	RDMA mailing list, selinux, Leon Romanovsky


On 1/29/2019 11:48 AM, Daniel Jurgens wrote:
> On 1/29/2019 11:30 AM, Paul Moore wrote:
>> On Mon, Jan 28, 2019 at 9:57 PM Don Dutile <ddutile@redhat.com> wrote:
>>> On 01/28/2019 06:03 PM, Paul Moore wrote:
>>>> On Mon, Jan 28, 2019 at 11:57 AM Daniel Jurgens <danielj@mellanox.com> wrote:
>>>>> On 1/28/2019 10:37 AM, Paul Moore wrote:
>>>>>> On Sun, Jan 27, 2019 at 3:10 AM Leon Romanovsky <leon@kernel.org> wrote:
>>>>>>> From: Daniel Jurgens <danielj@mellanox.com>
>>>>>>>
>>>>>>> ---
>>>>>>>   drivers/infiniband/core/security.c | 34 ++++--------------------------
>>>>>>>   include/rdma/ib_mad.h              |  3 ---
>>>>>>>   2 files changed, 4 insertions(+), 33 deletions(-)
>>>> I'm trying to quickly understand the MAD agent lifecycle, and it looks
>>>> like you have your own register/unregister routines, with locking, so
>>>> is it reasonable to assume that it would be possible to iterate over
>>>> the MAD agents in the IB code?  I wonder if it would be possible to
>>>> group MAD agents (per-port grouping, does that make sense?) such that
>>>> several agents would share a single LSM notifier registration?
>>>>
>>> one can have numerous MAD agents.  it isn't just one.
>> Clearly.  Daniel already mentioned hundreds of thousands of MAD agents.
> Just to be clear, it was hundreds to thousands, not hundreds of thousands.
>>> Think: you can have any number of apps scanning the fabric for various stats
>>>         to draw node/graph pics, gather per-port packet counts, gather bw numbers,
>>>         etc.  For IB, scanning the fabric involves 'mad pkts'.
>>>         In addition to scanning, one could do 'less then nice things' to switches,
>>>         not just read stats but set parameters... like routing table entries.
>>>         and thus, the security level addition.
>>>        One is highly unlikely to grant access to an app to do mad pkt generation/use,
>>>        then change your mind (semi-?)randomly to enable/disable it.... generally always enabled,
>>>        or always disabled.
>>>
>>> Let's take this case as a catalyst to resolve a previously unknown perf issue,
>>> and not hold it up to a higher functional requirement then other apps that want to be secured.
>> See my comment above.  I don't even know what you are talking about
>> regarding "higher functional arguments", what other subsystems?
>> Seriously Don, language like this causes an extreme allergic reaction.
>>
>> Anyway, let's move on to something a bit more constructive, shall we?
>>
>> From what I've gathered from this thread the issue stems from the
>> ib_mad_agent_security_setup() where we register the
>> ib_mad_agent_security_change() callback with the LSM notification
>> mechanism.  Looking at the ib_mad_agent_security_change() function,
>> all it seems to do is perform a LSM permission check
>> (security_ib_endport_manage_subnet()) and store the result in
>> ib_mad_agent->smp_allowed; quickly grep'ing through the code, the only
>> place where I see that value used is in ib_mad_enforce_security().
>>
>> Looking at ib_mad_enforce_security(), it seems to have access to the
>> necessary MAD agent information, via ib_mad_agent_private->agent, to
>> perform the LSM permission check directly in this function.  Could we
>> not get rid of the MAD agent LSM notification callback and simply do
>> the LSM access check directly in ib_mad_enforce_security()?  What am I
>> missing?
>>
>> I do see that the SELinux implementation of the
>> security_ib_endport_manage_subnet() LSM hook doesn't seem to cache the
>> IB endport labels.  If that proves to be a performance issue we can
>> create a cache for that (like we do for the pkeys and a few other
>> objects).
>>
>> I barely understand IB, so it is very possible there is a fundamental
>> flaw in the logic above, but let's try to work together and figure
>> something out.  The above may not be it, but I'm not ready to throw in
>> the towel.
> I think that's a reasonable idea, also holding agents lists in the IB layer and only registering for one notifier could work as well (it's already registered for the PKeys). Jason had suggested that internally, but the problem is I wasn't really able to reproduce the reported issue, so we're kind of flying blind either way in terms of knowing if either approach addresses the perf problem.

 I guess I should clarify. Obviously if we only register for the notifier once and hold the list(s) in the IB layer we won't have issues with register/unregister, but there may still be a problem lurking during a notification event. So this is less bad, at least it wouldn't affect non SElinux users.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-01-29 18:02             ` Daniel Jurgens
@ 2019-01-29 20:51               ` Paul Moore
  2019-01-29 20:58                 ` Daniel Jurgens
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Moore @ 2019-01-29 20:51 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: Don Dutile, Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	RDMA mailing list, selinux, Leon Romanovsky

On Tue, Jan 29, 2019 at 1:02 PM Daniel Jurgens <danielj@mellanox.com> wrote:
> On 1/29/2019 11:48 AM, Daniel Jurgens wrote:
> > On 1/29/2019 11:30 AM, Paul Moore wrote:
> >> On Mon, Jan 28, 2019 at 9:57 PM Don Dutile <ddutile@redhat.com> wrote:
> >>> On 01/28/2019 06:03 PM, Paul Moore wrote:
> >>>> On Mon, Jan 28, 2019 at 11:57 AM Daniel Jurgens <danielj@mellanox.com> wrote:
> >>>>> On 1/28/2019 10:37 AM, Paul Moore wrote:
> >>>>>> On Sun, Jan 27, 2019 at 3:10 AM Leon Romanovsky <leon@kernel.org> wrote:
> >>>>>>> From: Daniel Jurgens <danielj@mellanox.com>
> >>>>>>>
> >>>>>>> ---
> >>>>>>>   drivers/infiniband/core/security.c | 34 ++++--------------------------
> >>>>>>>   include/rdma/ib_mad.h              |  3 ---
> >>>>>>>   2 files changed, 4 insertions(+), 33 deletions(-)
> >>>> I'm trying to quickly understand the MAD agent lifecycle, and it looks
> >>>> like you have your own register/unregister routines, with locking, so
> >>>> is it reasonable to assume that it would be possible to iterate over
> >>>> the MAD agents in the IB code?  I wonder if it would be possible to
> >>>> group MAD agents (per-port grouping, does that make sense?) such that
> >>>> several agents would share a single LSM notifier registration?
> >>>>
> >>> one can have numerous MAD agents.  it isn't just one.
> >> Clearly.  Daniel already mentioned hundreds of thousands of MAD agents.
> > Just to be clear, it was hundreds to thousands, not hundreds of thousands.

Ah, sorry about that, I guess my mind multiplied it by a few orders of
magnitude.  Regardless, even a few hundred is a lot for a full list
traversal.

> >> From what I've gathered from this thread the issue stems from the
> >> ib_mad_agent_security_setup() where we register the
> >> ib_mad_agent_security_change() callback with the LSM notification
> >> mechanism.  Looking at the ib_mad_agent_security_change() function,
> >> all it seems to do is perform a LSM permission check
> >> (security_ib_endport_manage_subnet()) and store the result in
> >> ib_mad_agent->smp_allowed; quickly grep'ing through the code, the only
> >> place where I see that value used is in ib_mad_enforce_security().
> >>
> >> Looking at ib_mad_enforce_security(), it seems to have access to the
> >> necessary MAD agent information, via ib_mad_agent_private->agent, to
> >> perform the LSM permission check directly in this function.  Could we
> >> not get rid of the MAD agent LSM notification callback and simply do
> >> the LSM access check directly in ib_mad_enforce_security()?  What am I
> >> missing?
> >>
> >> I do see that the SELinux implementation of the
> >> security_ib_endport_manage_subnet() LSM hook doesn't seem to cache the
> >> IB endport labels.  If that proves to be a performance issue we can
> >> create a cache for that (like we do for the pkeys and a few other
> >> objects).
> >>
> >> I barely understand IB, so it is very possible there is a fundamental
> >> flaw in the logic above, but let's try to work together and figure
> >> something out.  The above may not be it, but I'm not ready to throw in
> >> the towel.
> > I think that's a reasonable idea, also holding agents lists in the IB layer and only registering for one notifier could work as well (it's already registered for the PKeys). Jason had suggested that internally, but the problem is I wasn't really able to reproduce the reported issue, so we're kind of flying blind either way in terms of knowing if either approach addresses the perf problem.
>
>  I guess I should clarify. Obviously if we only register for the notifier once and hold the list(s) in the IB layer we won't have issues with register/unregister, but there may still be a problem lurking during a notification event. So this is less bad, at least it wouldn't affect non SElinux users.

Okay, so let's attempt the change above where we just do the access
check directly.  Although I'm a little concerned that without a
reproducer we might not end up fixing the problem we're trying to fix.
Is anyone in touch with the person who originally reported the
problem?  It would be great if we could get that person to verify the
change ...

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-01-29 20:51               ` Paul Moore
@ 2019-01-29 20:58                 ` Daniel Jurgens
  2019-01-29 21:13                   ` Paul Moore
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Jurgens @ 2019-01-29 20:58 UTC (permalink / raw)
  To: Paul Moore
  Cc: Don Dutile, Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	RDMA mailing list, selinux, Leon Romanovsky

On 1/29/2019 2:51 PM, Paul Moore wrote:
>
> Okay, so let's attempt the change above where we just do the access
> check directly.  Although I'm a little concerned that without a
> reproducer we might not end up fixing the problem we're trying to fix.
> Is anyone in touch with the person who originally reported the
> problem?  It would be great if we could get that person to verify the
> change ...

I decided to go with maintaining a list in IB core. The notifier call is done under rcu_read_lock vs spin_lock_irq for register/unregister, so we shouldn't have any problems in that case. So only registering once basically achieves the same thing as taking it all out. I'm testing it now. I'll send it for internal review today assuming it checks out. Hopefully Leon can get it posted tomorrow, I know Don has some schedule pressure here.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-01-29 20:58                 ` Daniel Jurgens
@ 2019-01-29 21:13                   ` Paul Moore
  2019-02-01 13:57                     ` Paul Moore
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Moore @ 2019-01-29 21:13 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: Don Dutile, Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	RDMA mailing list, selinux, Leon Romanovsky

On Tue, Jan 29, 2019 at 3:58 PM Daniel Jurgens <danielj@mellanox.com> wrote:
> On 1/29/2019 2:51 PM, Paul Moore wrote:
> >
> > Okay, so let's attempt the change above where we just do the access
> > check directly.  Although I'm a little concerned that without a
> > reproducer we might not end up fixing the problem we're trying to fix.
> > Is anyone in touch with the person who originally reported the
> > problem?  It would be great if we could get that person to verify the
> > change ...
>
> I decided to go with maintaining a list in IB core. The notifier call is done under rcu_read_lock vs spin_lock_irq for register/unregister, so we shouldn't have any problems in that case. So only registering once basically achieves the same thing as taking it all out. I'm testing it now. I'll send it for internal review today assuming it checks out. Hopefully Leon can get it posted tomorrow, I know Don has some schedule pressure here.

Okay sounds good.  We're still at -rc4 so as long as we can get
something posted this week, or early next, I see no reason why it
can't make the upcoming merge window.

I'm guessing Don's schedule pressure is more a RH deadline, and not an
upstream constraint.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-01-29 21:13                   ` Paul Moore
@ 2019-02-01 13:57                     ` Paul Moore
  2019-02-01 14:16                       ` Daniel Jurgens
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Moore @ 2019-02-01 13:57 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: Don Dutile, Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	RDMA mailing list, selinux, Leon Romanovsky

On Tue, Jan 29, 2019 at 4:13 PM Paul Moore <paul@paul-moore.com> wrote:
> On Tue, Jan 29, 2019 at 3:58 PM Daniel Jurgens <danielj@mellanox.com> wrote:
> > On 1/29/2019 2:51 PM, Paul Moore wrote:
> > >
> > > Okay, so let's attempt the change above where we just do the access
> > > check directly.  Although I'm a little concerned that without a
> > > reproducer we might not end up fixing the problem we're trying to fix.
> > > Is anyone in touch with the person who originally reported the
> > > problem?  It would be great if we could get that person to verify the
> > > change ...
> >
> > I decided to go with maintaining a list in IB core. The notifier call is done under rcu_read_lock vs spin_lock_irq for register/unregister, so we shouldn't have any problems in that case. So only registering once basically achieves the same thing as taking it all out. I'm testing it now. I'll send it for internal review today assuming it checks out. Hopefully Leon can get it posted tomorrow, I know Don has some schedule pressure here.
>
> Okay sounds good.  We're still at -rc4 so as long as we can get
> something posted this week, or early next, I see no reason why it
> can't make the upcoming merge window.
>
> I'm guessing Don's schedule pressure is more a RH deadline, and not an
> upstream constraint.

I just wanted to check in and see how this was progressing?  I didn't
see anything in my inbox, but perhaps I missed it ...

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-02-01 13:57                     ` Paul Moore
@ 2019-02-01 14:16                       ` Daniel Jurgens
  2019-02-01 16:09                         ` Paul Moore
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Jurgens @ 2019-02-01 14:16 UTC (permalink / raw)
  To: Paul Moore
  Cc: Don Dutile, Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	RDMA mailing list, selinux, Leon Romanovsky


On 2/1/2019 7:57 AM, Paul Moore wrote:
> On Tue, Jan 29, 2019 at 4:13 PM Paul Moore <paul@paul-moore.com> wrote:
>> On Tue, Jan 29, 2019 at 3:58 PM Daniel Jurgens <danielj@mellanox.com> wrote:
>>> On 1/29/2019 2:51 PM, Paul Moore wrote:
>>>> Okay, so let's attempt the change above where we just do the access
>>>> check directly.  Although I'm a little concerned that without a
>>>> reproducer we might not end up fixing the problem we're trying to fix.
>>>> Is anyone in touch with the person who originally reported the
>>>> problem?  It would be great if we could get that person to verify the
>>>> change ...
>>> I decided to go with maintaining a list in IB core. The notifier call is done under rcu_read_lock vs spin_lock_irq for register/unregister, so we shouldn't have any problems in that case. So only registering once basically achieves the same thing as taking it all out. I'm testing it now. I'll send it for internal review today assuming it checks out. Hopefully Leon can get it posted tomorrow, I know Don has some schedule pressure here.
>> Okay sounds good.  We're still at -rc4 so as long as we can get
>> something posted this week, or early next, I see no reason why it
>> can't make the upcoming merge window.
>>
>> I'm guessing Don's schedule pressure is more a RH deadline, and not an
>> upstream constraint.
> I just wanted to check in and see how this was progressing?  I didn't
> see anything in my inbox, but perhaps I missed it ...
It's passed internal review. Leon should send it soon (Sunday at the soonest, IL has Friday-Saturday weekends), he may be waiting for a regression run to finish.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-01-29 17:30         ` Paul Moore
  2019-01-29 17:48           ` Daniel Jurgens
@ 2019-02-01 14:21           ` Don Dutile
  2019-02-01 16:25             ` Paul Moore
  1 sibling, 1 reply; 19+ messages in thread
From: Don Dutile @ 2019-02-01 14:21 UTC (permalink / raw)
  To: Paul Moore
  Cc: Daniel Jurgens, Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	RDMA mailing list, selinux, Leon Romanovsky

On 01/29/2019 12:30 PM, Paul Moore wrote:
> On Mon, Jan 28, 2019 at 9:57 PM Don Dutile <ddutile@redhat.com> wrote:
>> On 01/28/2019 06:03 PM, Paul Moore wrote:
>>> On Mon, Jan 28, 2019 at 11:57 AM Daniel Jurgens <danielj@mellanox.com> wrote:
>>>> On 1/28/2019 10:37 AM, Paul Moore wrote:
>>>>> On Sun, Jan 27, 2019 at 3:10 AM Leon Romanovsky <leon@kernel.org> wrote:
>>>>>> From: Daniel Jurgens <danielj@mellanox.com>
>>>>>>
>>>>>> ---
>>>>>>    drivers/infiniband/core/security.c | 34 ++++--------------------------
>>>>>>    include/rdma/ib_mad.h              |  3 ---
>>>>>>    2 files changed, 4 insertions(+), 33 deletions(-)
>>>>> Perhaps predictably, I'm not very excited about this change.  Have you
>>>>> looked closer into the slowdown to see where the cycles are being
>>>>> spent?  I'm wondering if the issue is that a large number of notifiers
>>>>> are being registered with the same priority causing the while loop in
>>>>> notifier_chain_register() to take a significant amount of time.
>>>>
>>>> That's what's happening, each MAD agent is registering it's own notifier. The bug reporter was creating hundreds or thousands of  short lived MAD agents. With IRQs disabled too long it resulted in timeouts.
>>>>
>>>> When I initially added the notifier mechanism I thought it was you that said it wasn't really needed, since access wasn't generally revoked in these types of scenarios. Given that I didn't think this would be especially controversial. It was nice to have, unfortunately it causes problems even for users that don't enable SELinux.
>>>
>>> Revoking permission is difficult, and in some cases likely impossible,
>>> but that doesn't mean we shouldn't try to make it possible when we
>>> can.  I'd like to see if we can sort this out before we give up and
>>> rip it out.
>>
>> Agree that it'd be nice to see it work, but since few (no other?) subsys registered
>> with selinux/security is doing this, can we 'release' the IB mad agents from this
>> 'imprisonment' and figure out a general solution that resolves this issue, and
>> then it can be implemented not only for ib mad packet access, but for other
>> subsys using selinux/security, and wanting the same revocation support.
> 
> If you scroll up, ever so slightly to my last reply (it's just a few
> lines, I'll wait), you'll see where I said "I'd like to see if we can
> sort this out before we give up and rip it out."
> 
> I don't recall at the moment why the notifier approach was taken with
> the IB hooks back when Daniel wrote the code, and I'm definitely not
> sure what other subsystems you are using for your comparisons here,
> but I can tell you that I am beyond sick and tired of hearing people
> immediately jump to "rip out this security code" because of some
> performance glitch.  I care about performance too, okay?  We've got a
I didn't advocate for ripping out the security code; I advocated to eliminating
the notifier callback, not setup by many other subsystem selinux consumers,
so the notifier performance issue could be resolved.
I see this as a notifier perf issue, not a security issue.

> bunch of clever people here, let's spend some time and see if we can
> improve what we've got before we make some knee jerk reaction and just
> rip out the access controls.
> 
It's not knee jerk.  I spent hours hacking away at the RHEL implementation
to skip notification registration if selinux is disabled... the short-term 'hack'
for the impacted user(s), since the users weren't enabling selinux. :(
*but*, I'm looking for a long-term, selinux-enabled solution.

>>> I'm trying to quickly understand the MAD agent lifecycle, and it looks
>>> like you have your own register/unregister routines, with locking, so
>>> is it reasonable to assume that it would be possible to iterate over
>>> the MAD agents in the IB code?  I wonder if it would be possible to
>>> group MAD agents (per-port grouping, does that make sense?) such that
>>> several agents would share a single LSM notifier registration?
>>>
>> one can have numerous MAD agents.  it isn't just one.
> 
> Clearly.  Daniel already mentioned hundreds of thousands of MAD agents.
> 
>> Think: you can have any number of apps scanning the fabric for various stats
>>          to draw node/graph pics, gather per-port packet counts, gather bw numbers,
>>          etc.  For IB, scanning the fabric involves 'mad pkts'.
>>          In addition to scanning, one could do 'less then nice things' to switches,
>>          not just read stats but set parameters... like routing table entries.
>>          and thus, the security level addition.
>>         One is highly unlikely to grant access to an app to do mad pkt generation/use,
>>         then change your mind (semi-?)randomly to enable/disable it.... generally always enabled,
>>         or always disabled.
>>
>> Let's take this case as a catalyst to resolve a previously unknown perf issue,
>> and not hold it up to a higher functional requirement then other apps that want to be secured.
> 
> See my comment above.  I don't even know what you are talking about
> regarding "higher functional arguments", what other subsystems?
> Seriously Don, language like this causes an extreme allergic reaction.
> 
You have to read more carefully.  As I stated above, I'm looking to remove
a unique method in the ib-selinux setup, not remove selinux support entirely.
I wasn't advocating for a full-revert.

> Anyway, let's move on to something a bit more constructive, shall we?
> 
Agree.
Looking fwd to Daniel's patches of his proposed work-around... which is
effectively to reduce the notifier use.

>  From what I've gathered from this thread the issue stems from the
> ib_mad_agent_security_setup() where we register the
> ib_mad_agent_security_change() callback with the LSM notification
> mechanism.  Looking at the ib_mad_agent_security_change() function,
> all it seems to do is perform a LSM permission check
> (security_ib_endport_manage_subnet()) and store the result in
> ib_mad_agent->smp_allowed; quickly grep'ing through the code, the only
> place where I see that value used is in ib_mad_enforce_security().
> 
> Looking at ib_mad_enforce_security(), it seems to have access to the
> necessary MAD agent information, via ib_mad_agent_private->agent, to
> perform the LSM permission check directly in this function.  Could we
> not get rid of the MAD agent LSM notification callback and simply do
> the LSM access check directly in ib_mad_enforce_security()?  What am I
> missing?
> 
> I do see that the SELinux implementation of the
> security_ib_endport_manage_subnet() LSM hook doesn't seem to cache the
> IB endport labels.  If that proves to be a performance issue we can
> create a cache for that (like we do for the pkeys and a few other
> objects).
> 
> I barely understand IB, so it is very possible there is a fundamental
> flaw in the logic above, but let's try to work together and figure
> something out.  The above may not be it, but I'm not ready to throw in
> the towel.
> 
Thanks for your time & review.
--dd


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-02-01 14:16                       ` Daniel Jurgens
@ 2019-02-01 16:09                         ` Paul Moore
  2019-02-08 19:58                           ` Don Dutile
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Moore @ 2019-02-01 16:09 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: Don Dutile, Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	RDMA mailing list, selinux, Leon Romanovsky

On Fri, Feb 1, 2019 at 9:16 AM Daniel Jurgens <danielj@mellanox.com> wrote:
> On 2/1/2019 7:57 AM, Paul Moore wrote:
> > On Tue, Jan 29, 2019 at 4:13 PM Paul Moore <paul@paul-moore.com> wrote:
> >> On Tue, Jan 29, 2019 at 3:58 PM Daniel Jurgens <danielj@mellanox.com> wrote:
> >>> On 1/29/2019 2:51 PM, Paul Moore wrote:
> >>>> Okay, so let's attempt the change above where we just do the access
> >>>> check directly.  Although I'm a little concerned that without a
> >>>> reproducer we might not end up fixing the problem we're trying to fix.
> >>>> Is anyone in touch with the person who originally reported the
> >>>> problem?  It would be great if we could get that person to verify the
> >>>> change ...
> >>> I decided to go with maintaining a list in IB core. The notifier call is done under rcu_read_lock vs spin_lock_irq for register/unregister, so we shouldn't have any problems in that case. So only registering once basically achieves the same thing as taking it all out. I'm testing it now. I'll send it for internal review today assuming it checks out. Hopefully Leon can get it posted tomorrow, I know Don has some schedule pressure here.
> >> Okay sounds good.  We're still at -rc4 so as long as we can get
> >> something posted this week, or early next, I see no reason why it
> >> can't make the upcoming merge window.
> >>
> >> I'm guessing Don's schedule pressure is more a RH deadline, and not an
> >> upstream constraint.
> > I just wanted to check in and see how this was progressing?  I didn't
> > see anything in my inbox, but perhaps I missed it ...
> It's passed internal review. Leon should send it soon (Sunday at the soonest, IL has Friday-Saturday weekends), he may be waiting for a regression run to finish.

Great, thanks for the update.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-02-01 14:21           ` Don Dutile
@ 2019-02-01 16:25             ` Paul Moore
  0 siblings, 0 replies; 19+ messages in thread
From: Paul Moore @ 2019-02-01 16:25 UTC (permalink / raw)
  To: Don Dutile
  Cc: Daniel Jurgens, Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	RDMA mailing list, selinux, Leon Romanovsky

On Fri, Feb 1, 2019 at 9:21 AM Don Dutile <ddutile@redhat.com> wrote:
> On 01/29/2019 12:30 PM, Paul Moore wrote:
> > On Mon, Jan 28, 2019 at 9:57 PM Don Dutile <ddutile@redhat.com> wrote:
> >> On 01/28/2019 06:03 PM, Paul Moore wrote:
> >>> On Mon, Jan 28, 2019 at 11:57 AM Daniel Jurgens <danielj@mellanox.com> wrote:
> >>>> On 1/28/2019 10:37 AM, Paul Moore wrote:
> >>>>> On Sun, Jan 27, 2019 at 3:10 AM Leon Romanovsky <leon@kernel.org> wrote:
> >>>>>> From: Daniel Jurgens <danielj@mellanox.com>
> >>>>>>
> >>>>>> ---
> >>>>>>    drivers/infiniband/core/security.c | 34 ++++--------------------------
> >>>>>>    include/rdma/ib_mad.h              |  3 ---
> >>>>>>    2 files changed, 4 insertions(+), 33 deletions(-)
> >>>>> Perhaps predictably, I'm not very excited about this change.  Have you
> >>>>> looked closer into the slowdown to see where the cycles are being
> >>>>> spent?  I'm wondering if the issue is that a large number of notifiers
> >>>>> are being registered with the same priority causing the while loop in
> >>>>> notifier_chain_register() to take a significant amount of time.
> >>>>
> >>>> That's what's happening, each MAD agent is registering it's own notifier. The bug reporter was creating hundreds or thousands of  short lived MAD agents. With IRQs disabled too long it resulted in timeouts.
> >>>>
> >>>> When I initially added the notifier mechanism I thought it was you that said it wasn't really needed, since access wasn't generally revoked in these types of scenarios. Given that I didn't think this would be especially controversial. It was nice to have, unfortunately it causes problems even for users that don't enable SELinux.
> >>>
> >>> Revoking permission is difficult, and in some cases likely impossible,
> >>> but that doesn't mean we shouldn't try to make it possible when we
> >>> can.  I'd like to see if we can sort this out before we give up and
> >>> rip it out.
> >>
> >> Agree that it'd be nice to see it work, but since few (no other?) subsys registered
> >> with selinux/security is doing this, can we 'release' the IB mad agents from this
> >> 'imprisonment' and figure out a general solution that resolves this issue, and
> >> then it can be implemented not only for ib mad packet access, but for other
> >> subsys using selinux/security, and wanting the same revocation support.
> >
> > If you scroll up, ever so slightly to my last reply (it's just a few
> > lines, I'll wait), you'll see where I said "I'd like to see if we can
> > sort this out before we give up and rip it out."
> >
> > I don't recall at the moment why the notifier approach was taken with
> > the IB hooks back when Daniel wrote the code, and I'm definitely not
> > sure what other subsystems you are using for your comparisons here,
> > but I can tell you that I am beyond sick and tired of hearing people
> > immediately jump to "rip out this security code" because of some
> > performance glitch.  I care about performance too, okay?  We've got a
>
> I didn't advocate for ripping out the security code; I advocated to eliminating
> the notifier callback, not setup by many other subsystem selinux consumers,
> so the notifier performance issue could be resolved.
> I see this as a notifier perf issue, not a security issue.

In the original patch Daniel posted, removing the notifier callback
resulted in the removing the security_ib_endport_manage_subnet() call,
which is removing security code.  I described one way to remove the
MAD LSM notifier entirely (basically move the LSM hook down into
ib_mad_enforce_security()) as well as one way to reduce it's usage
(group MAD agents together); it looks like Daniel is taking the latter
approach.  Either option should mitigate the locking issue we're
currently seeing while preserving the access controls of the current
code.

For the record, the IB code has a LSM notifier callback here because
it is caching an access decision (ib_mad_agent.smp_allowed) so the IB
code needs to be notified when the LSM's security policy changes so it
can update it's cached access control decision.  You don't see the
notifier used in other kernel subsystems because they don't cache
access decisions like IB does.  This is why the claim of IB being
unfairly singled out with the LSM notifier is misleading, and not
correct IMHO.

> > bunch of clever people here, let's spend some time and see if we can
> > improve what we've got before we make some knee jerk reaction and just
> > rip out the access controls.
>
> It's not knee jerk.  I spent hours hacking away at the RHEL implementation
> to skip notification registration if selinux is disabled... the short-term 'hack'
> for the impacted user(s), since the users weren't enabling selinux. :(
> *but*, I'm looking for a long-term, selinux-enabled solution.

Removing the LSM hook without making some effort at preserving the
access control made it look like a knee jerk to me, but perhaps I've
seen this stuff too often and it strikes a nerve.  Regardless, it
looks like we have some options now for a long(er) term solution which
should address both our concerns.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-02-01 16:09                         ` Paul Moore
@ 2019-02-08 19:58                           ` Don Dutile
  2019-02-08 20:04                             ` Paul Moore
  0 siblings, 1 reply; 19+ messages in thread
From: Don Dutile @ 2019-02-08 19:58 UTC (permalink / raw)
  To: Paul Moore, Daniel Jurgens
  Cc: Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	RDMA mailing list, selinux, Leon Romanovsky

On 02/01/2019 11:09 AM, Paul Moore wrote:
> On Fri, Feb 1, 2019 at 9:16 AM Daniel Jurgens <danielj@mellanox.com> wrote:
>> On 2/1/2019 7:57 AM, Paul Moore wrote:
>>> On Tue, Jan 29, 2019 at 4:13 PM Paul Moore <paul@paul-moore.com> wrote:
>>>> On Tue, Jan 29, 2019 at 3:58 PM Daniel Jurgens <danielj@mellanox.com> wrote:
>>>>> On 1/29/2019 2:51 PM, Paul Moore wrote:
>>>>>> Okay, so let's attempt the change above where we just do the access
>>>>>> check directly.  Although I'm a little concerned that without a
>>>>>> reproducer we might not end up fixing the problem we're trying to fix.
>>>>>> Is anyone in touch with the person who originally reported the
>>>>>> problem?  It would be great if we could get that person to verify the
>>>>>> change ...
>>>>> I decided to go with maintaining a list in IB core. The notifier call is done under rcu_read_lock vs spin_lock_irq for register/unregister, so we shouldn't have any problems in that case. So only registering once basically achieves the same thing as taking it all out. I'm testing it now. I'll send it for internal review today assuming it checks out. Hopefully Leon can get it posted tomorrow, I know Don has some schedule pressure here.
>>>> Okay sounds good.  We're still at -rc4 so as long as we can get
>>>> something posted this week, or early next, I see no reason why it
>>>> can't make the upcoming merge window.
>>>>
>>>> I'm guessing Don's schedule pressure is more a RH deadline, and not an
>>>> upstream constraint.
>>> I just wanted to check in and see how this was progressing?  I didn't
>>> see anything in my inbox, but perhaps I missed it ...
>> It's passed internal review. Leon should send it soon (Sunday at the soonest, IL has Friday-Saturday weekends), he may be waiting for a regression run to finish.
> 
> Great, thanks for the update.
> 
update? we're approaching the 2nd Sunday after 2/1/2019, when this email was sent ...


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-02-08 19:58                           ` Don Dutile
@ 2019-02-08 20:04                             ` Paul Moore
  2019-02-08 22:29                               ` Don Dutile
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Moore @ 2019-02-08 20:04 UTC (permalink / raw)
  To: Don Dutile
  Cc: Daniel Jurgens, Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	RDMA mailing list, selinux, Leon Romanovsky

On Fri, Feb 8, 2019 at 2:58 PM Don Dutile <ddutile@redhat.com> wrote:
> On 02/01/2019 11:09 AM, Paul Moore wrote:
> > On Fri, Feb 1, 2019 at 9:16 AM Daniel Jurgens <danielj@mellanox.com> wrote:
> >> On 2/1/2019 7:57 AM, Paul Moore wrote:
> >>> On Tue, Jan 29, 2019 at 4:13 PM Paul Moore <paul@paul-moore.com> wrote:
> >>>> On Tue, Jan 29, 2019 at 3:58 PM Daniel Jurgens <danielj@mellanox.com> wrote:
> >>>>> On 1/29/2019 2:51 PM, Paul Moore wrote:
> >>>>>> Okay, so let's attempt the change above where we just do the access
> >>>>>> check directly.  Although I'm a little concerned that without a
> >>>>>> reproducer we might not end up fixing the problem we're trying to fix.
> >>>>>> Is anyone in touch with the person who originally reported the
> >>>>>> problem?  It would be great if we could get that person to verify the
> >>>>>> change ...
> >>>>> I decided to go with maintaining a list in IB core. The notifier call is done under rcu_read_lock vs spin_lock_irq for register/unregister, so we shouldn't have any problems in that case. So only registering once basically achieves the same thing as taking it all out. I'm testing it now. I'll send it for internal review today assuming it checks out. Hopefully Leon can get it posted tomorrow, I know Don has some schedule pressure here.
> >>>> Okay sounds good.  We're still at -rc4 so as long as we can get
> >>>> something posted this week, or early next, I see no reason why it
> >>>> can't make the upcoming merge window.
> >>>>
> >>>> I'm guessing Don's schedule pressure is more a RH deadline, and not an
> >>>> upstream constraint.
> >>> I just wanted to check in and see how this was progressing?  I didn't
> >>> see anything in my inbox, but perhaps I missed it ...
> >> It's passed internal review. Leon should send it soon (Sunday at the soonest, IL has Friday-Saturday weekends), he may be waiting for a regression run to finish.
> >
> > Great, thanks for the update.
>
> update? we're approaching the 2nd Sunday after 2/1/2019, when this email was sent ...

Leon sent out an updated patchset on February 2nd, and I sent a
reply/ack for the SELinux relevant patch a few days later on the 6th.
Archive link below, but it looks like you were CC'd ... ?

* https://lore.kernel.org/selinux/20190202090945.4106-1-leon@kernel.org

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications
  2019-02-08 20:04                             ` Paul Moore
@ 2019-02-08 22:29                               ` Don Dutile
  0 siblings, 0 replies; 19+ messages in thread
From: Don Dutile @ 2019-02-08 22:29 UTC (permalink / raw)
  To: Paul Moore
  Cc: Daniel Jurgens, Leon Romanovsky, Doug Ledford, Jason Gunthorpe,
	RDMA mailing list, selinux, Leon Romanovsky

On 02/08/2019 03:04 PM, Paul Moore wrote:
> On Fri, Feb 8, 2019 at 2:58 PM Don Dutile <ddutile@redhat.com> wrote:
>> On 02/01/2019 11:09 AM, Paul Moore wrote:
>>> On Fri, Feb 1, 2019 at 9:16 AM Daniel Jurgens <danielj@mellanox.com> wrote:
>>>> On 2/1/2019 7:57 AM, Paul Moore wrote:
>>>>> On Tue, Jan 29, 2019 at 4:13 PM Paul Moore <paul@paul-moore.com> wrote:
>>>>>> On Tue, Jan 29, 2019 at 3:58 PM Daniel Jurgens <danielj@mellanox.com> wrote:
>>>>>>> On 1/29/2019 2:51 PM, Paul Moore wrote:
>>>>>>>> Okay, so let's attempt the change above where we just do the access
>>>>>>>> check directly.  Although I'm a little concerned that without a
>>>>>>>> reproducer we might not end up fixing the problem we're trying to fix.
>>>>>>>> Is anyone in touch with the person who originally reported the
>>>>>>>> problem?  It would be great if we could get that person to verify the
>>>>>>>> change ...
>>>>>>> I decided to go with maintaining a list in IB core. The notifier call is done under rcu_read_lock vs spin_lock_irq for register/unregister, so we shouldn't have any problems in that case. So only registering once basically achieves the same thing as taking it all out. I'm testing it now. I'll send it for internal review today assuming it checks out. Hopefully Leon can get it posted tomorrow, I know Don has some schedule pressure here.
>>>>>> Okay sounds good.  We're still at -rc4 so as long as we can get
>>>>>> something posted this week, or early next, I see no reason why it
>>>>>> can't make the upcoming merge window.
>>>>>>
>>>>>> I'm guessing Don's schedule pressure is more a RH deadline, and not an
>>>>>> upstream constraint.
>>>>> I just wanted to check in and see how this was progressing?  I didn't
>>>>> see anything in my inbox, but perhaps I missed it ...
>>>> It's passed internal review. Leon should send it soon (Sunday at the soonest, IL has Friday-Saturday weekends), he may be waiting for a regression run to finish.
>>>
>>> Great, thanks for the update.
>>
>> update? we're approaching the 2nd Sunday after 2/1/2019, when this email was sent ...
> 
> Leon sent out an updated patchset on February 2nd, and I sent a
> reply/ack for the SELinux relevant patch a few days later on the 6th.
> Archive link below, but it looks like you were CC'd ... ?
> 
> * https://lore.kernel.org/selinux/20190202090945.4106-1-leon@kernel.org
> 
Thanks for ptr.  I was looking for a posting from Daniel; forgot Leon was doing it.
I should have search for email from you! ;-)

Finally found it... lost in a several-thousand email stampede here. :-p
I thought I had seen something... and I viewed it while dealing with a massive head cold...
lousy meds! :-p

I guess the maintainers haven't pulled it into an rc branch yet, though,
since a pull/merge doesn't show it up in (rdma-linux's) for-rc or for-next
(or any other jgg or dl-for-* branch I dug around in).
/me needs c-p-able commit-id(s), so it's the '2nd Sunday' for me...



^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2019-02-08 22:29 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-27  8:10 [PATCH rdma-next] IB/core: Don't register MAD agents for LSM notifications Leon Romanovsky
2019-01-28 16:37 ` Paul Moore
2019-01-28 16:57   ` Daniel Jurgens
2019-01-28 23:03     ` Paul Moore
2019-01-29  2:57       ` Don Dutile
2019-01-29 17:30         ` Paul Moore
2019-01-29 17:48           ` Daniel Jurgens
2019-01-29 18:02             ` Daniel Jurgens
2019-01-29 20:51               ` Paul Moore
2019-01-29 20:58                 ` Daniel Jurgens
2019-01-29 21:13                   ` Paul Moore
2019-02-01 13:57                     ` Paul Moore
2019-02-01 14:16                       ` Daniel Jurgens
2019-02-01 16:09                         ` Paul Moore
2019-02-08 19:58                           ` Don Dutile
2019-02-08 20:04                             ` Paul Moore
2019-02-08 22:29                               ` Don Dutile
2019-02-01 14:21           ` Don Dutile
2019-02-01 16:25             ` Paul Moore

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).