All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/irq: Skip unmap_domain_pirq XSM during destruction
@ 2022-03-30 18:17 Jason Andryuk
  2022-03-30 18:27 ` Daniel P. Smith
  2022-04-05  8:18 ` Jan Beulich
  0 siblings, 2 replies; 5+ messages in thread
From: Jason Andryuk @ 2022-03-30 18:17 UTC (permalink / raw)
  To: xen-devel
  Cc: Jason Andryuk, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, Daniel P. Smith

xsm_unmap_domain_irq was seen denying unmap_domain_pirq when called from
complete_domain_destroy as an RCU callback.  The source context was an
unexpected, random domain.  Since this is a xen-internal operation,
we don't want the XSM hook denying the operation.

Check d->is_dying and skip the check when the domain is dead.  The RCU
callback runs when a domain is in that state.

Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
---
Dan wants to change current to point at DOMID_IDLE when the RCU callback
runs.  I think Juergen's commit 53594c7bd197 "rcu: don't use
stop_machine_run() for rcu_barrier()" may have changed this since it
mentions stop_machine_run scheduled the idle vcpus to run the callbacks
for the old code.

Would that be as easy as changing rcu_do_batch() to do:

+        /* Run as "Xen" not a random domain's vcpu. */
+        vcpu = get_current();
+        set_current(idle_vcpu[smp_processor_id()]);
         list->func(list);
+        set_current(vcpu);

or is using set_current() only acceptable as part of context_switch?

 xen/arch/x86/irq.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 285ac399fb..16488d287c 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2340,10 +2340,14 @@ int unmap_domain_pirq(struct domain *d, int pirq)
         nr = msi_desc->msi.nvec;
     }
 
-    ret = xsm_unmap_domain_irq(XSM_HOOK, d, irq,
-                               msi_desc ? msi_desc->dev : NULL);
-    if ( ret )
-        goto done;
+    /* When called by complete_domain_destroy via RCU, current is a random
+     * domain.  Skip the XSM check since this is a Xen-initiated action. */
+    if ( d->is_dying != DOMDYING_dead ) {
+        ret = xsm_unmap_domain_irq(XSM_HOOK, d, irq,
+                                   msi_desc ? msi_desc->dev : NULL);
+        if ( ret )
+            goto done;
+    }
 
     forced_unbind = pirq_guest_force_unbind(d, info);
     if ( forced_unbind )
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86/irq: Skip unmap_domain_pirq XSM during destruction
  2022-03-30 18:17 [PATCH] x86/irq: Skip unmap_domain_pirq XSM during destruction Jason Andryuk
@ 2022-03-30 18:27 ` Daniel P. Smith
  2022-04-05  8:18 ` Jan Beulich
  1 sibling, 0 replies; 5+ messages in thread
From: Daniel P. Smith @ 2022-03-30 18:27 UTC (permalink / raw)
  To: Jason Andryuk, xen-devel
  Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

On 3/30/22 14:17, Jason Andryuk wrote:
> xsm_unmap_domain_irq was seen denying unmap_domain_pirq when called from
> complete_domain_destroy as an RCU callback.  The source context was an
> unexpected, random domain.  Since this is a xen-internal operation,
> we don't want the XSM hook denying the operation.
> 
> Check d->is_dying and skip the check when the domain is dead.  The RCU
> callback runs when a domain is in that state.
> 
> Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
> Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
> ---
> Dan wants to change current to point at DOMID_IDLE when the RCU callback
> runs.  I think Juergen's commit 53594c7bd197 "rcu: don't use
> stop_machine_run() for rcu_barrier()" may have changed this since it
> mentions stop_machine_run scheduled the idle vcpus to run the callbacks
> for the old code.
> 
> Would that be as easy as changing rcu_do_batch() to do:
> 
> +        /* Run as "Xen" not a random domain's vcpu. */
> +        vcpu = get_current();
> +        set_current(idle_vcpu[smp_processor_id()]);
>          list->func(list);
> +        set_current(vcpu);
> 
> or is using set_current() only acceptable as part of context_switch?
> 
>  xen/arch/x86/irq.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> index 285ac399fb..16488d287c 100644
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -2340,10 +2340,14 @@ int unmap_domain_pirq(struct domain *d, int pirq)
>          nr = msi_desc->msi.nvec;
>      }
>  
> -    ret = xsm_unmap_domain_irq(XSM_HOOK, d, irq,
> -                               msi_desc ? msi_desc->dev : NULL);
> -    if ( ret )
> -        goto done;
> +    /* When called by complete_domain_destroy via RCU, current is a random
> +     * domain.  Skip the XSM check since this is a Xen-initiated action. */
> +    if ( d->is_dying != DOMDYING_dead ) {
> +        ret = xsm_unmap_domain_irq(XSM_HOOK, d, irq,
> +                                   msi_desc ? msi_desc->dev : NULL);
> +        if ( ret )
> +            goto done;
> +    }
>  
>      forced_unbind = pirq_guest_force_unbind(d, info);
>      if ( forced_unbind )

I think this is sufficient though IMHO it would make it stronger if
current accurately reflected the idle domain and the condition was added
to the if statement check this fact.

Reviewed-by: Daniel P. Smith <dpsmith@apertussolutions.com>


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86/irq: Skip unmap_domain_pirq XSM during destruction
  2022-03-30 18:17 [PATCH] x86/irq: Skip unmap_domain_pirq XSM during destruction Jason Andryuk
  2022-03-30 18:27 ` Daniel P. Smith
@ 2022-04-05  8:18 ` Jan Beulich
  2022-04-05 13:13   ` Daniel P. Smith
  2022-04-05 14:56   ` Jason Andryuk
  1 sibling, 2 replies; 5+ messages in thread
From: Jan Beulich @ 2022-04-05  8:18 UTC (permalink / raw)
  To: Jason Andryuk, Daniel P. Smith
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 30.03.2022 20:17, Jason Andryuk wrote:
> xsm_unmap_domain_irq was seen denying unmap_domain_pirq when called from
> complete_domain_destroy as an RCU callback.  The source context was an
> unexpected, random domain.  Since this is a xen-internal operation,
> we don't want the XSM hook denying the operation.
> 
> Check d->is_dying and skip the check when the domain is dead.  The RCU
> callback runs when a domain is in that state.

One question which has always been puzzling me (perhaps to Daniel): While
I can see why mapping of an IRQ needs to be subject to an XSM check, it's
not really clear to me why unmapping would need to be, at least as long
as it's the domain itself which requests the unmap (and which I would
view to extend to the domain being cleaned up). But maybe that's why it's
XSM_HOOK ...

> ---
> Dan wants to change current to point at DOMID_IDLE when the RCU callback
> runs.  I think Juergen's commit 53594c7bd197 "rcu: don't use
> stop_machine_run() for rcu_barrier()" may have changed this since it
> mentions stop_machine_run scheduled the idle vcpus to run the callbacks
> for the old code.
> 
> Would that be as easy as changing rcu_do_batch() to do:
> 
> +        /* Run as "Xen" not a random domain's vcpu. */
> +        vcpu = get_current();
> +        set_current(idle_vcpu[smp_processor_id()]);
>          list->func(list);
> +        set_current(vcpu);
> 
> or is using set_current() only acceptable as part of context_switch?

Indeed I would question any uses outside of context_switch() (and
system bringup).

> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -2340,10 +2340,14 @@ int unmap_domain_pirq(struct domain *d, int pirq)
>          nr = msi_desc->msi.nvec;
>      }
>  
> -    ret = xsm_unmap_domain_irq(XSM_HOOK, d, irq,
> -                               msi_desc ? msi_desc->dev : NULL);
> -    if ( ret )
> -        goto done;
> +    /* When called by complete_domain_destroy via RCU, current is a random
> +     * domain.  Skip the XSM check since this is a Xen-initiated action. */

Comment style.

> +    if ( d->is_dying != DOMDYING_dead ) {

Please use !d->is_dying. Also please correct the placement of the brace.
Or you could avoid the need for a brace by leveraging that ret is zero
ahead of this if(), i.e. ...

> +        ret = xsm_unmap_domain_irq(XSM_HOOK, d, irq,
> +                                   msi_desc ? msi_desc->dev : NULL);
> +        if ( ret )
> +            goto done;
> +    }


    if ( !d->is_dying )
        ret = xsm_unmap_domain_irq(XSM_HOOK, d, irq,
                                   msi_desc ? msi_desc->dev : NULL);
    if ( ret )
        goto done;

Jan



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86/irq: Skip unmap_domain_pirq XSM during destruction
  2022-04-05  8:18 ` Jan Beulich
@ 2022-04-05 13:13   ` Daniel P. Smith
  2022-04-05 14:56   ` Jason Andryuk
  1 sibling, 0 replies; 5+ messages in thread
From: Daniel P. Smith @ 2022-04-05 13:13 UTC (permalink / raw)
  To: Jan Beulich, Jason Andryuk
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 4/5/22 04:18, Jan Beulich wrote:
> On 30.03.2022 20:17, Jason Andryuk wrote:
>> xsm_unmap_domain_irq was seen denying unmap_domain_pirq when called from
>> complete_domain_destroy as an RCU callback.  The source context was an
>> unexpected, random domain.  Since this is a xen-internal operation,
>> we don't want the XSM hook denying the operation.
>>
>> Check d->is_dying and skip the check when the domain is dead.  The RCU
>> callback runs when a domain is in that state.
> 
> One question which has always been puzzling me (perhaps to Daniel): While
> I can see why mapping of an IRQ needs to be subject to an XSM check, it's
> not really clear to me why unmapping would need to be, at least as long
> as it's the domain itself which requests the unmap (and which I would
> view to extend to the domain being cleaned up). But maybe that's why it's
> XSM_HOOK ...

There are situations for instance where there is a flask-based system
with one or more domains (v-platform-mgr) that are each responsible for
the management of a subset of domains and are responsible for
hotplugging in and out a device, i.e. granting the privilege to a
v-platform-mgr to call PHYSDEVOP_map_pirq/PHYSDEVOP_unmap_pirq, for the
domains each one is managing.

>> ---
>> Dan wants to change current to point at DOMID_IDLE when the RCU callback
>> runs.  I think Juergen's commit 53594c7bd197 "rcu: don't use
>> stop_machine_run() for rcu_barrier()" may have changed this since it
>> mentions stop_machine_run scheduled the idle vcpus to run the callbacks
>> for the old code.
>>
>> Would that be as easy as changing rcu_do_batch() to do:
>>
>> +        /* Run as "Xen" not a random domain's vcpu. */
>> +        vcpu = get_current();
>> +        set_current(idle_vcpu[smp_processor_id()]);
>>          list->func(list);
>> +        set_current(vcpu);
>>
>> or is using set_current() only acceptable as part of context_switch?
> 
> Indeed I would question any uses outside of context_switch() (and
> system bringup).

I am not familiar with the details of the scheduler, but from a higher
level, conceptual perspective, I do not understand why an idle domain
task is being executed without an explicit context switch to the idle
domain to ensure the current world view is consistent with the task
execution scope. Just seems to me like this is creating a situation
where things have the potential to go sideways/wrong.

v/r,
dps


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86/irq: Skip unmap_domain_pirq XSM during destruction
  2022-04-05  8:18 ` Jan Beulich
  2022-04-05 13:13   ` Daniel P. Smith
@ 2022-04-05 14:56   ` Jason Andryuk
  1 sibling, 0 replies; 5+ messages in thread
From: Jason Andryuk @ 2022-04-05 14:56 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Daniel P. Smith, Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On Tue, Apr 5, 2022 at 4:18 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 30.03.2022 20:17, Jason Andryuk wrote:
> > xsm_unmap_domain_irq was seen denying unmap_domain_pirq when called from
> > complete_domain_destroy as an RCU callback.  The source context was an
> > unexpected, random domain.  Since this is a xen-internal operation,
> > we don't want the XSM hook denying the operation.
> >
> > Check d->is_dying and skip the check when the domain is dead.  The RCU
> > callback runs when a domain is in that state.
>
> One question which has always been puzzling me (perhaps to Daniel): While
> I can see why mapping of an IRQ needs to be subject to an XSM check, it's
> not really clear to me why unmapping would need to be, at least as long
> as it's the domain itself which requests the unmap (and which I would
> view to extend to the domain being cleaned up). But maybe that's why it's
> XSM_HOOK ...
>
> > ---
> > Dan wants to change current to point at DOMID_IDLE when the RCU callback
> > runs.  I think Juergen's commit 53594c7bd197 "rcu: don't use
> > stop_machine_run() for rcu_barrier()" may have changed this since it
> > mentions stop_machine_run scheduled the idle vcpus to run the callbacks
> > for the old code.
> >
> > Would that be as easy as changing rcu_do_batch() to do:
> >
> > +        /* Run as "Xen" not a random domain's vcpu. */
> > +        vcpu = get_current();
> > +        set_current(idle_vcpu[smp_processor_id()]);
> >          list->func(list);
> > +        set_current(vcpu);
> >
> > or is using set_current() only acceptable as part of context_switch?
>
> Indeed I would question any uses outside of context_switch() (and
> system bringup).
>
> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -2340,10 +2340,14 @@ int unmap_domain_pirq(struct domain *d, int pirq)
> >          nr = msi_desc->msi.nvec;
> >      }
> >
> > -    ret = xsm_unmap_domain_irq(XSM_HOOK, d, irq,
> > -                               msi_desc ? msi_desc->dev : NULL);
> > -    if ( ret )
> > -        goto done;
> > +    /* When called by complete_domain_destroy via RCU, current is a random
> > +     * domain.  Skip the XSM check since this is a Xen-initiated action. */
>
> Comment style.

Yes.  Sorry about that.

> > +    if ( d->is_dying != DOMDYING_dead ) {
>
> Please use !d->is_dying. Also please correct the placement of the brace.
> Or you could avoid the need for a brace by leveraging that ret is zero
> ahead of this if(), i.e. ...

Here I was patting myself on the back for remembering the spaces
inside the parens, and I screwed up the brace...  Sorry.

I intentionally chose DOMDYING_dead because, from my reading of the
code, complete_domain_destroy should only reach here when dead (and
not dying).  If this function is reached when DOMDYING_dying, then
that is unexpected.  That would be a guest-initiated action and
therefore the XSM check should apply.

Just checking is_dying is fine, but I want to explain and highlight this aspect.

> > +        ret = xsm_unmap_domain_irq(XSM_HOOK, d, irq,
> > +                                   msi_desc ? msi_desc->dev : NULL);
> > +        if ( ret )
> > +            goto done;
> > +    }
>
>
>     if ( !d->is_dying )
>         ret = xsm_unmap_domain_irq(XSM_HOOK, d, irq,
>                                    msi_desc ? msi_desc->dev : NULL);
>     if ( ret )
>         goto done;

I'm planning to just do it this way.

Thank you for reviewing.

-Jason


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-04-05 14:56 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-30 18:17 [PATCH] x86/irq: Skip unmap_domain_pirq XSM during destruction Jason Andryuk
2022-03-30 18:27 ` Daniel P. Smith
2022-04-05  8:18 ` Jan Beulich
2022-04-05 13:13   ` Daniel P. Smith
2022-04-05 14:56   ` Jason Andryuk

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.