* [Xen-devel] [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
@ 2020-03-06 16:02 paul
2020-03-09 16:29 ` Jan Beulich
0 siblings, 1 reply; 14+ messages in thread
From: paul @ 2020-03-06 16:02 UTC (permalink / raw)
To: xen-devel
Cc: Julien Grall, Paul Durrant, Andrew Cooper, Varad Gautam,
Jan Beulich, Roger Pau Monné
From: Varad Gautam <vrd@amazon.de>
XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTS.
In that scenario, it is possible to receive multiple __pirq_guest_unbind
calls for the same pirq from domain_kill, if the pirq has not yet been
removed from the domain's pirq_tree, as:
domain_kill()
-> domain_relinquish_resources()
-> pci_release_devices()
-> pci_clean_dpci_irq()
-> pirq_guest_unbind()
-> __pirq_guest_unbind()
For a shared pirq (nr_guests > 1), the first call would zap the current
domain from the pirq's guests[] list, but the action handler is never freed
as there are other guests using this pirq. As a result, on the second call,
__pirq_guest_unbind searches for the current domain which has been removed
from the guests[] list, and hits a BUG_ON.
Make __pirq_guest_unbind safe to be called multiple times by letting xen
continue if a shared pirq has already been unbound from this guest. The
PIRQ will be cleaned up from the domain's pirq_tree during the destruction
in complete_domain_destroy anyway.
Signed-off-by: Varad Gautam <vrd@amazon.de>
[taking over from Varad at v4]
Signed-off-by: Paul Durrant <paul@xen.org>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Roger suggested cleaning the entry from the domain pirq_tree so that
we need not make it safe to re-call __pirq_guest_unbind(). This seems like
a reasonable suggestion but the semantics of the code are almost
impenetrable (e.g. 'pirq' is used to mean an index, a pointer and is also
the name of struct so you generally have little idea what it actally means)
so I prefer to stick with a small fix that I can actually reason about.
v4:
- Re-work the guest array search to make it clearer
v3:
- Style fixups
v2:
- Split the check on action->nr_guests > 0 and make it an ASSERT
---
xen/arch/x86/irq.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index cc2eb8e925..32fcb624dc 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1680,9 +1680,23 @@ static irq_guest_action_t *__pirq_guest_unbind(
BUG_ON(!(desc->status & IRQ_GUEST));
- for ( i = 0; (i < action->nr_guests) && (action->guest[i] != d); i++ )
- continue;
- BUG_ON(i == action->nr_guests);
+ for ( i = 0; i < action->nr_guests; i++ )
+ if ( action->guest[i] == d )
+ break;
+
+ if ( i == action->nr_guests ) /* No matching entry */
+ {
+ /*
+ * In case the pirq was shared, unbound for this domain in an earlier
+ * call, but still existed on the domain's pirq_tree, we still reach
+ * here if there are any later unbind calls on the same pirq. Return
+ * if such an unbind happens.
+ */
+ ASSERT(action->shareable);
+ return NULL;
+ }
+
+ ASSERT(action->nr_guests > 0);
memmove(&action->guest[i], &action->guest[i+1],
(action->nr_guests-i-1) * sizeof(action->guest[0]));
action->nr_guests--;
--
2.20.1
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [Xen-devel] [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
2020-03-06 16:02 [Xen-devel] [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs paul
@ 2020-03-09 16:29 ` Jan Beulich
2020-03-09 17:47 ` Paul Durrant
0 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2020-03-09 16:29 UTC (permalink / raw)
To: paul
Cc: xen-devel, Varad Gautam, Andrew Cooper, Julien Grall,
Roger Pau Monné
On 06.03.2020 17:02, paul@xen.org wrote:
> From: Varad Gautam <vrd@amazon.de>
>
> XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTS.
> In that scenario, it is possible to receive multiple __pirq_guest_unbind
> calls for the same pirq from domain_kill, if the pirq has not yet been
> removed from the domain's pirq_tree, as:
> domain_kill()
> -> domain_relinquish_resources()
> -> pci_release_devices()
> -> pci_clean_dpci_irq()
> -> pirq_guest_unbind()
> -> __pirq_guest_unbind()
>
> For a shared pirq (nr_guests > 1), the first call would zap the current
> domain from the pirq's guests[] list, but the action handler is never freed
> as there are other guests using this pirq. As a result, on the second call,
> __pirq_guest_unbind searches for the current domain which has been removed
> from the guests[] list, and hits a BUG_ON.
>
> Make __pirq_guest_unbind safe to be called multiple times by letting xen
> continue if a shared pirq has already been unbound from this guest. The
> PIRQ will be cleaned up from the domain's pirq_tree during the destruction
> in complete_domain_destroy anyway.
>
> Signed-off-by: Varad Gautam <vrd@amazon.de>
> [taking over from Varad at v4]
> Signed-off-by: Paul Durrant <paul@xen.org>
> ---
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Julien Grall <julien@xen.org>
> Cc: Roger Pau Monné <roger.pau@citrix.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>
> Roger suggested cleaning the entry from the domain pirq_tree so that
> we need not make it safe to re-call __pirq_guest_unbind(). This seems like
> a reasonable suggestion but the semantics of the code are almost
> impenetrable (e.g. 'pirq' is used to mean an index, a pointer and is also
> the name of struct so you generally have little idea what it actally means)
> so I prefer to stick with a small fix that I can actually reason about.
>
> v4:
> - Re-work the guest array search to make it clearer
I.e. there are cosmetic differences to v3 (see below), but
technically it's still the same. I can't believe the re-use
of "pirq" for different entities is this big of a problem.
But anyway:
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1680,9 +1680,23 @@ static irq_guest_action_t *__pirq_guest_unbind(
>
> BUG_ON(!(desc->status & IRQ_GUEST));
>
> - for ( i = 0; (i < action->nr_guests) && (action->guest[i] != d); i++ )
> - continue;
> - BUG_ON(i == action->nr_guests);
> + for ( i = 0; i < action->nr_guests; i++ )
> + if ( action->guest[i] == d )
> + break;
> +
> + if ( i == action->nr_guests ) /* No matching entry */
> + {
> + /*
> + * In case the pirq was shared, unbound for this domain in an earlier
> + * call, but still existed on the domain's pirq_tree, we still reach
> + * here if there are any later unbind calls on the same pirq. Return
> + * if such an unbind happens.
> + */
> + ASSERT(action->shareable);
> + return NULL;
> + }
> +
> + ASSERT(action->nr_guests > 0);
This seems pointless to have here - v3 had it inside the if(),
where it would actually guard against coming here with nr_guests
equal to zero. v3 also used if() and BUG() instead of ASSERT()
inside this if(), which to me would seem more in line with our
current ./CODING_STYLE guidelines of handling unexpected
conditions. Could you clarify why you switched things?
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Xen-devel] [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
2020-03-09 16:29 ` Jan Beulich
@ 2020-03-09 17:47 ` Paul Durrant
2020-03-10 11:23 ` Jan Beulich
2020-03-10 14:19 ` Jan Beulich
0 siblings, 2 replies; 14+ messages in thread
From: Paul Durrant @ 2020-03-09 17:47 UTC (permalink / raw)
To: 'Jan Beulich'
Cc: xen-devel, 'Varad Gautam', 'Andrew Cooper',
'Julien Grall', 'Roger Pau Monné'
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 09 March 2020 16:29
> To: paul@xen.org
> Cc: xen-devel@lists.xenproject.org; Varad Gautam <vrd@amazon.de>; Julien Grall <julien@xen.org>; Roger
> Pau Monné <roger.pau@citrix.com>; Andrew Cooper <andrew.cooper3@citrix.com>
> Subject: Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
>
> On 06.03.2020 17:02, paul@xen.org wrote:
> > From: Varad Gautam <vrd@amazon.de>
> >
> > XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTS.
> > In that scenario, it is possible to receive multiple __pirq_guest_unbind
> > calls for the same pirq from domain_kill, if the pirq has not yet been
> > removed from the domain's pirq_tree, as:
> > domain_kill()
> > -> domain_relinquish_resources()
> > -> pci_release_devices()
> > -> pci_clean_dpci_irq()
> > -> pirq_guest_unbind()
> > -> __pirq_guest_unbind()
> >
> > For a shared pirq (nr_guests > 1), the first call would zap the current
> > domain from the pirq's guests[] list, but the action handler is never freed
> > as there are other guests using this pirq. As a result, on the second call,
> > __pirq_guest_unbind searches for the current domain which has been removed
> > from the guests[] list, and hits a BUG_ON.
> >
> > Make __pirq_guest_unbind safe to be called multiple times by letting xen
> > continue if a shared pirq has already been unbound from this guest. The
> > PIRQ will be cleaned up from the domain's pirq_tree during the destruction
> > in complete_domain_destroy anyway.
> >
> > Signed-off-by: Varad Gautam <vrd@amazon.de>
> > [taking over from Varad at v4]
> > Signed-off-by: Paul Durrant <paul@xen.org>
> > ---
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: Julien Grall <julien@xen.org>
> > Cc: Roger Pau Monné <roger.pau@citrix.com>
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> >
> > Roger suggested cleaning the entry from the domain pirq_tree so that
> > we need not make it safe to re-call __pirq_guest_unbind(). This seems like
> > a reasonable suggestion but the semantics of the code are almost
> > impenetrable (e.g. 'pirq' is used to mean an index, a pointer and is also
> > the name of struct so you generally have little idea what it actally means)
> > so I prefer to stick with a small fix that I can actually reason about.
> >
> > v4:
> > - Re-work the guest array search to make it clearer
>
> I.e. there are cosmetic differences to v3 (see below), but
> technically it's still the same. I can't believe the re-use
> of "pirq" for different entities is this big of a problem.
Please suggest code if you think it ought to be done differentely. I tried.
> But anyway:
>
> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -1680,9 +1680,23 @@ static irq_guest_action_t *__pirq_guest_unbind(
> >
> > BUG_ON(!(desc->status & IRQ_GUEST));
> >
> > - for ( i = 0; (i < action->nr_guests) && (action->guest[i] != d); i++ )
> > - continue;
> > - BUG_ON(i == action->nr_guests);
> > + for ( i = 0; i < action->nr_guests; i++ )
> > + if ( action->guest[i] == d )
> > + break;
> > +
> > + if ( i == action->nr_guests ) /* No matching entry */
> > + {
> > + /*
> > + * In case the pirq was shared, unbound for this domain in an earlier
> > + * call, but still existed on the domain's pirq_tree, we still reach
> > + * here if there are any later unbind calls on the same pirq. Return
> > + * if such an unbind happens.
> > + */
> > + ASSERT(action->shareable);
> > + return NULL;
> > + }
> > +
> > + ASSERT(action->nr_guests > 0);
>
> This seems pointless to have here - v3 had it inside the if(),
> where it would actually guard against coming here with nr_guests
> equal to zero.
Why. The code just after this decrements nr_guests so it seems like entirely the right point to have the ASSERT. I can make it ASSERT >= 0, if that makes more sense.
> v3 also used if() and BUG() instead of ASSERT()
> inside this if(), which to me would seem more in line with our
> current ./CODING_STYLE guidelines of handling unexpected
> conditions. Could you clarify why you switched things?
>
Because I don't think there is need to kill the host in a non-debug context if we hit this condition; it is entirely survivable as far as I can tell so a BUG_ON() did not seem appropriate.
Paul
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Xen-devel] [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
2020-03-09 17:47 ` Paul Durrant
@ 2020-03-10 11:23 ` Jan Beulich
2020-03-10 12:36 ` Paul Durrant
2020-03-10 14:19 ` Jan Beulich
1 sibling, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2020-03-10 11:23 UTC (permalink / raw)
To: paul
Cc: xen-devel, 'Varad Gautam', 'Andrew Cooper',
'Julien Grall', 'Roger Pau Monné'
On 09.03.2020 18:47, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 09 March 2020 16:29
>>
>> On 06.03.2020 17:02, paul@xen.org wrote:
>>> --- a/xen/arch/x86/irq.c
>>> +++ b/xen/arch/x86/irq.c
>>> @@ -1680,9 +1680,23 @@ static irq_guest_action_t *__pirq_guest_unbind(
>>>
>>> BUG_ON(!(desc->status & IRQ_GUEST));
>>>
>>> - for ( i = 0; (i < action->nr_guests) && (action->guest[i] != d); i++ )
>>> - continue;
>>> - BUG_ON(i == action->nr_guests);
>>> + for ( i = 0; i < action->nr_guests; i++ )
>>> + if ( action->guest[i] == d )
>>> + break;
>>> +
>>> + if ( i == action->nr_guests ) /* No matching entry */
>>> + {
>>> + /*
>>> + * In case the pirq was shared, unbound for this domain in an earlier
>>> + * call, but still existed on the domain's pirq_tree, we still reach
>>> + * here if there are any later unbind calls on the same pirq. Return
>>> + * if such an unbind happens.
>>> + */
>>> + ASSERT(action->shareable);
>>> + return NULL;
>>> + }
>>> +
>>> + ASSERT(action->nr_guests > 0);
>>
>> This seems pointless to have here - v3 had it inside the if(),
>> where it would actually guard against coming here with nr_guests
>> equal to zero.
>
> Why. The code just after this decrements nr_guests so it seems
> like entirely the right point to have the ASSERT. I can make it
> ASSERT >= 0, if that makes more sense.
There's no way to come here when nr_guests == 0. This is because
in this case the loop will be exited with i being zero, and hence
the earlier if()'s body will be entered.
(And no, >= 0 wouldn't make sense to me - it would mean we might
have a count of -1 after the decrement.)
>> v3 also used if() and BUG() instead of ASSERT()
>> inside this if(), which to me would seem more in line with our
>> current ./CODING_STYLE guidelines of handling unexpected
>> conditions. Could you clarify why you switched things?
>>
>
> Because I don't think there is need to kill the host in a
> non-debug context if we hit this condition; it is entirely
> survivable as far as I can tell so a BUG_ON() did not seem
> appropriate.
It'll mean we have a non-sharable IRQ in a place where this is
not supposed to happen. How can we be sure the system is in a
state allowing to safely continue? To me, if shareable / non-
shareable is of any concern here, then it ought to be BUG().
If it's not, then the ASSERT() ought to be dropped as well.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Xen-devel] [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
2020-03-10 11:23 ` Jan Beulich
@ 2020-03-10 12:36 ` Paul Durrant
2020-03-10 13:38 ` Jan Beulich
0 siblings, 1 reply; 14+ messages in thread
From: Paul Durrant @ 2020-03-10 12:36 UTC (permalink / raw)
To: 'Jan Beulich'
Cc: xen-devel, 'Varad Gautam', 'Andrew Cooper',
'Julien Grall', 'Roger Pau Monné'
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 10 March 2020 11:23
> To: paul@xen.org
> Cc: xen-devel@lists.xenproject.org; 'Varad Gautam' <vrd@amazon.de>; 'Julien Grall' <julien@xen.org>;
> 'Roger Pau Monné' <roger.pau@citrix.com>; 'Andrew Cooper' <andrew.cooper3@citrix.com>
> Subject: Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
>
> On 09.03.2020 18:47, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 09 March 2020 16:29
> >>
> >> On 06.03.2020 17:02, paul@xen.org wrote:
> >>> --- a/xen/arch/x86/irq.c
> >>> +++ b/xen/arch/x86/irq.c
> >>> @@ -1680,9 +1680,23 @@ static irq_guest_action_t *__pirq_guest_unbind(
> >>>
> >>> BUG_ON(!(desc->status & IRQ_GUEST));
> >>>
> >>> - for ( i = 0; (i < action->nr_guests) && (action->guest[i] != d); i++ )
> >>> - continue;
> >>> - BUG_ON(i == action->nr_guests);
> >>> + for ( i = 0; i < action->nr_guests; i++ )
> >>> + if ( action->guest[i] == d )
> >>> + break;
> >>> +
> >>> + if ( i == action->nr_guests ) /* No matching entry */
> >>> + {
> >>> + /*
> >>> + * In case the pirq was shared, unbound for this domain in an earlier
> >>> + * call, but still existed on the domain's pirq_tree, we still reach
> >>> + * here if there are any later unbind calls on the same pirq. Return
> >>> + * if such an unbind happens.
> >>> + */
> >>> + ASSERT(action->shareable);
> >>> + return NULL;
> >>> + }
> >>> +
> >>> + ASSERT(action->nr_guests > 0);
> >>
> >> This seems pointless to have here - v3 had it inside the if(),
> >> where it would actually guard against coming here with nr_guests
> >> equal to zero.
> >
> > Why. The code just after this decrements nr_guests so it seems
> > like entirely the right point to have the ASSERT. I can make it
> > ASSERT >= 0, if that makes more sense.
>
> There's no way to come here when nr_guests == 0. This is because
> in this case the loop will be exited with i being zero, and hence
> the earlier if()'s body will be entered.
Ok, yes, that's true.
>
> (And no, >= 0 wouldn't make sense to me - it would mean we might
> have a count of -1 after the decrement.)
>
> >> v3 also used if() and BUG() instead of ASSERT()
> >> inside this if(), which to me would seem more in line with our
> >> current ./CODING_STYLE guidelines of handling unexpected
> >> conditions. Could you clarify why you switched things?
> >>
> >
> > Because I don't think there is need to kill the host in a
> > non-debug context if we hit this condition; it is entirely
> > survivable as far as I can tell so a BUG_ON() did not seem
> > appropriate.
>
> It'll mean we have a non-sharable IRQ in a place where this is
> not supposed to happen. How can we be sure the system is in a
> state allowing to safely continue? To me, if shareable / non-
> shareable is of any concern here, then it ought to be BUG().
> If it's not, then the ASSERT() ought to be dropped as well.
Ok, I'll convert back to a BUG().
Paul
>
> Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Xen-devel] [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
2020-03-10 12:36 ` Paul Durrant
@ 2020-03-10 13:38 ` Jan Beulich
0 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2020-03-10 13:38 UTC (permalink / raw)
To: paul
Cc: xen-devel, 'Varad Gautam', 'Roger Pau Monné',
'Julien Grall', 'Andrew Cooper'
On 10.03.2020 13:36, Paul Durrant wrote:
> Ok, I'll convert back to a BUG().
Wait a little - I think I have an alternative proposal. Just want to
at least smoke test it before sending out.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Xen-devel] [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
2020-03-09 17:47 ` Paul Durrant
2020-03-10 11:23 ` Jan Beulich
@ 2020-03-10 14:19 ` Jan Beulich
2020-03-17 15:23 ` Paul Durrant
2020-04-28 11:58 ` vrd
1 sibling, 2 replies; 14+ messages in thread
From: Jan Beulich @ 2020-03-10 14:19 UTC (permalink / raw)
To: paul
Cc: xen-devel, 'Varad Gautam', 'Andrew Cooper',
'Julien Grall', 'Roger Pau Monné'
On 09.03.2020 18:47, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 09 March 2020 16:29
>> To: paul@xen.org
>> Cc: xen-devel@lists.xenproject.org; Varad Gautam <vrd@amazon.de>; Julien Grall <julien@xen.org>; Roger
>> Pau Monné <roger.pau@citrix.com>; Andrew Cooper <andrew.cooper3@citrix.com>
>> Subject: Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
>>
>> On 06.03.2020 17:02, paul@xen.org wrote:
>>> From: Varad Gautam <vrd@amazon.de>
>>>
>>> XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTS.
>>> In that scenario, it is possible to receive multiple __pirq_guest_unbind
>>> calls for the same pirq from domain_kill, if the pirq has not yet been
>>> removed from the domain's pirq_tree, as:
>>> domain_kill()
>>> -> domain_relinquish_resources()
>>> -> pci_release_devices()
>>> -> pci_clean_dpci_irq()
>>> -> pirq_guest_unbind()
>>> -> __pirq_guest_unbind()
>>>
>>> For a shared pirq (nr_guests > 1), the first call would zap the current
>>> domain from the pirq's guests[] list, but the action handler is never freed
>>> as there are other guests using this pirq. As a result, on the second call,
>>> __pirq_guest_unbind searches for the current domain which has been removed
>>> from the guests[] list, and hits a BUG_ON.
>>>
>>> Make __pirq_guest_unbind safe to be called multiple times by letting xen
>>> continue if a shared pirq has already been unbound from this guest. The
>>> PIRQ will be cleaned up from the domain's pirq_tree during the destruction
>>> in complete_domain_destroy anyway.
>>>
>>> Signed-off-by: Varad Gautam <vrd@amazon.de>
>>> [taking over from Varad at v4]
>>> Signed-off-by: Paul Durrant <paul@xen.org>
>>> ---
>>> Cc: Jan Beulich <jbeulich@suse.com>
>>> Cc: Julien Grall <julien@xen.org>
>>> Cc: Roger Pau Monné <roger.pau@citrix.com>
>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>
>>> Roger suggested cleaning the entry from the domain pirq_tree so that
>>> we need not make it safe to re-call __pirq_guest_unbind(). This seems like
>>> a reasonable suggestion but the semantics of the code are almost
>>> impenetrable (e.g. 'pirq' is used to mean an index, a pointer and is also
>>> the name of struct so you generally have little idea what it actally means)
>>> so I prefer to stick with a small fix that I can actually reason about.
>>>
>>> v4:
>>> - Re-work the guest array search to make it clearer
>>
>> I.e. there are cosmetic differences to v3 (see below), but
>> technically it's still the same. I can't believe the re-use
>> of "pirq" for different entities is this big of a problem.
>
> Please suggest code if you think it ought to be done differentely. I tried.
How about this? It's admittedly more code, but imo less ad hoc.
I've smoke tested it, but I depend on you or Varad to check that
it actually addresses the reported issue.
Jan
x86/pass-through: avoid double IRQ unbind during domain cleanup
XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTS.
In that scenario, it is possible to receive multiple _pirq_guest_unbind
calls for the same pirq from domain_kill, if the pirq has not yet been
removed from the domain's pirq_tree, as:
domain_kill()
-> domain_relinquish_resources()
-> pci_release_devices()
-> pci_clean_dpci_irq()
-> pirq_guest_unbind()
-> __pirq_guest_unbind()
Avoid recurring invocations of pirq_guest_unbind() by removing the pIRQ
from the tree being iterated after the first call there. In case such a
removed entry still has a softirq outstanding, record it and re-check
upon re-invocation.
Reported-by: Varad Gautam <vrd@amazon.de>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- unstable.orig/xen/arch/x86/irq.c
+++ unstable/xen/arch/x86/irq.c
@@ -1323,7 +1323,7 @@ void (pirq_cleanup_check)(struct pirq *p
}
if ( radix_tree_delete(&d->pirq_tree, pirq->pirq) != pirq )
- BUG();
+ BUG_ON(!d->is_dying);
}
/* Flush all ready EOIs from the top of this CPU's pending-EOI stack. */
--- unstable.orig/xen/drivers/passthrough/pci.c
+++ unstable/xen/drivers/passthrough/pci.c
@@ -873,7 +873,14 @@ static int pci_clean_dpci_irq(struct dom
xfree(digl);
}
- return pt_pirq_softirq_active(pirq_dpci) ? -ERESTART : 0;
+ radix_tree_delete(&d->pirq_tree, dpci_pirq(pirq_dpci)->pirq);
+
+ if ( !pt_pirq_softirq_active(pirq_dpci) )
+ return 0;
+
+ domain_get_irq_dpci(d)->pending_pirq_dpci = pirq_dpci;
+
+ return -ERESTART;
}
static int pci_clean_dpci_irqs(struct domain *d)
@@ -890,8 +897,18 @@ static int pci_clean_dpci_irqs(struct do
hvm_irq_dpci = domain_get_irq_dpci(d);
if ( hvm_irq_dpci != NULL )
{
- int ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);
+ int ret = 0;
+
+ if ( hvm_irq_dpci->pending_pirq_dpci )
+ {
+ if ( pt_pirq_softirq_active(hvm_irq_dpci->pending_pirq_dpci) )
+ ret = -ERESTART;
+ else
+ hvm_irq_dpci->pending_pirq_dpci = NULL;
+ }
+ if ( !ret )
+ ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);
if ( ret )
{
spin_unlock(&d->event_lock);
--- unstable.orig/xen/include/asm-x86/hvm/irq.h
+++ unstable/xen/include/asm-x86/hvm/irq.h
@@ -158,6 +158,8 @@ struct hvm_irq_dpci {
DECLARE_BITMAP(isairq_map, NR_ISAIRQS);
/* Record of mapped Links */
uint8_t link_cnt[NR_LINK];
+ /* Clean up: Entry with a softirq invocation pending / in progress. */
+ struct hvm_pirq_dpci *pending_pirq_dpci;
};
/* Machine IRQ to guest device/intx mapping. */
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Xen-devel] [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
2020-03-10 14:19 ` Jan Beulich
@ 2020-03-17 15:23 ` Paul Durrant
2020-03-31 7:40 ` Jan Beulich
2020-04-28 11:58 ` vrd
1 sibling, 1 reply; 14+ messages in thread
From: Paul Durrant @ 2020-03-17 15:23 UTC (permalink / raw)
To: 'Jan Beulich'
Cc: xen-devel, 'Varad Gautam', 'Andrew Cooper',
'Julien Grall', 'Roger Pau Monné'
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 10 March 2020 14:19
> To: paul@xen.org
> Cc: xen-devel@lists.xenproject.org; 'Varad Gautam' <vrd@amazon.de>; 'Julien Grall' <julien@xen.org>;
> 'Roger Pau Monné' <roger.pau@citrix.com>; 'Andrew Cooper' <andrew.cooper3@citrix.com>
> Subject: Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
>
> On 09.03.2020 18:47, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 09 March 2020 16:29
> >> To: paul@xen.org
> >> Cc: xen-devel@lists.xenproject.org; Varad Gautam <vrd@amazon.de>; Julien Grall <julien@xen.org>;
> Roger
> >> Pau Monné <roger.pau@citrix.com>; Andrew Cooper <andrew.cooper3@citrix.com>
> >> Subject: Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
> >>
> >> On 06.03.2020 17:02, paul@xen.org wrote:
> >>> From: Varad Gautam <vrd@amazon.de>
> >>>
> >>> XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTS.
> >>> In that scenario, it is possible to receive multiple __pirq_guest_unbind
> >>> calls for the same pirq from domain_kill, if the pirq has not yet been
> >>> removed from the domain's pirq_tree, as:
> >>> domain_kill()
> >>> -> domain_relinquish_resources()
> >>> -> pci_release_devices()
> >>> -> pci_clean_dpci_irq()
> >>> -> pirq_guest_unbind()
> >>> -> __pirq_guest_unbind()
> >>>
> >>> For a shared pirq (nr_guests > 1), the first call would zap the current
> >>> domain from the pirq's guests[] list, but the action handler is never freed
> >>> as there are other guests using this pirq. As a result, on the second call,
> >>> __pirq_guest_unbind searches for the current domain which has been removed
> >>> from the guests[] list, and hits a BUG_ON.
> >>>
> >>> Make __pirq_guest_unbind safe to be called multiple times by letting xen
> >>> continue if a shared pirq has already been unbound from this guest. The
> >>> PIRQ will be cleaned up from the domain's pirq_tree during the destruction
> >>> in complete_domain_destroy anyway.
> >>>
> >>> Signed-off-by: Varad Gautam <vrd@amazon.de>
> >>> [taking over from Varad at v4]
> >>> Signed-off-by: Paul Durrant <paul@xen.org>
> >>> ---
> >>> Cc: Jan Beulich <jbeulich@suse.com>
> >>> Cc: Julien Grall <julien@xen.org>
> >>> Cc: Roger Pau Monné <roger.pau@citrix.com>
> >>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> >>>
> >>> Roger suggested cleaning the entry from the domain pirq_tree so that
> >>> we need not make it safe to re-call __pirq_guest_unbind(). This seems like
> >>> a reasonable suggestion but the semantics of the code are almost
> >>> impenetrable (e.g. 'pirq' is used to mean an index, a pointer and is also
> >>> the name of struct so you generally have little idea what it actally means)
> >>> so I prefer to stick with a small fix that I can actually reason about.
> >>>
> >>> v4:
> >>> - Re-work the guest array search to make it clearer
> >>
> >> I.e. there are cosmetic differences to v3 (see below), but
> >> technically it's still the same. I can't believe the re-use
> >> of "pirq" for different entities is this big of a problem.
> >
> > Please suggest code if you think it ought to be done differentely. I tried.
>
> How about this? It's admittedly more code, but imo less ad hoc.
> I've smoke tested it, but I depend on you or Varad to check that
> it actually addresses the reported issue.
It's fairly hard to hit IIRC but we could probably engineer it with a one off ERESTART in the right place.
>
> Jan
>
> x86/pass-through: avoid double IRQ unbind during domain cleanup
>
> XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTS.
> In that scenario, it is possible to receive multiple _pirq_guest_unbind
> calls for the same pirq from domain_kill, if the pirq has not yet been
> removed from the domain's pirq_tree, as:
> domain_kill()
> -> domain_relinquish_resources()
> -> pci_release_devices()
> -> pci_clean_dpci_irq()
> -> pirq_guest_unbind()
> -> __pirq_guest_unbind()
>
> Avoid recurring invocations of pirq_guest_unbind() by removing the pIRQ
> from the tree being iterated after the first call there. In case such a
> removed entry still has a softirq outstanding, record it and re-check
> upon re-invocation.
>
> Reported-by: Varad Gautam <vrd@amazon.de>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- unstable.orig/xen/arch/x86/irq.c
> +++ unstable/xen/arch/x86/irq.c
> @@ -1323,7 +1323,7 @@ void (pirq_cleanup_check)(struct pirq *p
> }
>
> if ( radix_tree_delete(&d->pirq_tree, pirq->pirq) != pirq )
> - BUG();
> + BUG_ON(!d->is_dying);
> }
>
> /* Flush all ready EOIs from the top of this CPU's pending-EOI stack. */
> --- unstable.orig/xen/drivers/passthrough/pci.c
> +++ unstable/xen/drivers/passthrough/pci.c
> @@ -873,7 +873,14 @@ static int pci_clean_dpci_irq(struct dom
> xfree(digl);
> }
>
> - return pt_pirq_softirq_active(pirq_dpci) ? -ERESTART : 0;
> + radix_tree_delete(&d->pirq_tree, dpci_pirq(pirq_dpci)->pirq);
> +
> + if ( !pt_pirq_softirq_active(pirq_dpci) )
> + return 0;
> +
> + domain_get_irq_dpci(d)->pending_pirq_dpci = pirq_dpci;
> +
> + return -ERESTART;
> }
>
> static int pci_clean_dpci_irqs(struct domain *d)
> @@ -890,8 +897,18 @@ static int pci_clean_dpci_irqs(struct do
> hvm_irq_dpci = domain_get_irq_dpci(d);
> if ( hvm_irq_dpci != NULL )
> {
> - int ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);
> + int ret = 0;
> +
> + if ( hvm_irq_dpci->pending_pirq_dpci )
> + {
> + if ( pt_pirq_softirq_active(hvm_irq_dpci->pending_pirq_dpci) )
> + ret = -ERESTART;
> + else
> + hvm_irq_dpci->pending_pirq_dpci = NULL;
> + }
>
> + if ( !ret )
> + ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);
> if ( ret )
> {
> spin_unlock(&d->event_lock);
> --- unstable.orig/xen/include/asm-x86/hvm/irq.h
> +++ unstable/xen/include/asm-x86/hvm/irq.h
> @@ -158,6 +158,8 @@ struct hvm_irq_dpci {
> DECLARE_BITMAP(isairq_map, NR_ISAIRQS);
> /* Record of mapped Links */
> uint8_t link_cnt[NR_LINK];
> + /* Clean up: Entry with a softirq invocation pending / in progress. */
> + struct hvm_pirq_dpci *pending_pirq_dpci;
> };
>
That looks like it will do the job. I'll see if I can get it tested.
Paul
> /* Machine IRQ to guest device/intx mapping. */
>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
2020-03-17 15:23 ` Paul Durrant
@ 2020-03-31 7:40 ` Jan Beulich
2020-03-31 11:51 ` Paul Durrant
0 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2020-03-31 7:40 UTC (permalink / raw)
To: paul
Cc: xen-devel, 'Varad Gautam', 'Andrew Cooper',
'Julien Grall', 'Roger Pau Monné'
On 17.03.2020 16:23, Paul Durrant wrote:
> That looks like it will do the job. I'll see if I can get it tested.
Any luck with this, yet?
Jan
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
2020-03-31 7:40 ` Jan Beulich
@ 2020-03-31 11:51 ` Paul Durrant
2020-04-23 11:08 ` Jan Beulich
0 siblings, 1 reply; 14+ messages in thread
From: Paul Durrant @ 2020-03-31 11:51 UTC (permalink / raw)
To: 'Jan Beulich'
Cc: xen-devel, 'Varad Gautam', 'Andrew Cooper',
'Julien Grall', 'Roger Pau Monné'
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 31 March 2020 08:41
> To: paul@xen.org
> Cc: xen-devel@lists.xenproject.org; 'Varad Gautam' <vrd@amazon.de>; 'Julien Grall' <julien@xen.org>;
> 'Roger Pau Monné' <roger.pau@citrix.com>; 'Andrew Cooper' <andrew.cooper3@citrix.com>
> Subject: Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
>
> On 17.03.2020 16:23, Paul Durrant wrote:
> > That looks like it will do the job. I'll see if I can get it tested.
>
> Any luck with this, yet?
>
I have asked Varad to test it. He hopes to get to it this week.
Paul
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
2020-03-31 11:51 ` Paul Durrant
@ 2020-04-23 11:08 ` Jan Beulich
2020-04-23 15:14 ` Paul Durrant
0 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2020-04-23 11:08 UTC (permalink / raw)
To: paul, 'Varad Gautam'
Cc: xen-devel, 'Andrew Cooper', 'Julien Grall',
'Roger Pau Monné'
On 31.03.2020 13:51, Paul Durrant wrote:
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 31 March 2020 08:41
>>
>> On 17.03.2020 16:23, Paul Durrant wrote:
>>> That looks like it will do the job. I'll see if I can get it tested.
>>
>> Any luck with this, yet?
>
> I have asked Varad to test it. He hopes to get to it this week.
Any news here?
Jan
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
2020-04-23 11:08 ` Jan Beulich
@ 2020-04-23 15:14 ` Paul Durrant
0 siblings, 0 replies; 14+ messages in thread
From: Paul Durrant @ 2020-04-23 15:14 UTC (permalink / raw)
To: 'Jan Beulich', 'Varad Gautam'
Cc: xen-devel, 'Andrew Cooper', 'Julien Grall',
'Roger Pau Monné'
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 23 April 2020 12:09
> To: paul@xen.org; 'Varad Gautam' <vrd@amazon.de>
> Cc: xen-devel@lists.xenproject.org; 'Julien Grall' <julien@xen.org>; 'Roger Pau Monné'
> <roger.pau@citrix.com>; 'Andrew Cooper' <andrew.cooper3@citrix.com>
> Subject: Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
>
> On 31.03.2020 13:51, Paul Durrant wrote:
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 31 March 2020 08:41
> >>
> >> On 17.03.2020 16:23, Paul Durrant wrote:
> >>> That looks like it will do the job. I'll see if I can get it tested.
> >>
> >> Any luck with this, yet?
> >
> > I have asked Varad to test it. He hopes to get to it this week.
>
> Any news here?
>
Varad tells me he is currently testing and will get back to you soon (hopefully today).
Paul
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
2020-03-10 14:19 ` Jan Beulich
2020-03-17 15:23 ` Paul Durrant
@ 2020-04-28 11:58 ` vrd
2020-04-28 12:18 ` Jan Beulich
1 sibling, 1 reply; 14+ messages in thread
From: vrd @ 2020-04-28 11:58 UTC (permalink / raw)
To: Jan Beulich, paul
Cc: xen-devel, 'Varad Gautam', 'Andrew Cooper',
'Julien Grall', 'Roger Pau Monné'
Hi Jan,
On 3/10/20 3:19 PM, Jan Beulich wrote:
> On 09.03.2020 18:47, Paul Durrant wrote:
>>> -----Original Message-----
>>> From: Jan Beulich <jbeulich@suse.com>
>>> Sent: 09 March 2020 16:29
>>> To: paul@xen.org
>>> Cc: xen-devel@lists.xenproject.org; Varad Gautam <vrd@amazon.de>; Julien Grall <julien@xen.org>; Roger
>>> Pau Monné <roger.pau@citrix.com>; Andrew Cooper <andrew.cooper3@citrix.com>
>>> Subject: Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
>>>
>>> On 06.03.2020 17:02, paul@xen.org wrote:
>>>> From: Varad Gautam <vrd@amazon.de>
>>>>
>>>> XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTS.
>>>> In that scenario, it is possible to receive multiple __pirq_guest_unbind
>>>> calls for the same pirq from domain_kill, if the pirq has not yet been
>>>> removed from the domain's pirq_tree, as:
>>>> domain_kill()
>>>> -> domain_relinquish_resources()
>>>> -> pci_release_devices()
>>>> -> pci_clean_dpci_irq()
>>>> -> pirq_guest_unbind()
>>>> -> __pirq_guest_unbind()
>>>>
>>>> For a shared pirq (nr_guests > 1), the first call would zap the current
>>>> domain from the pirq's guests[] list, but the action handler is never freed
>>>> as there are other guests using this pirq. As a result, on the second call,
>>>> __pirq_guest_unbind searches for the current domain which has been removed
>>>> from the guests[] list, and hits a BUG_ON.
>>>>
>>>> Make __pirq_guest_unbind safe to be called multiple times by letting xen
>>>> continue if a shared pirq has already been unbound from this guest. The
>>>> PIRQ will be cleaned up from the domain's pirq_tree during the destruction
>>>> in complete_domain_destroy anyway.
>>>>
>>>> Signed-off-by: Varad Gautam <vrd@amazon.de>
>>>> [taking over from Varad at v4]
>>>> Signed-off-by: Paul Durrant <paul@xen.org>
>>>> ---
>>>> Cc: Jan Beulich <jbeulich@suse.com>
>>>> Cc: Julien Grall <julien@xen.org>
>>>> Cc: Roger Pau Monné <roger.pau@citrix.com>
>>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>
>>>> Roger suggested cleaning the entry from the domain pirq_tree so that
>>>> we need not make it safe to re-call __pirq_guest_unbind(). This seems like
>>>> a reasonable suggestion but the semantics of the code are almost
>>>> impenetrable (e.g. 'pirq' is used to mean an index, a pointer and is also
>>>> the name of struct so you generally have little idea what it actally means)
>>>> so I prefer to stick with a small fix that I can actually reason about.
>>>>
>>>> v4:
>>>> - Re-work the guest array search to make it clearer
>>> I.e. there are cosmetic differences to v3 (see below), but
>>> technically it's still the same. I can't believe the re-use
>>> of "pirq" for different entities is this big of a problem.
>> Please suggest code if you think it ought to be done differentely. I tried.
> How about this? It's admittedly more code, but imo less ad hoc.
> I've smoke tested it, but I depend on you or Varad to check that
> it actually addresses the reported issue.
>
> Jan
>
> x86/pass-through: avoid double IRQ unbind during domain cleanup
I have tested that this patch prevents __pirq_guest_unbind on an
already-unbound pirq
during the continuation call for domain_kill -ERESTART, by using a
modified xen that
forces an -ERESTART from pirq_guest_unbind to create the continuation.
It fixes the
underlying issue.
Tested-by: Varad Gautam <vrd@amazon.de>
>
> XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTS.
> In that scenario, it is possible to receive multiple _pirq_guest_unbind
> calls for the same pirq from domain_kill, if the pirq has not yet been
> removed from the domain's pirq_tree, as:
> domain_kill()
> -> domain_relinquish_resources()
> -> pci_release_devices()
> -> pci_clean_dpci_irq()
> -> pirq_guest_unbind()
> -> __pirq_guest_unbind()
>
> Avoid recurring invocations of pirq_guest_unbind() by removing the pIRQ
> from the tree being iterated after the first call there. In case such a
> removed entry still has a softirq outstanding, record it and re-check
> upon re-invocation.
>
> Reported-by: Varad Gautam <vrd@amazon.de>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- unstable.orig/xen/arch/x86/irq.c
> +++ unstable/xen/arch/x86/irq.c
> @@ -1323,7 +1323,7 @@ void (pirq_cleanup_check)(struct pirq *p
> }
>
> if ( radix_tree_delete(&d->pirq_tree, pirq->pirq) != pirq )
> - BUG();
> + BUG_ON(!d->is_dying);
> }
>
> /* Flush all ready EOIs from the top of this CPU's pending-EOI stack. */
> --- unstable.orig/xen/drivers/passthrough/pci.c
> +++ unstable/xen/drivers/passthrough/pci.c
> @@ -873,7 +873,14 @@ static int pci_clean_dpci_irq(struct dom
> xfree(digl);
> }
>
> - return pt_pirq_softirq_active(pirq_dpci) ? -ERESTART : 0;
> + radix_tree_delete(&d->pirq_tree, dpci_pirq(pirq_dpci)->pirq);
> +
> + if ( !pt_pirq_softirq_active(pirq_dpci) )
> + return 0;
> +
> + domain_get_irq_dpci(d)->pending_pirq_dpci = pirq_dpci;
> +
> + return -ERESTART;
> }
>
> static int pci_clean_dpci_irqs(struct domain *d)
> @@ -890,8 +897,18 @@ static int pci_clean_dpci_irqs(struct do
> hvm_irq_dpci = domain_get_irq_dpci(d);
> if ( hvm_irq_dpci != NULL )
> {
> - int ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);
> + int ret = 0;
> +
> + if ( hvm_irq_dpci->pending_pirq_dpci )
> + {
> + if ( pt_pirq_softirq_active(hvm_irq_dpci->pending_pirq_dpci) )
> + ret = -ERESTART;
> + else
> + hvm_irq_dpci->pending_pirq_dpci = NULL;
> + }
>
> + if ( !ret )
> + ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);
> if ( ret )
> {
> spin_unlock(&d->event_lock);
> --- unstable.orig/xen/include/asm-x86/hvm/irq.h
> +++ unstable/xen/include/asm-x86/hvm/irq.h
> @@ -158,6 +158,8 @@ struct hvm_irq_dpci {
> DECLARE_BITMAP(isairq_map, NR_ISAIRQS);
> /* Record of mapped Links */
> uint8_t link_cnt[NR_LINK];
> + /* Clean up: Entry with a softirq invocation pending / in progress. */
> + struct hvm_pirq_dpci *pending_pirq_dpci;
> };
>
> /* Machine IRQ to guest device/intx mapping. */
>
>
Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs
2020-04-28 11:58 ` vrd
@ 2020-04-28 12:18 ` Jan Beulich
0 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2020-04-28 12:18 UTC (permalink / raw)
To: vrd
Cc: 'Julien Grall', paul, 'Andrew Cooper',
'Varad Gautam', xen-devel, 'Roger Pau Monné'
On 28.04.2020 13:58, vrd@amazon.com wrote:
> On 3/10/20 3:19 PM, Jan Beulich wrote:
>> On 09.03.2020 18:47, Paul Durrant wrote:
>>> Please suggest code if you think it ought to be done differentely. I tried.
>> How about this? It's admittedly more code, but imo less ad hoc.
>> I've smoke tested it, but I depend on you or Varad to check that
>> it actually addresses the reported issue.
>>
>> Jan
>>
>> x86/pass-through: avoid double IRQ unbind during domain cleanup
>
>
> I have tested that this patch prevents __pirq_guest_unbind on an already-unbound pirq
> during the continuation call for domain_kill -ERESTART, by using a modified xen that
> forces an -ERESTART from pirq_guest_unbind to create the continuation. It fixes the
> underlying issue.
>
> Tested-by: Varad Gautam <vrd@amazon.de>
Thanks much; I'll formally submit the patch then.
Jan
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2020-04-28 12:19 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-06 16:02 [Xen-devel] [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs paul
2020-03-09 16:29 ` Jan Beulich
2020-03-09 17:47 ` Paul Durrant
2020-03-10 11:23 ` Jan Beulich
2020-03-10 12:36 ` Paul Durrant
2020-03-10 13:38 ` Jan Beulich
2020-03-10 14:19 ` Jan Beulich
2020-03-17 15:23 ` Paul Durrant
2020-03-31 7:40 ` Jan Beulich
2020-03-31 11:51 ` Paul Durrant
2020-04-23 11:08 ` Jan Beulich
2020-04-23 15:14 ` Paul Durrant
2020-04-28 11:58 ` vrd
2020-04-28 12:18 ` Jan Beulich
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.