* [PATCH] [RFC] VT-d: always clean up dpci timers.
@ 2011-07-18 16:38 Tim Deegan
  2011-07-21  1:08 ` Kay, Allen M
From: Tim Deegan @ 2011-07-18 16:38 UTC (permalink / raw)
  To: xen-devel; +Cc: Allen Kay, keir

If a VM has all its PCI devices deassigned, need_iommu(d) becomes false
but it might still have DPCI EOI timers that were init_timer()d but not
yet kill_timer()d.  That causes Xen to crash later because the linked
list of inactive timers gets corrupted, e.g.:

(XEN) Xen call trace:
(XEN)    [<ffff82c480126256>] set_timer+0x1c2/0x24f
(XEN)    [<ffff82c48011fbf8>] schedule+0x129/0x5dd
(XEN)    [<ffff82c480122c1e>] __do_softirq+0x7e/0x89
(XEN)    [<ffff82c480122c9d>] do_softirq+0x26/0x28
(XEN)    [<ffff82c480153c85>] idle_loop+0x5a/0x5c
(XEN)    
(XEN) 
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'entry->next->prev == entry' failed at /local/scratch/tdeegan/xen-unstable.hg/xen/include:172
(XEN) ****************************************

The following patch makes sure that the domain destruction path always
cleans up the DPCI state even if !need_iommu(d).

Although it fixes the crash for me, I'm sufficiently confused by this
code that I don't know whether it's enough.  If the dpci timer state
gets freed earlier than pci_clean_dpci_irqs() then there's still a race,
and some other function (reassign_device_ownership() ?) needs to sort
out the timers when the PCI card is deassigned.

Allen, can you comment?

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>

diff -r ab6551e30841 xen/drivers/passthrough/pci.c
--- a/xen/drivers/passthrough/pci.c	Mon Jul 18 10:59:44 2011 +0100
+++ b/xen/drivers/passthrough/pci.c	Mon Jul 18 17:22:48 2011 +0100
@@ -269,7 +269,7 @@ static void pci_clean_dpci_irqs(struct d
     if ( !iommu_enabled )
         return;
 
-    if ( !is_hvm_domain(d) || !need_iommu(d) )
+    if ( !is_hvm_domain(d) )
         return;
 
     spin_lock(&d->event_lock);


* RE: [PATCH] [RFC] VT-d: always clean up dpci timers.
  2011-07-18 16:38 [PATCH] [RFC] VT-d: always clean up dpci timers Tim Deegan
@ 2011-07-21  1:08 ` Kay, Allen M
  2011-07-21  8:50   ` Tim Deegan
From: Kay, Allen M @ 2011-07-21  1:08 UTC (permalink / raw)
  To: Tim Deegan, xen-devel; +Cc: keir

Hi Tim,

Can you provide the code flow that can cause this failure?

In pci_release_devices(), pci_clean_dpci_irqs() is called before "d->need_iommu = 0" in deassign_device().  If that path is taken, pci_clean_dpci_irqs() should not return early at the !need_iommu(d) check.

Allen


* Re: RE: [PATCH] [RFC] VT-d: always clean up dpci timers.
  2011-07-21  1:08 ` Kay, Allen M
@ 2011-07-21  8:50   ` Tim Deegan
  2011-07-21  9:01     ` Keir Fraser
  2011-07-22  7:33     ` Kay, Allen M
From: Tim Deegan @ 2011-07-21  8:50 UTC (permalink / raw)
  To: Kay, Allen M; +Cc: xen-devel, keir

At 18:08 -0700 on 20 Jul (1311185311), Kay, Allen M wrote:
> Hi Tim,
> 
> Can you provide the code flow that can cause this failure?
> 
> In pci_release_devices(), pci_clean_dpci_irqs() is called before
> "d->need_iommu = 0" in deassign_device().  If this path is taken, then
> it should not return from !need_iommu(d) in pci_clean_dpci_irqs().

The problem is that the xl toolstack has already deassigned the domain's
devices, using a hypercall to invoke deassign_device(), so by the time
the domain is destroyed, pci_release_devices() can't tell that it once
had a PCI device passed through.

It seems like the Right Thing[tm] would be for deassign_device() to find
and undo the relevant IRQ plumbing but I couldn't see how.  Is there a
mapping from bdf to irq in the iommu code or are they handled entirely
separately?

Tim.


-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)


* Re: RE: [PATCH] [RFC] VT-d: always clean up dpci timers.
  2011-07-21  8:50   ` Tim Deegan
@ 2011-07-21  9:01     ` Keir Fraser
  2011-07-25 14:21       ` Tim Deegan
  2011-07-22  7:33     ` Kay, Allen M
From: Keir Fraser @ 2011-07-21  9:01 UTC (permalink / raw)
  To: Tim Deegan, Kay, Allen M; +Cc: xen-devel

On 21/07/2011 09:50, "Tim Deegan" <Tim.Deegan@citrix.com> wrote:

> At 18:08 -0700 on 20 Jul (1311185311), Kay, Allen M wrote:
>> Hi Tim,
>> 
>> Can you provide the code flow that can cause this failure?
>> 
>> In pci_release_devices(), pci_clean_dpci_irqs() is called before
>> "d->need_iommu = 0" in deassign_device().  If this path is taken, then
>> it should not return from !need_iommu(d) in pci_clean_dpci_irqs().
> 
> The problem is that the xl toolstack has already deassigned the domain's
> devices, using a hypercall to invoke deassign_device(), so by the time
> the domain is destroyed, pci_release_devices() can't tell that it once
> had a PCI device passed through.
> 
> It seems like the Right Thing[tm] would be for deassign_device() to find
> and undo the relevant IRQ plumbing but I couldn't see how.  Is there a
> mapping from bdf to irq in the iommu code or are they handled entirely
> separately?

Could we make need_iommu(d) sticky? Being able to clear it doesn't seem an
important case (such a domain is probably being torn down anyway) and
clearly it can lead to fragility. The fact that presumably we'd end up doing
unnecessary IOMMU PT work for the remaining lifetime of the domain doesn't
seem a major downside to me.

 -- Keir



* RE: RE: [PATCH] [RFC] VT-d: always clean up dpci timers.
  2011-07-21  8:50   ` Tim Deegan
  2011-07-21  9:01     ` Keir Fraser
@ 2011-07-22  7:33     ` Kay, Allen M
From: Kay, Allen M @ 2011-07-22  7:33 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, keir

> Is there a mapping from bdf to irq in the iommu code or are they handled
> entirely separately?

I think they are handled entirely separately inside Xen.  The IRQ code mostly deals with the machine-IRQ to guest-IRQ (mirq-to-girq) mapping.  The BDF-to-IRQ correlation should live in user-mode code.

Allen


* Re: RE: [PATCH] [RFC] VT-d: always clean up dpci timers.
  2011-07-21  9:01     ` Keir Fraser
@ 2011-07-25 14:21       ` Tim Deegan
  2011-07-25 14:47         ` Keir Fraser
From: Tim Deegan @ 2011-07-25 14:21 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, Kay, Allen M

At 10:01 +0100 on 21 Jul (1311242513), Keir Fraser wrote:
> On 21/07/2011 09:50, "Tim Deegan" <Tim.Deegan@citrix.com> wrote:
> 
> > At 18:08 -0700 on 20 Jul (1311185311), Kay, Allen M wrote:
> >> Hi Tim,
> >> 
> >> Can you provide the code flow that can cause this failure?
> >> 
> >> In pci_release_devices(), pci_clean_dpci_irqs() is called before
> >> "d->need_iommu = 0" in deassign_device().  If this path is taken, then
> >> it should not return from !need_iommu(d) in pci_clean_dpci_irqs().
> > 
> > The problem is that the xl toolstack has already deassigned the domain's
> > devices, using a hypercall to invoke deassign_device(), so by the time
> > the domain is destroyed, pci_release_devices() can't tell that it once
> > had a PCI device passed through.
> > 
> > It seems like the Right Thing[tm] would be for deassign_device() to find
> > and undo the relevant IRQ plumbing but I couldn't see how.  Is there a
> > mapping from bdf to irq in the iommu code or are they handled entirely
> > separately?
> 
> Could we make need_iommu(d) sticky? Being able to clear it doesn't seem an
> important case (such a domain is probably being torn down anyway) and
> clearly it can lead to fragility. The fact that presumably we'd end up doing
> unnecessary IOMMU PT work for the remaining lifetime of the domain doesn't
> seem a major downside to me.

If you prefer it that way.  TBH I think I prefer the other way though:
things that gate on need_iommu() should be cleaned up by
iommu->teardown().

--8<---- 

PCI passthrough: don't tear down IOMMU when the last device is deassigned.

Otherwise there's a risk that not all iommu-related state will get torn
down during domain destruction.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>

diff -r 9dbbf1631193 xen/drivers/passthrough/iommu.c
--- a/xen/drivers/passthrough/iommu.c	Mon Jul 25 14:21:13 2011 +0100
+++ b/xen/drivers/passthrough/iommu.c	Mon Jul 25 15:14:31 2011 +0100
@@ -296,12 +296,6 @@ int deassign_device(struct domain *d, u8
         return ret;
     }
 
-    if ( !has_arch_pdevs(d) && need_iommu(d) )
-    {
-        d->need_iommu = 0;
-        hd->platform_ops->teardown(d);
-    }
-
     return ret;
 }


* Re: RE: [PATCH] [RFC] VT-d: always clean up dpci timers.
  2011-07-25 14:21       ` Tim Deegan
@ 2011-07-25 14:47         ` Keir Fraser
  2011-07-25 15:04           ` Tim Deegan
From: Keir Fraser @ 2011-07-25 14:47 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Kay, Allen M

On 25/07/2011 15:21, "Tim Deegan" <Tim.Deegan@citrix.com> wrote:

>> Could we make need_iommu(d) sticky? Being able to clear it doesn't seem an
>> important case (such a domain is probably being torn down anyway) and
>> clearly it can lead to fragility. The fact that presumably we'd end up doing
>> unnecessary IOMMU PT work for the remaining lifetime of the domain doesn't
>> seem a major downside to me.
> 
> If you prefer it that way.  TBH I think I prefer the other way though:
> things that gate on need_iommu() should be cleaned up by
> iommu->teardown().

Is your original patch still proposed as a valid alternative to this one?

 -- Keir


* Re: RE: [PATCH] [RFC] VT-d: always clean up dpci timers.
  2011-07-25 14:47         ` Keir Fraser
@ 2011-07-25 15:04           ` Tim Deegan
From: Tim Deegan @ 2011-07-25 15:04 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, Kay, Allen M

At 15:47 +0100 on 25 Jul (1311608879), Keir Fraser wrote:
> On 25/07/2011 15:21, "Tim Deegan" <Tim.Deegan@citrix.com> wrote:
> 
> >> Could we make need_iommu(d) sticky? Being able to clear it doesn't seem an
> >> important case (such a domain is probably being torn down anyway) and
> >> clearly it can lead to fragility. The fact that presumably we'd end up doing
> >> unnecessary IOMMU PT work for the remaining lifetime of the domain doesn't
> >> seem a major downside to me.
> > 
> > If you prefer it that way.  TBH I think I prefer the other way though:
> > things that gate on need_iommu() should be cleaned up by
> > iommu->teardown().
> 
> Is your original patch still proposed as a valid alternative to this one?

Yes; I prefer the original. 

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

