* xsa46-4.2.patch breaks PCI passthrough?
@ 2013-05-01  5:29 Steven Haigh
  2013-05-01 11:09 ` Andrew Cooper
  2013-05-01 15:18 ` George Dunlap
  0 siblings, 2 replies; 15+ messages in thread
From: Steven Haigh @ 2013-05-01  5:29 UTC (permalink / raw)
  To: xen-devel

Hi all,

I've had a report lodged against my packages that the patch provided for 
XSA46 against Xen 4.2.1 causes PCI passthru to break.

It seems that 4.2.1 *without* the XSA46 patch works perfectly. 4.2.2 
does not work.

I added this patch in xen-4.2.1-6 of my RPMs (http://xen.crc.id.au) and 
the reporter has built the same SRPM with xsa46 patch removed and PCI 
passthrough works as intended.

Reapplying the XSA46 patch causes it to break again.

The bug report and logs can be found here:
	http://xen.crc.id.au/bugs/view.php?id=5

Has anyone come across this?

-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
Fax: (03) 8338 0299

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: xsa46-4.2.patch breaks PCI passthrough?
  2013-05-01  5:29 xsa46-4.2.patch breaks PCI passthrough? Steven Haigh
@ 2013-05-01 11:09 ` Andrew Cooper
  2013-05-01 11:28   ` Andrew Cooper
  2013-05-01 15:18 ` George Dunlap
  1 sibling, 1 reply; 15+ messages in thread
From: Andrew Cooper @ 2013-05-01 11:09 UTC (permalink / raw)
  To: Steven Haigh; +Cc: xen-devel

On 01/05/13 06:29, Steven Haigh wrote:
> Hi all,
>
> I've had a report lodged against my packages that the patch provided for 
> XSA46 against Xen 4.2.1 causes PCI passthru to break.
>
> It seems that 4.2.1 *without* the XSA46 patch works perfectly. 4.2.2 
> does not work.
>
> I added this patch in xen-4.2.1-6 of my RPMs (http://xen.crc.id.au) and 
> the reporter has built the same SRPM with xsa46 patch removed and PCI 
> passthrough works as intended.
>
> Reapplying the XSA46 patch causes it to break again.
>
> The bug report and logs can be found here:
> 	http://xen.crc.id.au/bugs/view.php?id=5
>
> Has anyone come across this?
>

XSA-46 was to do with PCI passthrough of PV domains, and in particular
changing some of the rules regarding interrupts.

One thing which is not clear from the bug report so far is what exactly
is failing with an EINVAL.  The implication is that it is the toolstack
which is bailing before the VM is created.

I will take a closer look at the two log files.

~Andrew

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: xsa46-4.2.patch breaks PCI passthrough?
  2013-05-01 11:09 ` Andrew Cooper
@ 2013-05-01 11:28   ` Andrew Cooper
  2013-05-02  8:49     ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Cooper @ 2013-05-01 11:28 UTC (permalink / raw)
  To: Steven Haigh; +Cc: xen-devel

[-- Attachment #1: Type: text/plain, Size: 1535 bytes --]

On 01/05/13 12:09, Andrew Cooper wrote:
> On 01/05/13 06:29, Steven Haigh wrote:
>> Hi all,
>>
>> I've had a report lodged against my packages that the patch provided for 
>> XSA46 against Xen 4.2.1 causes PCI passthru to break.
>>
>> It seems that 4.2.1 *without* the XSA46 patch works perfectly. 4.2.2 
>> does not work.
>>
>> I added this patch in xen-4.2.1-6 of my RPMs (http://xen.crc.id.au) and 
>> the reporter has built the same SRPM with xsa46 patch removed and PCI 
>> passthrough works as intended.
>>
>> Reapplying the XSA46 patch causes it to break again.
>>
>> The bug report and logs can be found here:
>> 	http://xen.crc.id.au/bugs/view.php?id=5
>>
>> Has anyone come across this?
>>
> XSA-46 was to do with PCI passthrough of PV domains, and in particular
> changing some of the rules regarding interrupts.
>
> One thing which is not clear from the bug report so far is what exactly
> is failing with an EINVAL.  The implication is that it is the toolstack
> which is bailing before the VM is created.
>
> I will take a closer look at the two log files.
>
> ~Andrew
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

Ok - please ignore my previous email somewhat.

Xend is failing an xc.domain_irq_permission() call.  As the toolstack
side of things has not changed, it must be the changes in the
hypervisor which are causing the issues.

Can you please try the attached patch, and pass along xl dmesg in the
failing case?

~Andrew

[-- Attachment #2: XSA-46-debug.patch --]
[-- Type: text/x-patch, Size: 395 bytes --]

diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index cbc8146..307848f 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -902,6 +902,8 @@ long do_domctl(XEN_GUEST_HANDLE(xen_domctl_t) u_domctl)
         else
             ret = pirq_deny_access(d, pirq);
 
+        printk("**DBG perms { %u, %d } = %d\n", pirq, allow, ret);
+
         rcu_unlock_domain(d);
     }
     break;


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: xsa46-4.2.patch breaks PCI passthrough?
  2013-05-01  5:29 xsa46-4.2.patch breaks PCI passthrough? Steven Haigh
  2013-05-01 11:09 ` Andrew Cooper
@ 2013-05-01 15:18 ` George Dunlap
  2013-05-01 15:26   ` Steven Haigh
  1 sibling, 1 reply; 15+ messages in thread
From: George Dunlap @ 2013-05-01 15:18 UTC (permalink / raw)
  To: Steven Haigh; +Cc: xen-devel

On Wed, May 1, 2013 at 6:29 AM, Steven Haigh <netwiz@crc.id.au> wrote:
> Hi all,
>
> I've had a report lodged against my packages that the patch provided for
> XSA46 against Xen 4.2.1 causes PCI passthru to break.
>
> It seems that 4.2.1 *without* the XSA46 patch works perfectly. 4.2.2 does
> not work.

Have you tried this with xen-unstable tip?  That would be a blocker
bug for the 4.3 release.

 -George

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: xsa46-4.2.patch breaks PCI passthrough?
  2013-05-01 15:18 ` George Dunlap
@ 2013-05-01 15:26   ` Steven Haigh
  2013-05-01 16:07     ` Andrew Cooper
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Haigh @ 2013-05-01 15:26 UTC (permalink / raw)
  To: xen-devel

On 2/05/2013 1:18 AM, George Dunlap wrote:
> On Wed, May 1, 2013 at 6:29 AM, Steven Haigh <netwiz@crc.id.au> wrote:
>> Hi all,
>>
>> I've had a report lodged against my packages that the patch provided for
>> XSA46 against Xen 4.2.1 causes PCI passthru to break.
>>
>> It seems that 4.2.1 *without* the XSA46 patch works perfectly. 4.2.2 does
>> not work.
>
> Have you tried this with xen-unstable tip?  That would be a blocker
> bug for the 4.3 release.

Hi George,

It hasn't been tried against anything other than 4.2.1 & 4.2.2 as 
yet. As I'm not the end user with the problem here, I need to wait for 
feedback.

I have passed the patch provided by Andrew to the bug author - when I've 
got feedback on this I'll be able to provide more information. I think 
when we've got a root cause for this then it should be simple to verify 
it on 4.3.

-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
Fax: (03) 8338 0299

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: xsa46-4.2.patch breaks PCI passthrough?
  2013-05-01 15:26   ` Steven Haigh
@ 2013-05-01 16:07     ` Andrew Cooper
  2013-05-03 22:15       ` Steven Haigh
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Cooper @ 2013-05-01 16:07 UTC (permalink / raw)
  To: Steven Haigh, George Dunlap; +Cc: xen-devel

On 01/05/13 16:26, Steven Haigh wrote:
> On 2/05/2013 1:18 AM, George Dunlap wrote:
>> On Wed, May 1, 2013 at 6:29 AM, Steven Haigh <netwiz@crc.id.au> wrote:
>>> Hi all,
>>>
>>> I've had a report lodged against my packages that the patch provided for
>>> XSA46 against Xen 4.2.1 causes PCI passthru to break.
>>>
>>> It seems that 4.2.1 *without* the XSA46 patch works perfectly. 4.2.2 does
>>> not work.
>> Have you tried this with xen-unstable tip?  That would be a blocker
>> bug for the 4.3 release.
> Hi George,
>
> It hasn't been tried it against anything other than 4.2.1 & 4.2.2 as 
> yet. As I'm not the end user with the problem here, I need to wait for 
> feedback.
>
> I have passed the patch provided by Andrew to the bug author - when I've 
> got feedback on this I'll be able to provide more information. I think 
> when we've got a root cause for this then it should be simple to verify 
> it on 4.3.
>

I have been investigating this issue on XenServer.

On XenServer, PCI passthrough to a SLES11SP1 guest is working correctly,
even with the XSA-46 patch applied.

When passing through physical devices, my hypervisor debugging is being
triggered, but the actions of XEN_DOMCTL_irq_permission appear to be
correct, given sensible input from the Xapi toolstack.  When passing
through an SRIOV virtual function, no hypervisor debugging is being
triggered.

At a preliminary guess, I would say that XM looks to be doing something
stupid which it used to get away with, but no longer does given the
changes in the hypervisor.

I suspect that it will be hard to progress this issue until Gordan
applies my debugging patch and gets back with the results.

~Andrew

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: xsa46-4.2.patch breaks PCI passthrough?
  2013-05-01 11:28   ` Andrew Cooper
@ 2013-05-02  8:49     ` Jan Beulich
  2013-05-02 10:43       ` Ian Campbell
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2013-05-02  8:49 UTC (permalink / raw)
  To: Andrew Cooper, Ian Campbell, Steven Haigh, Ian Jackson; +Cc: xen-devel

>>> On 01.05.13 at 13:28, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 01/05/13 12:09, Andrew Cooper wrote:
>> On 01/05/13 06:29, Steven Haigh wrote:
>>> Hi all,
>>>
>>> I've had a report lodged against my packages that the patch provided for 
>>> XSA46 against Xen 4.2.1 causes PCI passthru to break.
>>>
>>> It seems that 4.2.1 *without* the XSA46 patch works perfectly. 4.2.2 
>>> does not work.
>>>
>>> I added this patch in xen-4.2.1-6 of my RPMs (http://xen.crc.id.au) and 
>>> the reporter has built the same SRPM with xsa46 patch removed and PCI 
>>> passthrough works as intended.
>>>
>>> Reapplying the XSA46 patch causes it to break again.
>>>
>>> The bug report and logs can be found here:
>>> 	http://xen.crc.id.au/bugs/view.php?id=5 
>>>
>>> Has anyone come across this?
>>>
>> XSA-46 was to do with PCI passthrough of PV domains, and in particular
>> changing some of the rules regarding interrupts.

This misled me - I somehow concluded that the problems
here are being observed with PV domains, but considering the
second report we got as well as looking through the log files I'm
now rather guessing that the problem is (only) with HVM domains.
That in turn would match up with the code in pciif.py:

        if not self.vm.info.is_hvm() and dev.irq:
            rc = xc.physdev_map_pirq(domid = fe_domid,
                                   index = dev.irq,
                                   pirq  = dev.irq)
            if rc < 0:
                raise VmError(('pci: failed to map irq on device '+
                            '%s - errno=%d')%(dev.name,rc))
        if dev.irq>0:
            log.debug('pci: enabling irq %d'%dev.irq)
            rc = xc.domain_irq_permission(domid =  fe_domid, pirq = dev.irq,
                    allow_access = True)
            if rc<0:
                raise VmError(('pci: failed to configure irq on device '+
                            '%s - errno=%d')%(dev.name,rc))

i.e. the first portion of the setup is only being done for PV
guests. I have no idea why this is so (irqif.py doesn't special
case the guest kind, nor does libxl). Quite likely dropping that
check would be sufficient, but of course that should be
confirmed by someone knowing that code (and ideally also
knowing why this was being special cased in the first place) -
Ian, Ian?
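
To make the ordering issue concrete, here is a minimal Python sketch of the behaviour described above. It is a toy model, not Xen or xend code: the `ToyHypervisor` class, its method names, and the identity pirq-to-irq mapping are assumptions made for illustration. The rule that a permission request for an unmapped pirq fails with EINVAL is likewise an assumption, based on the EINVAL failure reported in this thread.

```python
import errno


class ToyHypervisor:
    """Toy stand-in for the hypervisor side; not real Xen code."""

    def __init__(self):
        self.pirq_to_irq = {}   # per-domain pirq -> irq mappings
        self.permitted = set()  # irqs the domain has been granted

    def map_pirq(self, pirq, irq):
        # Record a mapping, as a physdev_map_pirq-style call would.
        self.pirq_to_irq[pirq] = irq
        return 0

    def irq_permission(self, pirq):
        # Assumed rule: the pirq must already be mapped to an irq.
        irq = self.pirq_to_irq.get(pirq, 0)
        if irq <= 0:
            return -errno.EINVAL
        self.permitted.add(irq)
        return 0


def setup_irq(hv, is_hvm, irq):
    """Mirror the two-step structure of the pciif.py excerpt."""
    if not is_hvm and irq:
        rc = hv.map_pirq(pirq=irq, irq=irq)
        if rc < 0:
            return rc
    if irq > 0:
        return hv.irq_permission(pirq=irq)
    return 0


pv_rc = setup_irq(ToyHypervisor(), is_hvm=False, irq=34)
hvm_rc = setup_irq(ToyHypervisor(), is_hvm=True, irq=34)
print(pv_rc, hvm_rc)
```

In this model the PV path succeeds because the mapping step runs first, while the HVM path skips it and the permission step is rejected. Whether the real hypervisor behaves this way for every pirq is exactly what the thread is trying to establish.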

(Oddly enough, the first check effectively does "dev.irq != 0" while
the second uses "dev.irq > 0" - I wouldn't expect negative
values to ever appear here, but it's another hint at there
being unnecessary inconsistencies here.)
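
The two guards only disagree for negative values, which a short Python snippet can illustrate. The sample values below are arbitrary and chosen purely for illustration; they are not taken from any log.

```python
# Compare the two guards from the excerpt: truthiness vs. "> 0".
SAMPLES = (0, 16, -1)

for irq in SAMPLES:
    truthy = bool(irq)    # what "and dev.irq" tests
    positive = irq > 0    # what "dev.irq>0" tests
    print(irq, truthy, positive)

# Values for which the two guards give different answers.
differs = [irq for irq in SAMPLES if bool(irq) != (irq > 0)]
```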

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: xsa46-4.2.patch breaks PCI passthrough?
  2013-05-02  8:49     ` Jan Beulich
@ 2013-05-02 10:43       ` Ian Campbell
  2013-05-02 11:54         ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Ian Campbell @ 2013-05-02 10:43 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Steven Haigh, Ian Jackson, xen-devel

On Thu, 2013-05-02 at 09:49 +0100, Jan Beulich wrote:
> >>> On 01.05.13 at 13:28, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > On 01/05/13 12:09, Andrew Cooper wrote:
> >> On 01/05/13 06:29, Steven Haigh wrote:
> >>> Hi all,
> >>>
> >>> I've had a report lodged against my packages that the patch provided for 
> >>> XSA46 against Xen 4.2.1 causes PCI passthru to break.
> >>>
> >>> It seems that 4.2.1 *without* the XSA46 patch works perfectly. 4.2.2 
> >>> does not work.
> >>>
> >>> I added this patch in xen-4.2.1-6 of my RPMs (http://xen.crc.id.au) and 
> >>> the reporter has built the same SRPM with xsa46 patch removed and PCI 
> >>> passthrough works as intended.
> >>>
> >>> Reapplying the XSA46 patch causes it to break again.
> >>>
> >>> The bug report and logs can be found here:
> >>> 	http://xen.crc.id.au/bugs/view.php?id=5 
> >>>
> >>> Has anyone come across this?
> >>>
> >> XSA-46 was to do with PCI passthrough of PV domains, and in particular
> >> changing some of the rules regarding interrupts.
> 
> This was misguiding me - I somehow concluded that the problems
> here are being observed with PV domains, but considering the
> second report we got as well as looking through the log files I'm
> now rather guessing that the problem is (only) with HVM domains.
> That in turn would match up with the code in pciif.py:
> 
>         if not self.vm.info.is_hvm() and dev.irq:
>             rc = xc.physdev_map_pirq(domid = fe_domid,
>                                    index = dev.irq,
>                                    pirq  = dev.irq)
>             if rc < 0:
>                 raise VmError(('pci: failed to map irq on device '+
>                             '%s - errno=%d')%(dev.name,rc))
>         if dev.irq>0:
>             log.debug('pci: enabling irq %d'%dev.irq)
>             rc = xc.domain_irq_permission(domid =  fe_domid, pirq = dev.irq,
>                     allow_access = True)
>             if rc<0:
>                 raise VmError(('pci: failed to configure irq on device '+
>                             '%s - errno=%d')%(dev.name,rc))
> 
> i.e. the first portion of the setup is only being done for PV
> guests. I have no idea why this is so (irqif.py doesn't special
> case the guest kind, nor does libxl). Quite likely dropping that
> check would be sufficient, but of course that should be
> confirmed by someone knowing that code (and ideally also
> knowing why this was being special cased in the first place) -
> Ian, Ian?

If you are asking me why xend behaves this way then I have no clue.
Finding someone who does is probably a big ask, unless the changelog
offers any clues. The commit in question seems to be:

        commit 345fbe6cb410fb43c7b269a54d1c60e1e025f393
        Author: Keir Fraser <keir.fraser@citrix.com>
        Date:   Mon Sep 7 08:38:39 2009 +0100
        
            xend: passthrough: fix physdev_map_pirq invocation
            
            For those devices not having INTx (like VFs), avoid calling map_pirq,
            otherwise the guest cannot be started successfully.
            
            Also avoid calling this hypercall for hvm guest, this is done in the
            device model.
            
            Signed-off-by: Qing He <qing.he@intel.com>

Seems like "For those devices" is the "and dev.irq" bit and the "Also
avoid" is the "is_hvm()" bit. I have no idea about the validity of any
of that reasoning though...

Ian.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: xsa46-4.2.patch breaks PCI passthrough?
  2013-05-02 10:43       ` Ian Campbell
@ 2013-05-02 11:54         ` Jan Beulich
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2013-05-02 11:54 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Andrew Cooper, Steven Haigh, Ian Jackson, xen-devel

>>> On 02.05.13 at 12:43, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Thu, 2013-05-02 at 09:49 +0100, Jan Beulich wrote:
>> >>> On 01.05.13 at 13:28, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> >> XSA-46 was to do with PCI passthrough of PV domains, and in particular
>> >> changing some of the rules regarding interrupts.
>> 
>> This was misguiding me - I somehow concluded that the problems
>> here are being observed with PV domains, but considering the
>> second report we got as well as looking through the log files I'm
>> now rather guessing that the problem is (only) with HVM domains.
>> That in turn would match up with the code in pciif.py:
>> 
>>         if not self.vm.info.is_hvm() and dev.irq:
>>             rc = xc.physdev_map_pirq(domid = fe_domid,
>>                                    index = dev.irq,
>>                                    pirq  = dev.irq)
>>             if rc < 0:
>>                 raise VmError(('pci: failed to map irq on device '+
>>                             '%s - errno=%d')%(dev.name,rc))
>>         if dev.irq>0:
>>             log.debug('pci: enabling irq %d'%dev.irq)
>>             rc = xc.domain_irq_permission(domid =  fe_domid, pirq = dev.irq,
>>                     allow_access = True)
>>             if rc<0:
>>                 raise VmError(('pci: failed to configure irq on device '+
>>                             '%s - errno=%d')%(dev.name,rc))
>> 
>> i.e. the first portion of the setup is only being done for PV
>> guests. I have no idea why this is so (irqif.py doesn't special
>> case the guest kind, nor does libxl). Quite likely dropping that
>> check would be sufficient, but of course that should be
>> confirmed by someone knowing that code (and ideally also
>> knowing why this was being special cased in the first place) -
>> Ian, Ian?
> 
> If you are asking me why xend behaves this way then I have no clue.
> Finding someone who does is probably a big ask, unless the changelog
> offers any clues, the commit in question seems to be:
> 
>         commit 345fbe6cb410fb43c7b269a54d1c60e1e025f393
>         Author: Keir Fraser <keir.fraser@citrix.com>
>         Date:   Mon Sep 7 08:38:39 2009 +0100
>         
>             xend: passthrough: fix physdev_map_pirq invocation
>             
>             For those devices not having INTx (like VFs), avoid calling map_pirq,
>             otherwise the guest cannot be started successfully.
>             
>             Also avoid calling this hypercall for hvm guest, this is done in the
>             device model.
>             
>             Signed-off-by: Qing He <qing.he@intel.com>
> 
> Seems like "For those devices" is the "and dev.irq" bit and the "Also
> avoid" is the "is_hvm()" bit. I have no idea about the validity of any
> of that reasoning though...

I think I agree with this interpretation, and on that basis I just
went through the involved hypervisor side code path - afaict
there should be no problem with this being done in xend and
then a second time in the device model. Therefore I think we
ought to see whether the suggested adjustment actually works
for the reporters of the problem, and just go with it if so.

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: xsa46-4.2.patch breaks PCI passthrough?
  2013-05-01 16:07     ` Andrew Cooper
@ 2013-05-03 22:15       ` Steven Haigh
  2013-05-04 17:23         ` Andrew Cooper
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Haigh @ 2013-05-03 22:15 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: George Dunlap, xen-devel

On 05/02/2013 02:07 AM, Andrew Cooper wrote:
> On 01/05/13 16:26, Steven Haigh wrote:
>> On 2/05/2013 1:18 AM, George Dunlap wrote:
>>> On Wed, May 1, 2013 at 6:29 AM, Steven Haigh <netwiz@crc.id.au> wrote:
>>>> Hi all,
>>>>
>>>> I've had a report lodged against my packages that the patch provided for
>>>> XSA46 against Xen 4.2.1 causes PCI passthru to break.
>>>>
>>>> It seems that 4.2.1 *without* the XSA46 patch works perfectly. 4.2.2 does
>>>> not work.
>>> Have you tried this with xen-unstable tip?  That would be a blocker
>>> bug for the 4.3 release.
>> Hi George,
>>
>> It hasn't been tried it against anything other than 4.2.1 & 4.2.2 as
>> yet. As I'm not the end user with the problem here, I need to wait for
>> feedback.
>>
>> I have passed the patch provided by Andrew to the bug author - when I've
>> got feedback on this I'll be able to provide more information. I think
>> when we've got a root cause for this then it should be simple to verify
>> it on 4.3.
>>
> I have been investigating this issue on XenServer.
>
> On XenServer, PCIPassthrough to a SLES11SP1 guest is working correctly,
> even with the XSA-46 patch applied.
>
> When passing through physical devices, my hypervisor debugging is being
> triggered, but the actions of XEN_DOMCTL_irq_permission appear to be
> correct, given sensible input from the Xapi toolstack.  When passing
> through an SRIOV virtual function, no hypervisor debugging is being
> triggered.
>
> At a preliminary guess, I would say that XM looks to be doing something
> stupid which it used to be getting away with, but is not now given the
> changed in the hypervisor.
>
> I suspect that it will be hard to progress this issue until Gordan
> applied my debugging patch and gets back with the results.
>
Hi Andrew,

Got a reply from Gordon -> http://xen.crc.id.au/bugs/view.php?id=5

The 'xm dmesg' output shows the following:
(XEN) sh error: sh_remove_all_mappings(): can't find all mappings of mfn 4bc435: c=c000000000000002 t=7400000000000001
(XEN) **DBG perms { 16, 1 } = 0
(XEN) **DBG perms { 34, 1 } = -22
(XEN) sh error: sh_remove_all_mappings(): can't find all mappings of mfn 4be230: c=c000000000000002 t=7400000000000001
(XEN) **DBG perms { 16, 1 } = 0
(XEN) **DBG perms { 34, 1 } = -22

The full dmesg is attached to the bug report.
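
For readers decoding these lines: per the debugging patch, each `**DBG perms` entry prints the pirq, the allow flag, and the return value. The following Python sketch pulls those fields out and names the errno. It assumes the log format shown above and that the hypervisor's errno numbering matches the host's for the values involved. The helper name `parse_perms` and the embedded sample log are made up for illustration.

```python
import errno
import re

# Matches e.g. "**DBG perms { 34, 1 } = -22"
PERMS_RE = re.compile(r"\*\*DBG perms \{ (\d+), (-?\d+) \} = (-?\d+)")

LOG = """\
(XEN) **DBG perms { 16, 1 } = 0
(XEN) **DBG perms { 34, 1 } = -22
"""


def parse_perms(text):
    """Yield (pirq, allow, ret, errno_name) for each debug line."""
    for m in PERMS_RE.finditer(text):
        pirq, allow, ret = (int(g) for g in m.groups())
        name = errno.errorcode.get(-ret, "?") if ret < 0 else "OK"
        yield pirq, allow, ret, name


results = list(parse_perms(LOG))
for row in results:
    print(row)
```

Under those assumptions, a return value of -22 corresponds to EINVAL, matching the error xend reported.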

--
Steven Haigh

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: xsa46-4.2.patch breaks PCI passthrough?
  2013-05-03 22:15       ` Steven Haigh
@ 2013-05-04 17:23         ` Andrew Cooper
  2013-05-05 10:53           ` Steven Haigh
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Cooper @ 2013-05-04 17:23 UTC (permalink / raw)
  To: Steven Haigh; +Cc: George Dunlap, xen-devel

[-- Attachment #1: Type: text/plain, Size: 3046 bytes --]

On 03/05/2013 23:15, Steven Haigh wrote:
> On 05/02/2013 02:07 AM, Andrew Cooper wrote:
>> On 01/05/13 16:26, Steven Haigh wrote:
>>> On 2/05/2013 1:18 AM, George Dunlap wrote:
>>>> On Wed, May 1, 2013 at 6:29 AM, Steven Haigh <netwiz@crc.id.au> wrote:
>>>>> Hi all,
>>>>>
>>>>> I've had a report lodged against my packages that the patch provided for
>>>>> XSA46 against Xen 4.2.1 causes PCI passthru to break.
>>>>>
>>>>> It seems that 4.2.1 *without* the XSA46 patch works perfectly. 4.2.2 does
>>>>> not work.
>>>> Have you tried this with xen-unstable tip?  That would be a blocker
>>>> bug for the 4.3 release.
>>> Hi George,
>>>
>>> It hasn't been tried it against anything other than 4.2.1 & 4.2.2 as
>>> yet. As I'm not the end user with the problem here, I need to wait for
>>> feedback.
>>>
>>> I have passed the patch provided by Andrew to the bug author - when I've
>>> got feedback on this I'll be able to provide more information. I think
>>> when we've got a root cause for this then it should be simple to verify
>>> it on 4.3.
>>>
>> I have been investigating this issue on XenServer.
>>
>> On XenServer, PCIPassthrough to a SLES11SP1 guest is working correctly,
>> even with the XSA-46 patch applied.
>>
>> When passing through physical devices, my hypervisor debugging is being
>> triggered, but the actions of XEN_DOMCTL_irq_permission appear to be
>> correct, given sensible input from the Xapi toolstack.  When passing
>> through an SRIOV virtual function, no hypervisor debugging is being
>> triggered.
>>
>> At a preliminary guess, I would say that XM looks to be doing something
>> stupid which it used to be getting away with, but is not now given the
>> changed in the hypervisor.
>>
>> I suspect that it will be hard to progress this issue until Gordan
>> applied my debugging patch and gets back with the results.
>>
> Hi Andrew,
>
> Got a reply from Gordon -> http://xen.crc.id.au/bugs/view.php?id=5
>
> The 'xm dmesg' output shows the following:
> (XEN) sh error: sh_remove_all_mappings(): can't find all mappings of mfn 
> 4bc435: c=c000000000000002 t=7400000000000001
> (XEN) **DBG perms { 16, 1 } = 0
> (XEN) **DBG perms { 34, 1 } = -22
> (XEN) sh error: sh_remove_all_mappings(): can't find all mappings of mfn 
> 4be230: c=c000000000000002 t=7400000000000001
> (XEN) **DBG perms { 16, 1 } = 0
> (XEN) **DBG perms { 34, 1 } = -22
>
> The full dmesg is attached to the bug report.
>
> --
> Steven Haigh

Unrelated to the PCI passthrough problem, those spurious sh errors are
fixed by
http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=0984c2b63a7c3e9dfa770b29dac51a5124aecaab

From Gordon's log, it appears pirq 34 is the one causing problems.

Can he please try the latest attached debugging patch which should
provide rather more information in the failure case.

Also, can he boot with "loglvl=all" on the Xen command line, and also
issue "xm debug-keys izq" before capturing xm dmesg.  The debug keys
should dump loads of information into the dmesg buffer to do with
interrupts etc.

Thanks,

~Andrew

[-- Attachment #2: XSA-46-xen-4.2-debug-v3.patch --]
[-- Type: text/plain, Size: 690 bytes --]

diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index b3bfb38..be30cf3 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -908,6 +908,16 @@ long do_domctl(XEN_GUEST_HANDLE(xen_domctl_t) u_domctl)
         else
             ret = pirq_deny_access(d, pirq);
 
+        printk("**DBG perms { %u, %d } = %ld\n", pirq, allow, ret);
+        if ( ret )
+        {
+            printk(" Domain %"PRId16", nr_pirqs %d\n",
+                   d->domain_id, d->nr_pirqs);
+            printk(" dom_pirq_to_irq(%d) = %d\n",
+                   pirq, domain_pirq_to_irq(d, pirq));
+            rangeset_domain_printk(d);
+        }
+
         rcu_unlock_domain(d);
     }
     break;


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: xsa46-4.2.patch breaks PCI passthrough?
  2013-05-04 17:23         ` Andrew Cooper
@ 2013-05-05 10:53           ` Steven Haigh
  2013-05-06  7:15             ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Haigh @ 2013-05-05 10:53 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: George Dunlap, xen-devel

On 05/05/2013 03:23 AM, Andrew Cooper wrote:
> On 03/05/2013 23:15, Steven Haigh wrote:
>> On 05/02/2013 02:07 AM, Andrew Cooper wrote:
>>> On 01/05/13 16:26, Steven Haigh wrote:
>>>> On 2/05/2013 1:18 AM, George Dunlap wrote:
>>>>> On Wed, May 1, 2013 at 6:29 AM, Steven Haigh <netwiz@crc.id.au> wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I've had a report lodged against my packages that the patch provided for
>>>>>> XSA46 against Xen 4.2.1 causes PCI passthru to break.
>>>>>>
>>>>>> It seems that 4.2.1 *without* the XSA46 patch works perfectly. 4.2.2 does
>>>>>> not work.
>>>>> Have you tried this with xen-unstable tip?  That would be a blocker
>>>>> bug for the 4.3 release.
>>>> Hi George,
>>>>
>>>> It hasn't been tried it against anything other than 4.2.1 & 4.2.2 as
>>>> yet. As I'm not the end user with the problem here, I need to wait for
>>>> feedback.
>>>>
>>>> I have passed the patch provided by Andrew to the bug author - when I've
>>>> got feedback on this I'll be able to provide more information. I think
>>>> when we've got a root cause for this then it should be simple to verify
>>>> it on 4.3.
>>>>
>>> I have been investigating this issue on XenServer.
>>>
>>> On XenServer, PCIPassthrough to a SLES11SP1 guest is working correctly,
>>> even with the XSA-46 patch applied.
>>>
>>> When passing through physical devices, my hypervisor debugging is being
>>> triggered, but the actions of XEN_DOMCTL_irq_permission appear to be
>>> correct, given sensible input from the Xapi toolstack.  When passing
>>> through an SRIOV virtual function, no hypervisor debugging is being
>>> triggered.
>>>
>>> At a preliminary guess, I would say that XM looks to be doing something
>>> stupid which it used to be getting away with, but is not now given the
>>> changed in the hypervisor.
>>>
>>> I suspect that it will be hard to progress this issue until Gordan
>>> applied my debugging patch and gets back with the results.
>>>
>> Hi Andrew,
>>
>> Got a reply from Gordon -> http://xen.crc.id.au/bugs/view.php?id=5
>>
>> The 'xm dmesg' output shows the following:
>> (XEN) sh error: sh_remove_all_mappings(): can't find all mappings of mfn
>> 4bc435: c=c000000000000002 t=7400000000000001
>> (XEN) **DBG perms { 16, 1 } = 0
>> (XEN) **DBG perms { 34, 1 } = -22
>> (XEN) sh error: sh_remove_all_mappings(): can't find all mappings of mfn
>> 4be230: c=c000000000000002 t=7400000000000001
>> (XEN) **DBG perms { 16, 1 } = 0
>> (XEN) **DBG perms { 34, 1 } = -22
>>
>> The full dmesg is attached to the bug report.
>>
>> --
>> Steven Haigh
> Unrelated to the PCI passthrough problem, those spurious sh errors are
> fixed by
> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=0984c2b63a7c3e9dfa770b29dac51a5124aecaab
>
>  From Gordon's log, it appears pirq 34 is the one causing problems.
>
> Can he please try the latest attached debugging patch which should
> provide rather more information in the failure case.
>
> Also, can he boot with "loglvl=all" on the Xen command line, and also
> issue "xm debug-keys izq" before capturing xm dmesg.  The debug keys
> should dump loads on information into the dmesg buffer to do with
> interrupts etc.

Debug log is now attached to the bug report. It's a little large to 
attach here.

http://xen.crc.id.au/bugs/file_download.php?file_id=13&type=bug

--
Steven Haigh

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: xsa46-4.2.patch breaks PCI passthrough?
  2013-05-05 10:53           ` Steven Haigh
@ 2013-05-06  7:15             ` Jan Beulich
  2013-05-08 10:18               ` Steven Haigh
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2013-05-06  7:15 UTC (permalink / raw)
  To: Andrew Cooper, Steven Haigh; +Cc: George Dunlap, xen-devel

>>> On 05.05.13 at 12:53, Steven Haigh <netwiz@crc.id.au> wrote:
> Debug log is now attached to the bug report. Its a little large to 
> attach here.
> 
> http://xen.crc.id.au/bugs/file_download.php?file_id=13&type=bug 

>(XEN) **DBG perms { 16, 1 } = 0

I'm surprised by this if the test was done without the xend
adjustment, whereas this

>(XEN) **DBG perms { 34, 1 } = -22
>(XEN)  Domain 2, nr_pirqs 80
>(XEN)  dom_pirq_to_irq(34) = 0

is expected without a prior physdev_map_pirq() invocation. I'm
meanwhile guessing that there might be a second place in xend
where under some condition that call is being issued - that could
also explain the -EEXIST observed with the xend adjustment in
place.

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: xsa46-4.2.patch breaks PCI passthrough?
  2013-05-06  7:15             ` Jan Beulich
@ 2013-05-08 10:18               ` Steven Haigh
  2013-05-08 11:45                 ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Haigh @ 2013-05-08 10:18 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, Andrew Cooper, xen-devel

On 6/05/2013 5:15 PM, Jan Beulich wrote:
>>>> On 05.05.13 at 12:53, Steven Haigh <netwiz@crc.id.au> wrote:
>> Debug log is now attached to the bug report. Its a little large to
>> attach here.
>>
>> http://xen.crc.id.au/bugs/file_download.php?file_id=13&type=bug
>
>> (XEN) **DBG perms { 16, 1 } = 0
>
> I'm surprised by this if the test was done without the xend
> adjustment, whereas this
>
>> (XEN) **DBG perms { 34, 1 } = -22
>> (XEN)  Domain 2, nr_pirqs 80
>> (XEN)  dom_pirq_to_irq(34) = 0
>
> is expected without a prior physdev_map_pirq() invocation. I'm
> meanwhile guessing that there might be a second place in xend
> where under some condition that call is being issued - that could
> also explain the -EEXIST observed with the xend adjustment in
> place.

I'll be the first to admit that this is beyond my knowledge in Xen at 
this low a level... Is there anything I can do to help debugging progress?

Is there any more information you need from the original bug reporter?

-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
Fax: (03) 8338 0299

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: xsa46-4.2.patch breaks PCI passthrough?
  2013-05-08 10:18               ` Steven Haigh
@ 2013-05-08 11:45                 ` Jan Beulich
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2013-05-08 11:45 UTC (permalink / raw)
  To: Steven Haigh; +Cc: George Dunlap, Andrew Cooper, xen-devel

>>> On 08.05.13 at 12:18, Steven Haigh <netwiz@crc.id.au> wrote:
> On 6/05/2013 5:15 PM, Jan Beulich wrote:
>>>>> On 05.05.13 at 12:53, Steven Haigh <netwiz@crc.id.au> wrote:
>>> Debug log is now attached to the bug report. Its a little large to
>>> attach here.
>>>
>>> http://xen.crc.id.au/bugs/file_download.php?file_id=13&type=bug 
>>
>>> (XEN) **DBG perms { 16, 1 } = 0
>>
>> I'm surprised by this if the test was done without the xend
>> adjustment, whereas this
>>
>>> (XEN) **DBG perms { 34, 1 } = -22
>>> (XEN)  Domain 2, nr_pirqs 80
>>> (XEN)  dom_pirq_to_irq(34) = 0
>>
>> is expected without a prior physdev_map_pirq() invocation. I'm
>> meanwhile guessing that there might be a second place in xend
>> where under some condition that call is being issued - that could
>> also explain the -EEXIST observed with the xend adjustment in
>> place.
> 
> I'll be the first to admit that this is beyond my knowledge in Xen at 
> this low a level... Is there anything I can do to help debugging progress?

See the other thread ("PCI passthrough problems after legacy
update of xen 4.1") about the same problem. Andreas had put
together a debugging patch that helped narrow it, but it's still
unclear where the conflicting hypercalls originate. Perhaps
continuing the discussion centrally in that other thread (which
has made better progress) would help keep all information
together.

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

Thread overview: 15+ messages
2013-05-01  5:29 xsa46-4.2.patch breaks PCI passthrough? Steven Haigh
2013-05-01 11:09 ` Andrew Cooper
2013-05-01 11:28   ` Andrew Cooper
2013-05-02  8:49     ` Jan Beulich
2013-05-02 10:43       ` Ian Campbell
2013-05-02 11:54         ` Jan Beulich
2013-05-01 15:18 ` George Dunlap
2013-05-01 15:26   ` Steven Haigh
2013-05-01 16:07     ` Andrew Cooper
2013-05-03 22:15       ` Steven Haigh
2013-05-04 17:23         ` Andrew Cooper
2013-05-05 10:53           ` Steven Haigh
2013-05-06  7:15             ` Jan Beulich
2013-05-08 10:18               ` Steven Haigh
2013-05-08 11:45                 ` Jan Beulich
