* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
@ 2012-08-21  2:41 Ren, Yongjie
  2012-08-21 14:23 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 20+ messages in thread
From: Ren, Yongjie @ 2012-08-21  2:41 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, Tobias Geiger; +Cc: xen-devel

> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org
> [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Konrad Rzeszutek
> Wilk
> Sent: Tuesday, August 21, 2012 7:30 AM
> To: Tobias Geiger
> Cc: xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Regression in kernel 3.5 as Dom0 regarding PCI
> Passthrough?!
> 
> On Mon, Aug 06, 2012 at 12:16:33PM -0400, Konrad Rzeszutek Wilk wrote:
> > On Wed, Jul 25, 2012 at 09:43:57AM -0400, Konrad Rzeszutek Wilk wrote:
> > > On Wed, Jul 25, 2012 at 02:30:00PM +0200, Tobias Geiger wrote:
> > > > Hi!
> > > >
> > > > I notice a serious regression with 3.5 as Dom0 kernel (3.4 was rock
> > > > stable):
> > > >
> > > > 1st: only the GPU PCI passthrough works; the PCI USB controller is
> > > > not recognized within the DomU (HVM Win7 64).
> > > > Dom0 cmdline is:
> > > > ro root=LABEL=dom0root xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7) security=apparmor noirqdebug nouveau.msi=1
> > > >
> > > > Only 08:00.0 and 08:00.1 get passed through without problems; all the
> > > > USB controller IDs are not passed through correctly and get an
> > > > exclamation mark in the Win7 Device Manager ("could not be
> > > > started").
> > >
> > > Ok, but they do get passed in though? As in, QEMU sees them.
> > > If you boot a Live Ubuntu/Fedora CD within the guest with the PCI
> > > passed in devices do you see them? Meaning lspci shows them?
> > >
> > >
> > > Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?
> > >
> > > >
> > > >
> > > > 2nd: After DomU shutdown, Dom0 panics (100% reproducible) - sorry
> > > > that I have no full stacktrace, all I have is a "screenshot" which I
> > > > uploaded here:
> > > >
> > > > http://imageshack.us/photo/my-images/52/img20120724235921.jpg/
> > >
> > > Ugh, that looks like somebody removed a large chunk of a pagetable.
> > >
> > > Hmm. Are you using dom0_mem=max parameter? If not, can you try
> > > that and also disable ballooning in the xm/xl config file pls?
> > >
> > > >
> > > >
> > > > With 3.4 both issues were not there - everything worked perfectly.
> > > > Tell me which debugging info you need, i may be able to re-install
> > > > my netconsole to get the full stacktrace (but i had not much luck
> > > > with netconsole regarding kernel panics - rarely this info gets sent
> > > > before the "panic"...)
> >
> > So I am able to reproduce this with a Windows 7 with an ATI 4870 and
> > an Intel 82574L NIC. The video card still works, but the NIC stopped
> > working. Same version of hypervisor/toolstack/etc, only change is the
> > kernel (v3.4.6->v3.5).
> >
> > Time to get my hands greasy with this..
> 
> And it's due to a patch I added in v3.4
> (cd9db80e5257682a7f7ab245a2459648b3c8d268)
> - which did not work properly in v3.4, but with v3.5 got it working
> (977f857ca566a1e68045fcbb7cfc9c4acb077cf0), which causes v3.5 to not
> work anymore.
> 
> Anyhow, for right now just revert
> cd9db80e5257682a7f7ab245a2459648b3c8d268
> and it should work for you.
> 
Also, our team reported a VT-d bug 2 months ago:
http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1824
We found that "3bb07f1b73ea6313b843807063e183e168c9182a" is the bad commit in the Linux tree.
Linux 3.4.7 works fine, but Linux 3.5 has this issue.
It seems Tobias has the same issue as the one in that bug,
but we did not hit the Dom0 panic when shutting down the DomU.
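
For reference, the xen-pciback.hide= option on the Dom0 command line quoted above takes a run of parenthesized bus:device.function entries. Below is a minimal Python sketch that splits such a string into its parts. The helper name parse_pciback_hide is made up for illustration, and the sketch assumes only the short bb:dd.f form used in this thread (no PCI domain prefix).

```python
import re

# Matches one "(bb:dd.f)" entry: hex bus, hex device, function digit 0-7.
# Assumption: no "dddd:" PCI domain prefix, as in the cmdline quoted above.
_BDF = re.compile(r"\(([0-9a-fA-F]{1,2}):([0-9a-fA-F]{1,2})\.([0-7])\)")


def parse_pciback_hide(cmdline):
    """Return (bus, device, function) tuples from xen-pciback.hide=..."""
    for token in cmdline.split():
        if token.startswith("xen-pciback.hide="):
            value = token[len("xen-pciback.hide="):]
            return [(int(b, 16), int(d, 16), int(f))
                    for b, d, f in _BDF.findall(value)]
    return []


cmdline = ("ro root=LABEL=dom0root "
           "xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7) "
           "security=apparmor noirqdebug nouveau.msi=1")

# Print each hidden device back in bb:dd.f form.
for bus, dev, fn in parse_pciback_hide(cmdline):
    print("%02x:%02x.%d" % (bus, dev, fn))
```

Comparing this list with what lspci shows inside the guest is one way to see which of the hidden devices actually arrived there.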


* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-08-21  2:41 Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?! Ren, Yongjie
@ 2012-08-21 14:23 ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-08-21 14:23 UTC (permalink / raw)
  To: Ren, Yongjie; +Cc: Tobias Geiger, xen-devel, Konrad Rzeszutek Wilk

On Tue, Aug 21, 2012 at 02:41:36AM +0000, Ren, Yongjie wrote:
> > -----Original Message-----
> > From: xen-devel-bounces@lists.xen.org
> > [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Konrad Rzeszutek
> > Wilk
> > Sent: Tuesday, August 21, 2012 7:30 AM
> > To: Tobias Geiger
> > Cc: xen-devel@lists.xen.org
> > Subject: Re: [Xen-devel] Regression in kernel 3.5 as Dom0 regarding PCI
> > Passthrough?!
> > 
> > On Mon, Aug 06, 2012 at 12:16:33PM -0400, Konrad Rzeszutek Wilk wrote:
> > > On Wed, Jul 25, 2012 at 09:43:57AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > On Wed, Jul 25, 2012 at 02:30:00PM +0200, Tobias Geiger wrote:
> > > > > Hi!
> > > > >
> > > > > I notice a serious regression with 3.5 as Dom0 kernel (3.4 was rock
> > > > > stable):
> > > > >
> > > > > 1st: only the GPU PCI passthrough works; the PCI USB controller is
> > > > > not recognized within the DomU (HVM Win7 64).
> > > > > Dom0 cmdline is:
> > > > > ro root=LABEL=dom0root xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7) security=apparmor noirqdebug nouveau.msi=1
> > > > >
> > > > > Only 08:00.0 and 08:00.1 get passed through without problems; all the
> > > > > USB controller IDs are not passed through correctly and get an
> > > > > exclamation mark in the Win7 Device Manager ("could not be
> > > > > started").
> > > >
> > > > Ok, but they do get passed in though? As in, QEMU sees them.
> > > > If you boot a Live Ubuntu/Fedora CD within the guest with the PCI
> > > > passed in devices do you see them? Meaning lspci shows them?
> > > >
> > > >
> > > > Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?
> > > >
> > > > >
> > > > >
> > > > > 2nd: After DomU shutdown, Dom0 panics (100% reproducible) - sorry
> > > > > that I have no full stacktrace, all I have is a "screenshot" which I
> > > > > uploaded here:
> > > > >
> > > > > http://imageshack.us/photo/my-images/52/img20120724235921.jpg/
> > > >
> > > > Ugh, that looks like somebody removed a large chunk of a pagetable.
> > > >
> > > > Hmm. Are you using dom0_mem=max parameter? If not, can you try
> > > > that and also disable ballooning in the xm/xl config file pls?
> > > >
> > > > >
> > > > >
> > > > > With 3.4 both issues were not there - everything worked perfectly.
> > > > > Tell me which debugging info you need, i may be able to re-install
> > > > > my netconsole to get the full stacktrace (but i had not much luck
> > > > > with netconsole regarding kernel panics - rarely this info gets sent
> > > > > before the "panic"...)
> > >
> > > So I am able to reproduce this with a Windows 7 with an ATI 4870 and
> > > an Intel 82574L NIC. The video card still works, but the NIC stopped
> > > working. Same version of hypervisor/toolstack/etc, only change is the
> > > kernel (v3.4.6->v3.5).
> > >
> > > Time to get my hands greasy with this..
> > 
> > And it's due to a patch I added in v3.4
> > (cd9db80e5257682a7f7ab245a2459648b3c8d268)
> > - which did not work properly in v3.4, but with v3.5 got it working
> > (977f857ca566a1e68045fcbb7cfc9c4acb077cf0), which causes v3.5 to not
> > work anymore.
> > 
> > Anyhow, for right now just revert
> > cd9db80e5257682a7f7ab245a2459648b3c8d268
> > and it should work for you.
> > 
> Also, our team reported a VT-d bug 2 months ago:
> http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1824
> We found that "3bb07f1b73ea6313b843807063e183e168c9182a" is the bad commit in the Linux tree.
> Linux 3.4.7 works fine, but Linux 3.5 has this issue.

Oh, I wish I had seen that earlier.

> It seems Tobias has the same issue as the one in that bug,
> but we did not hit the Dom0 panic when shutting down the DomU.

Neither do I - not sure why he sees that.


* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-09-07  2:08     ` Ren, Yongjie
@ 2012-09-07 10:37       ` Tobias Geiger
  0 siblings, 0 replies; 20+ messages in thread
From: Tobias Geiger @ 2012-09-07 10:37 UTC (permalink / raw)
  To: Ren, Yongjie; +Cc: Konrad Rzeszutek Wilk, xen-devel, Konrad Rzeszutek Wilk

On 07.09.2012 04:08, Ren, Yongjie wrote:
>> -----Original Message-----
>> From: Tobias Geiger [mailto:tobias.geiger@vido.info]
>> Sent: Thursday, September 06, 2012 7:28 PM
>> To: Konrad Rzeszutek Wilk
>> Cc: Ren, Yongjie; Konrad Rzeszutek Wilk; xen-devel@lists.xen.org
>> Subject: Re: [Xen-devel] Regression in kernel 3.5 as Dom0 regarding 
>> PCI
>> Passthrough?!
>>
>> Hello Konrad,
>>
>> the patch helps regarding the USB-PCIController-Passthrough - this
>> works now in DomU.
>>
> Hi Tobias,
> In my testing, this patch does not work for NIC pass-through.
> Could you try a NIC pass-through?

Hi!

Unfortunately not - I have no physical access to the machine until the
middle of next week, and as soon as I try to pass through the only NIC
in this machine my remote access will die... :(

Perhaps someone else can test NIC passthrough? If not, I'll try it ASAP
next week!

Greetings
Tobias

>
>> but i still get the Dom0 crash when shutting down DomU:
>>
>> Sep  6 13:26:19 pc kernel: [  361.011514] xen-blkback:backend/vbd/1/832: prepare for reconnect
>> Sep  6 13:26:20 pc kernel: [  361.876395] xen-blkback:backend/vbd/1/768: prepare for reconnect
>> Sep  6 13:26:21 pc kernel: [  362.682152] br0: port 3(vif1.0) entered disabled state
>> Sep  6 13:26:21 pc kernel: [  362.682267] br0: port 3(vif1.0) entered disabled state
>> Sep  6 13:26:24 pc kernel: [  365.541386] ------------[ cut here ]------------
>> Sep  6 13:26:24 pc kernel: [  365.541411] invalid opcode: 0000 [#1] PREEMPT SMP
>> Sep  6 13:26:24 pc kernel: [  365.541423] CPU 2
>> Sep  6 13:26:24 pc kernel: [  365.541427] Modules linked in: uvcvideo snd_usb_audio snd_usbmidi_lib snd_seq_midi snd_hwdep snd_rawmidi videobuf2_vmalloc videobuf2_memops videobuf2_core videodev gpio_ich joydev hid_generic [last unloaded: scsi_wait_scan]
>> Sep  6 13:26:24 pc kernel: [  365.541474]
>> Sep  6 13:26:24 pc kernel: [  365.541477] Pid: 1208, comm: kworker/2:1 Not tainted 3.5.0 #3                  /DX58SO
>> Sep  6 13:26:24 pc kernel: [  365.541491] RIP: e030:[<ffffffff81447f95>]  [<ffffffff81447f95>] balloon_process+0x385/0x3a0
>> Sep  6 13:26:24 pc kernel: [  365.541507] RSP: e02b:ffff88012e7abdc0  EFLAGS: 00010213
>> Sep  6 13:26:24 pc kernel: [  365.541515] RAX: 0000000220be7000 RBX: 0000000000000000 RCX: 0000000000000008
>> Sep  6 13:26:24 pc kernel: [  365.541523] RDX: ffff88010d99a000 RSI: 00000000000001df RDI: 000000000020efdf
>> Sep  6 13:26:24 pc kernel: [  365.541532] RBP: ffff88012e7abe20 R08: ffff88014064e140 R09: 00000000fffffffe
>> Sep  6 13:26:24 pc kernel: [  365.541540] R10: 0000000000000001 R11: 0000000000000000 R12: 0000160000000000
>> Sep  6 13:26:24 pc kernel: [  365.541548] R13: 0000000000000001 R14: 000000000020efdf R15: ffffea00083bf7c0
>> Sep  6 13:26:24 pc kernel: [  365.541561] FS:  00007f79d32ce700(0000) GS:ffff880140640000(0000) knlGS:0000000000000000
>> Sep  6 13:26:24 pc kernel: [  365.541571] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>> Sep  6 13:26:24 pc kernel: [  365.541578] CR2: 00007f79d2d6ce02 CR3: 0000000001e0c000 CR4: 0000000000002660
>> Sep  6 13:26:24 pc kernel: [  365.541587] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> Sep  6 13:26:24 pc kernel: [  365.541596] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Sep  6 13:26:24 pc kernel: [  365.541604] Process kworker/2:1 (pid: 1208, threadinfo ffff88012e7aa000, task ffff88013101c440)
>> Sep  6 13:26:24 pc kernel: [  365.541613] Stack:
>> Sep  6 13:26:24 pc kernel: [  365.541618]  000000000006877b 0000000000000001 ffffffff8200ea80 0000000000000001
>> Sep  6 13:26:24 pc kernel: [  365.541649]  0000000000000000 0000000000007ff0 ffff88012e7abe00 ffff8801302eee00
>> Sep  6 13:26:24 pc kernel: [  365.541664]  ffff880140657000 ffff88014064e140 0000000000000000 ffffffff81e587c0
>> Sep  6 13:26:24 pc kernel: [  365.541679] Call Trace:
>> Sep  6 13:26:24 pc kernel: [  365.541688]  [<ffffffff8106753b>] process_one_work+0x12b/0x450
>> Sep  6 13:26:24 pc kernel: [  365.541697]  [<ffffffff81447c10>] ? decrease_reservation+0x320/0x320
>> Sep  6 13:26:24 pc kernel: [  365.541706]  [<ffffffff810688be>] worker_thread+0x12e/0x2d0
>> Sep  6 13:26:24 pc kernel: [  365.541715]  [<ffffffff81068790>] ? manage_workers.isra.26+0x1f0/0x1f0
>> Sep  6 13:26:24 pc kernel: [  365.541725]  [<ffffffff8106db7e>] kthread+0x8e/0xa0
>> Sep  6 13:26:24 pc kernel: [  365.541735]  [<ffffffff8184e3e4>] kernel_thread_helper+0x4/0x10
>> Sep  6 13:26:24 pc kernel: [  365.541745]  [<ffffffff8184c87c>] ? retint_restore_args+0x5/0x6
>> Sep  6 13:26:24 pc kernel: [  365.541754]  [<ffffffff8184e3e0>] ? gs_change+0x13/0x13
>> Sep  6 13:26:24 pc kernel: [  365.541760] Code: 01 15 f0 6a bc 00 48 29 d0 48 89 05 ee 6a bc 00 e9 31 fd ff ff 0f 0b 0f 0b 4c 89 f7 e8 85 34 bc ff 48 83 f8 ff 0f 84 2b fe ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 48 83 c1 01 e9 c2 fd ff ff 0f
>> Sep  6 13:26:24 pc kernel: [  365.541898]  RSP <ffff88012e7abdc0>
>> Sep  6 13:26:24 pc kernel: [  365.565054] ---[ end trace 25eb9ce0cc61c3a1 ]---
>> Sep  6 13:26:24 pc kernel: [  365.565101] PGD 1e0e067 PUD 1e0f067 PMD 0
>> Sep  6 13:26:24 pc kernel: [  365.565108] Oops: 0000 [#2] PREEMPT SMP
>> Sep  6 13:26:24 pc kernel: [  365.565115] CPU 2
>> Sep  6 13:26:24 pc kernel: [  365.565118] Modules linked in: uvcvideo snd_usb_audio snd_usbmidi_lib snd_seq_midi snd_hwdep snd_rawmidi videobuf2_vmalloc videobuf2_memops videobuf2_core videodev gpio_ich joydev hid_generic [last unloaded: scsi_wait_scan]
>> Sep  6 13:26:24 pc kernel: [  365.565153]
>> Sep  6 13:26:24 pc kernel: [  365.565156] Pid: 1208, comm: kworker/2:1 Tainted: G      D      3.5.0 #3                  /DX58SO
>> Sep  6 13:26:24 pc kernel: [  365.565176] RIP: e030:[<ffffffff8106e08c>]  [<ffffffff8106e08c>] kthread_data+0xc/0x20
>> Sep  6 13:26:24 pc kernel: [  365.565194] RSP: e02b:ffff88012e7aba90  EFLAGS: 00010092
>> Sep  6 13:26:24 pc kernel: [  365.565205] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000002
>> Sep  6 13:26:24 pc kernel: [  365.565219] RDX: ffffffff81fcba40 RSI: 0000000000000002 RDI: ffff88013101c440
>> Sep  6 13:26:24 pc kernel: [  365.565233] RBP: ffff88012e7abaa8 R08: 0000000000989680 R09: ffffffff81fcba40
>> Sep  6 13:26:24 pc kernel: [  365.565248] R10: ffffffff813b0c00 R11: 0000000000000000 R12: ffff8801406536c0
>> Sep  6 13:26:24 pc kernel: [  365.565262] R13: 0000000000000002 R14: ffff88013101c430 R15: ffff88013101c440
>> Sep  6 13:26:24 pc kernel: [  365.565280] FS:  00007f79d32ce700(0000) GS:ffff880140640000(0000) knlGS:0000000000000000
>> Sep  6 13:26:24 pc kernel: [  365.565293] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>> Sep  6 13:26:24 pc kernel: [  365.565303] CR2: fffffffffffffff8 CR3: 0000000001e0c000 CR4: 0000000000002660
>> Sep  6 13:26:24 pc kernel: [  365.565318] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> Sep  6 13:26:24 pc kernel: [  365.565332] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Sep  6 13:26:24 pc kernel: [  365.565349] Process kworker/2:1 (pid: 1208, threadinfo ffff88012e7aa000, task ffff88013101c440)
>> Sep  6 13:26:24 pc kernel: [  365.565362] Stack:
>> Sep  6 13:26:24 pc kernel: [  365.565367]  ffffffff810698e0 ffff88012e7abaa8 ffff88013101c818 ffff88012e7abb18
>> Sep  6 13:26:24 pc kernel: [  365.565389]  ffffffff8184ae02 ffff88012e7abfd8 ffff88013101c440 ffff88012e7abfd8
>> Sep  6 13:26:24 pc kernel: [  365.565410]  ffff88012e7abfd8 ffff88012d8840c0 ffff88013101c440 ffff88013101ca30
>>
>>
>>
>> Perhaps this stacktrace helps...
>>
>> Thanks!
>>
>> On 05.09.2012 20:54, Konrad Rzeszutek Wilk wrote:
>> >> > > > And it's due to a patch I added in v3.4
>> >> > > > (cd9db80e5257682a7f7ab245a2459648b3c8d268)
>> >> > > > - which did not work properly in v3.4, but with v3.5 got it working
>> >> > > > (977f857ca566a1e68045fcbb7cfc9c4acb077cf0), which causes v3.5 to not
>> >> > > > work anymore.
>> >> > > >
>> >> > > > Anyhow, for right now just revert
>> >> > > > cd9db80e5257682a7f7ab245a2459648b3c8d268
>> >> > > > and it should work for you.
>> >> > > >
>> >> Confirmed, after reverting that commit, VT-d will work fine.
>> >> Will you fix this and push it to upstream Linux, Konrad?
>> >>
>> >> > > Also, our team reported a VT-d bug 2 months ago.
>> >> > > http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1824
>> >> >
>> >
>> > Can either one of you please test this patch, please:
>> >
>> >
>> > diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
>> > index 097e536..425bd0b 100644
>> > --- a/drivers/xen/xen-pciback/pci_stub.c
>> > +++ b/drivers/xen/xen-pciback/pci_stub.c
>> > @@ -4,6 +4,8 @@
>> >   * Ryan Wilson <hap9@epoch.ncsc.mil>
>> >   * Chris Bookholt <hap10@epoch.ncsc.mil>
>> >   */
>> > +#define DEBUG 1
>> > +
>> >  #include <linux/module.h>
>> >  #include <linux/init.h>
>> >  #include <linux/rwsem.h>
>> > @@ -97,13 +99,15 @@ static void pcistub_device_release(struct kref *kref)
>> >  	/* Call the reset function which does not take lock as this
>> >  	 * is called from "unbind" which takes a device_lock mutex.
>> >  	 */
>> > +	dev_dbg(&psdev->dev->dev, "FLR locked..\n");
>> >  	__pci_reset_function_locked(psdev->dev);
>> >  	if (pci_load_and_free_saved_state(psdev->dev,
>> >  					  &dev_data->pci_saved_state)) {
>> >  		dev_dbg(&psdev->dev->dev, "Could not reload PCI state\n");
>> > -	} else
>> > +	} else {
>> > +		dev_dbg(&psdev->dev->dev, "Reloading PCI state..\n");
>> >  		pci_restore_state(psdev->dev);
>> > -
>> > +	}
>> >  	/* Disable the device */
>> >  	xen_pcibk_reset_device(psdev->dev);
>> >
>> > @@ -353,16 +357,16 @@ static int __devinit pcistub_init_device(struct pci_dev *dev)
>> >  	if (err)
>> >  		goto config_release;
>> >
>> > -	dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n");
>> > -	__pci_reset_function_locked(dev);
>> > -
>> >  	/* We need the device active to save the state. */
>> >  	dev_dbg(&dev->dev, "save state of device\n");
>> >  	pci_save_state(dev);
>> >  	dev_data->pci_saved_state = pci_store_saved_state(dev);
>> >  	if (!dev_data->pci_saved_state)
>> >  		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
>> > -
>> > +	else {
>> > +		dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n");
>> > +		__pci_reset_function_locked(dev);
>> > +	}
>> >  	/* Now disable the device (this also ensures some private device
>> >  	 * data is setup before we export)
>> >  	 */
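
Reading the second hunk of the patch quoted above: the reset in pcistub_init_device() moves from before pci_save_state() to after it, and it only runs when the saved state could be stored. Below is a toy Python model of that ordering. FakeDevice and init_device are invented names, and nothing here touches real hardware or kernel APIs; it only records the order of the calls.

```python
class FakeDevice:
    """Stand-in for a PCI device; records the order of operations."""

    def __init__(self, can_store_state=True):
        self.can_store_state = can_store_state
        self.calls = []

    def save_state(self):
        self.calls.append("save_state")

    def store_saved_state(self):
        self.calls.append("store_saved_state")
        # None models the store failing (out of memory in the real driver).
        return {"config": "saved"} if self.can_store_state else None

    def reset_function(self):
        self.calls.append("reset_function")


def init_device(dev):
    """Model of the reordered hunk: save first, reset only if stored."""
    dev.save_state()
    saved = dev.store_saved_state()
    if saved is None:
        dev.calls.append("error: could not store state")
    else:
        dev.reset_function()
    return saved


ok = FakeDevice()
init_device(ok)
print(ok.calls)
```

With the old ordering the reset came first, so the state captured afterwards was that of an already reset device; the model above shows the state being captured while the device is still active.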


* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-09-06 11:28   ` Tobias Geiger
  2012-09-06 13:05     ` Konrad Rzeszutek Wilk
@ 2012-09-07  2:08     ` Ren, Yongjie
  2012-09-07 10:37       ` Tobias Geiger
  1 sibling, 1 reply; 20+ messages in thread
From: Ren, Yongjie @ 2012-09-07  2:08 UTC (permalink / raw)
  To: Tobias Geiger, Konrad Rzeszutek Wilk; +Cc: Konrad Rzeszutek Wilk, xen-devel

> -----Original Message-----
> From: Tobias Geiger [mailto:tobias.geiger@vido.info]
> Sent: Thursday, September 06, 2012 7:28 PM
> To: Konrad Rzeszutek Wilk
> Cc: Ren, Yongjie; Konrad Rzeszutek Wilk; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Regression in kernel 3.5 as Dom0 regarding PCI
> Passthrough?!
> 
> Hello Konrad,
> 
> the patch helps regarding the USB-PCIController-Passthrough - this
> works now in DomU.
> 
Hi Tobias,
In my testing, this patch does not work for NIC pass-through.
Could you try a NIC pass-through?

> but i still get the Dom0 crash when shutting down DomU:
> 
> Sep  6 13:26:19 pc kernel: [  361.011514] xen-blkback:backend/vbd/1/832: prepare for reconnect
> Sep  6 13:26:20 pc kernel: [  361.876395] xen-blkback:backend/vbd/1/768: prepare for reconnect
> Sep  6 13:26:21 pc kernel: [  362.682152] br0: port 3(vif1.0) entered disabled state
> Sep  6 13:26:21 pc kernel: [  362.682267] br0: port 3(vif1.0) entered disabled state
> Sep  6 13:26:24 pc kernel: [  365.541386] ------------[ cut here ]------------
> Sep  6 13:26:24 pc kernel: [  365.541411] invalid opcode: 0000 [#1] PREEMPT SMP
> Sep  6 13:26:24 pc kernel: [  365.541423] CPU 2
> Sep  6 13:26:24 pc kernel: [  365.541427] Modules linked in: uvcvideo snd_usb_audio snd_usbmidi_lib snd_seq_midi snd_hwdep snd_rawmidi videobuf2_vmalloc videobuf2_memops videobuf2_core videodev gpio_ich joydev hid_generic [last unloaded: scsi_wait_scan]
> Sep  6 13:26:24 pc kernel: [  365.541474]
> Sep  6 13:26:24 pc kernel: [  365.541477] Pid: 1208, comm: kworker/2:1 Not tainted 3.5.0 #3                  /DX58SO
> Sep  6 13:26:24 pc kernel: [  365.541491] RIP: e030:[<ffffffff81447f95>]  [<ffffffff81447f95>] balloon_process+0x385/0x3a0
> Sep  6 13:26:24 pc kernel: [  365.541507] RSP: e02b:ffff88012e7abdc0  EFLAGS: 00010213
> Sep  6 13:26:24 pc kernel: [  365.541515] RAX: 0000000220be7000 RBX: 0000000000000000 RCX: 0000000000000008
> Sep  6 13:26:24 pc kernel: [  365.541523] RDX: ffff88010d99a000 RSI: 00000000000001df RDI: 000000000020efdf
> Sep  6 13:26:24 pc kernel: [  365.541532] RBP: ffff88012e7abe20 R08: ffff88014064e140 R09: 00000000fffffffe
> Sep  6 13:26:24 pc kernel: [  365.541540] R10: 0000000000000001 R11: 0000000000000000 R12: 0000160000000000
> Sep  6 13:26:24 pc kernel: [  365.541548] R13: 0000000000000001 R14: 000000000020efdf R15: ffffea00083bf7c0
> Sep  6 13:26:24 pc kernel: [  365.541561] FS:  00007f79d32ce700(0000) GS:ffff880140640000(0000) knlGS:0000000000000000
> Sep  6 13:26:24 pc kernel: [  365.541571] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> Sep  6 13:26:24 pc kernel: [  365.541578] CR2: 00007f79d2d6ce02 CR3: 0000000001e0c000 CR4: 0000000000002660
> Sep  6 13:26:24 pc kernel: [  365.541587] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Sep  6 13:26:24 pc kernel: [  365.541596] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Sep  6 13:26:24 pc kernel: [  365.541604] Process kworker/2:1 (pid: 1208, threadinfo ffff88012e7aa000, task ffff88013101c440)
> Sep  6 13:26:24 pc kernel: [  365.541613] Stack:
> Sep  6 13:26:24 pc kernel: [  365.541618]  000000000006877b 0000000000000001 ffffffff8200ea80 0000000000000001
> Sep  6 13:26:24 pc kernel: [  365.541649]  0000000000000000 0000000000007ff0 ffff88012e7abe00 ffff8801302eee00
> Sep  6 13:26:24 pc kernel: [  365.541664]  ffff880140657000 ffff88014064e140 0000000000000000 ffffffff81e587c0
> Sep  6 13:26:24 pc kernel: [  365.541679] Call Trace:
> Sep  6 13:26:24 pc kernel: [  365.541688]  [<ffffffff8106753b>] process_one_work+0x12b/0x450
> Sep  6 13:26:24 pc kernel: [  365.541697]  [<ffffffff81447c10>] ? decrease_reservation+0x320/0x320
> Sep  6 13:26:24 pc kernel: [  365.541706]  [<ffffffff810688be>] worker_thread+0x12e/0x2d0
> Sep  6 13:26:24 pc kernel: [  365.541715]  [<ffffffff81068790>] ? manage_workers.isra.26+0x1f0/0x1f0
> Sep  6 13:26:24 pc kernel: [  365.541725]  [<ffffffff8106db7e>] kthread+0x8e/0xa0
> Sep  6 13:26:24 pc kernel: [  365.541735]  [<ffffffff8184e3e4>] kernel_thread_helper+0x4/0x10
> Sep  6 13:26:24 pc kernel: [  365.541745]  [<ffffffff8184c87c>] ? retint_restore_args+0x5/0x6
> Sep  6 13:26:24 pc kernel: [  365.541754]  [<ffffffff8184e3e0>] ? gs_change+0x13/0x13
> Sep  6 13:26:24 pc kernel: [  365.541760] Code: 01 15 f0 6a bc 00 48 29 d0 48 89 05 ee 6a bc 00 e9 31 fd ff ff 0f 0b 0f 0b 4c 89 f7 e8 85 34 bc ff 48 83 f8 ff 0f 84 2b fe ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 48 83 c1 01 e9 c2 fd ff ff 0f
> Sep  6 13:26:24 pc kernel: [  365.541898]  RSP <ffff88012e7abdc0>
> Sep  6 13:26:24 pc kernel: [  365.565054] ---[ end trace 25eb9ce0cc61c3a1 ]---
> Sep  6 13:26:24 pc kernel: [  365.565101] PGD 1e0e067 PUD 1e0f067 PMD 0
> Sep  6 13:26:24 pc kernel: [  365.565108] Oops: 0000 [#2] PREEMPT SMP
> Sep  6 13:26:24 pc kernel: [  365.565115] CPU 2
> Sep  6 13:26:24 pc kernel: [  365.565118] Modules linked in: uvcvideo snd_usb_audio snd_usbmidi_lib snd_seq_midi snd_hwdep snd_rawmidi videobuf2_vmalloc videobuf2_memops videobuf2_core videodev gpio_ich joydev hid_generic [last unloaded: scsi_wait_scan]
> Sep  6 13:26:24 pc kernel: [  365.565153]
> Sep  6 13:26:24 pc kernel: [  365.565156] Pid: 1208, comm: kworker/2:1 Tainted: G      D      3.5.0 #3                  /DX58SO
> Sep  6 13:26:24 pc kernel: [  365.565176] RIP: e030:[<ffffffff8106e08c>]  [<ffffffff8106e08c>] kthread_data+0xc/0x20
> Sep  6 13:26:24 pc kernel: [  365.565194] RSP: e02b:ffff88012e7aba90  EFLAGS: 00010092
> Sep  6 13:26:24 pc kernel: [  365.565205] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000002
> Sep  6 13:26:24 pc kernel: [  365.565219] RDX: ffffffff81fcba40 RSI: 0000000000000002 RDI: ffff88013101c440
> Sep  6 13:26:24 pc kernel: [  365.565233] RBP: ffff88012e7abaa8 R08: 0000000000989680 R09: ffffffff81fcba40
> Sep  6 13:26:24 pc kernel: [  365.565248] R10: ffffffff813b0c00 R11: 0000000000000000 R12: ffff8801406536c0
> Sep  6 13:26:24 pc kernel: [  365.565262] R13: 0000000000000002 R14: ffff88013101c430 R15: ffff88013101c440
> Sep  6 13:26:24 pc kernel: [  365.565280] FS:  00007f79d32ce700(0000) GS:ffff880140640000(0000) knlGS:0000000000000000
> Sep  6 13:26:24 pc kernel: [  365.565293] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> Sep  6 13:26:24 pc kernel: [  365.565303] CR2: fffffffffffffff8 CR3: 0000000001e0c000 CR4: 0000000000002660
> Sep  6 13:26:24 pc kernel: [  365.565318] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Sep  6 13:26:24 pc kernel: [  365.565332] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Sep  6 13:26:24 pc kernel: [  365.565349] Process kworker/2:1 (pid: 1208, threadinfo ffff88012e7aa000, task ffff88013101c440)
> Sep  6 13:26:24 pc kernel: [  365.565362] Stack:
> Sep  6 13:26:24 pc kernel: [  365.565367]  ffffffff810698e0 ffff88012e7abaa8 ffff88013101c818 ffff88012e7abb18
> Sep  6 13:26:24 pc kernel: [  365.565389]  ffffffff8184ae02 ffff88012e7abfd8 ffff88013101c440 ffff88012e7abfd8
> Sep  6 13:26:24 pc kernel: [  365.565410]  ffff88012e7abfd8 ffff88012d8840c0 ffff88013101c440 ffff88013101ca30
> 
> 
> 
> Perhaps this stacktrace helps...
> 
> Thanks!
> 
> On 05.09.2012 20:54, Konrad Rzeszutek Wilk wrote:
> >> > > > And it's due to a patch I added in v3.4
> >> > > > (cd9db80e5257682a7f7ab245a2459648b3c8d268)
> >> > > > - which did not work properly in v3.4, but with v3.5 got it working
> >> > > > (977f857ca566a1e68045fcbb7cfc9c4acb077cf0), which causes v3.5 to not
> >> > > > work anymore.
> >> > > >
> >> > > > Anyhow, for right now just revert
> >> > > > cd9db80e5257682a7f7ab245a2459648b3c8d268
> >> > > > and it should work for you.
> >> > > >
> >> Confirmed, after reverting that commit, VT-d will work fine.
> >> Will you fix this and push it to upstream Linux, Konrad?
> >>
> >> > > Also, our team reported a VT-d bug 2 months ago.
> >> > > http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1824
> >> >
> >
> > Can either one of you please test this patch, please:
> >
> >
> > diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
> > index 097e536..425bd0b 100644
> > --- a/drivers/xen/xen-pciback/pci_stub.c
> > +++ b/drivers/xen/xen-pciback/pci_stub.c
> > @@ -4,6 +4,8 @@
> >   * Ryan Wilson <hap9@epoch.ncsc.mil>
> >   * Chris Bookholt <hap10@epoch.ncsc.mil>
> >   */
> > +#define DEBUG 1
> > +
> >  #include <linux/module.h>
> >  #include <linux/init.h>
> >  #include <linux/rwsem.h>
> > @@ -97,13 +99,15 @@ static void pcistub_device_release(struct kref *kref)
> >  	/* Call the reset function which does not take lock as this
> >  	 * is called from "unbind" which takes a device_lock mutex.
> >  	 */
> > +	dev_dbg(&psdev->dev->dev, "FLR locked..\n");
> >  	__pci_reset_function_locked(psdev->dev);
> >  	if (pci_load_and_free_saved_state(psdev->dev,
> >  					  &dev_data->pci_saved_state)) {
> >  		dev_dbg(&psdev->dev->dev, "Could not reload PCI state\n");
> > -	} else
> > +	} else {
> > +		dev_dbg(&psdev->dev->dev, "Reloading PCI state..\n");
> >  		pci_restore_state(psdev->dev);
> > -
> > +	}
> >  	/* Disable the device */
> >  	xen_pcibk_reset_device(psdev->dev);
> >
> > @@ -353,16 +357,16 @@ static int __devinit pcistub_init_device(struct pci_dev *dev)
> >  	if (err)
> >  		goto config_release;
> >
> > -	dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n");
> > -	__pci_reset_function_locked(dev);
> > -
> >  	/* We need the device active to save the state. */
> >  	dev_dbg(&dev->dev, "save state of device\n");
> >  	pci_save_state(dev);
> >  	dev_data->pci_saved_state = pci_store_saved_state(dev);
> >  	if (!dev_data->pci_saved_state)
> >  		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
> > -
> > +	else {
> > +		dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n");
> > +		__pci_reset_function_locked(dev);
> > +	}
> >  	/* Now disable the device (this also ensures some private device
> >  	 * data is setup before we export)
> >  	 */


* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-09-06 13:05     ` Konrad Rzeszutek Wilk
@ 2012-09-06 13:24       ` Tobias Geiger
  0 siblings, 0 replies; 20+ messages in thread
From: Tobias Geiger @ 2012-09-06 13:24 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Konrad Rzeszutek Wilk, Ren, Yongjie, xen-devel

On 06.09.2012 15:05, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 06, 2012 at 01:28:00PM +0200, Tobias Geiger wrote:
>> Hello Konrad,
>>
>> the patch helps regarding the USB-PCIController-Passthrough - this
>> works now in DomU.
>
> Good. Can I add your Reported-by and Tested-by tags?

Of course. Thanks for the fix!

>>
>> but i still get the Dom0 crash when shutting down DomU:
>
> That is a different issue. Please take a look at the thread
> "dom0 linux 3.6.0-rc4, crash due to ballooning althoug dom0_mem=X,
> max:X set".

OK, will do!

Greetings
Tobias


* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-09-06 11:28   ` Tobias Geiger
@ 2012-09-06 13:05     ` Konrad Rzeszutek Wilk
  2012-09-06 13:24       ` Tobias Geiger
  2012-09-07  2:08     ` Ren, Yongjie
  1 sibling, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-09-06 13:05 UTC (permalink / raw)
  To: Tobias Geiger; +Cc: Konrad Rzeszutek Wilk, Ren, Yongjie, xen-devel

On Thu, Sep 06, 2012 at 01:28:00PM +0200, Tobias Geiger wrote:
> Hello Konrad,
> 
> the patch helps regarding the USB-PCIController-Passthrough - this
> works now in DomU.

Good. Can I add your Reported-by and Tested-by tags?
> 
> but i still get the Dom0 crash when shutting down DomU:

That is a different issue. Please take a look at the thread
"dom0 linux 3.6.0-rc4, crash due to ballooning althoug dom0_mem=X, max:X set".
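
The RIP in the trace posted in this thread points into balloon_process, which fits the ballooning thread mentioned above. In such traces the symbol+0xoffset/0xsize notation gives the offset of the instruction within the function and the function's total size. Below is a small Python sketch that pulls those numbers out of a log line. The helper name parse_symbol is made up for illustration.

```python
import re

# "name+0xOFF/0xSIZE" as printed in RIP and Call Trace lines.
_SYM = re.compile(r"([A-Za-z_][\w.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_symbol(text):
    """Return (name, offset, size) for the first symbol in text, or None."""
    m = _SYM.search(text)
    if m is None:
        return None
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)


line = ("RIP: e030:[<ffffffff81447f95>]  [<ffffffff81447f95>] "
        "balloon_process+0x385/0x3a0")
name, offset, size = parse_symbol(line)

# size - offset is how many bytes remain until the end of the function.
print(name, hex(offset), hex(size), size - offset)
```

An offset this close to the function's size is consistent with the BUG path sitting near the end of the compiled function, but confirming that needs the matching vmlinux and a disassembler.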


* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-09-05 18:54 ` Konrad Rzeszutek Wilk
  2012-09-06 11:28   ` Tobias Geiger
  2012-09-06 11:32   ` Tobias Geiger
@ 2012-09-06 11:46   ` Tobias Geiger
  2 siblings, 0 replies; 20+ messages in thread
From: Tobias Geiger @ 2012-09-06 11:46 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Konrad Rzeszutek Wilk, Ren, Yongjie, xen-devel

Me again :)

It seems the crash is not always fatal:

[  247.080617] vif vif-2-0: 2 reading script
[  247.083519] br0: port 4(vif2.0) entered disabled state
[  247.084144] br0: port 4(vif2.0) entered disabled state
[  250.700029] ------------[ cut here ]------------
[  250.700046] kernel BUG at drivers/xen/balloon.c:359!
[  250.700059] invalid opcode: 0000 [#1] PREEMPT SMP
[  250.700071] CPU 4
[  250.700075] Modules linked in: joydev hid_generic uvcvideo snd_usb_audio snd_seq_midi snd_usbmidi_lib snd_hwdep snd_rawmidi videobuf2_vmalloc videobuf2_memops videobuf2_core videodev gpio_ich [last unloaded: scsi_wait_scan]
[  250.700122]
[  250.700125] Pid: 23, comm: kworker/4:0 Not tainted 3.5.0 #3                  /DX58SO
[  250.700139] RIP: e030:[<ffffffff81447f95>]  [<ffffffff81447f95>] balloon_process+0x385/0x3a0
[  250.700158] RSP: e02b:ffff8801317b9dc0  EFLAGS: 00010213
[  250.700162] RAX: 000000021f895000 RBX: 0000000000000000 RCX: 0000000000000002
[  250.700167] RDX: ffffffff82027000 RSI: 0000000000000137 RDI: 00000000000a2337
[  250.700172] RBP: ffff8801317b9e20 R08: ffff88014068e140 R09: 00000000fffffffc
[  250.700180] R10: 0000000000000001 R11: 0000000000000000 R12: 0000160000000000
[  250.700185] R13: 0000000000000001 R14: 00000000000a2337 R15: ffffea000288cdc0
[  250.700192] FS:  00007fb82ee14700(0000) GS:ffff880140680000(0000) knlGS:0000000000000000
[  250.700198] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[  250.700202] CR2: 00007fb82e7b39a6 CR3: 0000000001e0c000 CR4: 0000000000002660
[  250.700207] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  250.700213] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  250.700218] Process kworker/4:0 (pid: 23, threadinfo ffff8801317b8000, task ffff88013178db00)
[  250.700223] Stack:
[  250.700225]  000000000006aa7b 0000000000000001 ffffffff8200ea80 0000000000000001
[  250.700293]  0000000000000000 0000000000007ff0 ffff8801317b9e00 ffff880131796400
[  250.700301]  ffff880140697000 ffff88014068e140 0000000000000000 ffffffff81e587c0
[  250.700311] Call Trace:
[  250.700317]  [<ffffffff8106753b>] process_one_work+0x12b/0x450
[  250.700322]  [<ffffffff81447c10>] ? decrease_reservation+0x320/0x320
[  250.700328]  [<ffffffff810688be>] worker_thread+0x12e/0x2d0
[  250.700334]  [<ffffffff81068790>] ? manage_workers.isra.26+0x1f0/0x1f0
[  250.700340]  [<ffffffff8106db7e>] kthread+0x8e/0xa0
[  250.700346]  [<ffffffff8184e3e4>] kernel_thread_helper+0x4/0x10
[  250.700353]  [<ffffffff8184c87c>] ? retint_restore_args+0x5/0x6
[  250.700358]  [<ffffffff8184e3e0>] ? gs_change+0x13/0x13
[  250.700362] Code: 01 15 f0 6a bc 00 48 29 d0 48 89 05 ee 6a bc 00 e9 31 fd ff ff 0f 0b 0f 0b 4c 89 f7 e8 85 34 bc ff 48 83 f8 ff 0f 84 2b fe ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 48 83 c1 01 e9 c2 fd ff ff 0f
[  250.700471] RIP  [<ffffffff81447f95>] balloon_process+0x385/0x3a0
[  250.700482]  RSP <ffff8801317b9dc0>
[  250.733955] ---[ end trace a5e5187e8ed6c1ff ]---
[  250.733982] BUG: unable to handle kernel paging request at fffffffffffffff8
[  250.733992] IP: [<ffffffff8106e08c>] kthread_data+0xc/0x20
[  250.733999] PGD 1e0e067 PUD 1e0f067 PMD 0
[  250.734006] Oops: 0000 [#2] PREEMPT SMP
[  250.734013] CPU 4
[  250.734016] Modules linked in: joydev hid_generic uvcvideo snd_usb_audio snd_seq_midi snd_usbmidi_lib snd_hwdep snd_rawmidi videobuf2_vmalloc videobuf2_memops videobuf2_core videodev gpio_ich [last unloaded: scsi_wait_scan]
[  250.734071]
[  250.734073] Pid: 23, comm: kworker/4:0 Tainted: G      D      3.5.0 #3                  /DX58SO
[  250.734095] RIP: e030:[<ffffffff8106e08c>]  [<ffffffff8106e08c>] kthread_data+0xc/0x20
[  250.734111] RSP: e02b:ffff8801317b9a90  EFLAGS: 00010092
[  250.734122] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000004
[  250.734137] RDX: ffffffff81fcba40 RSI: 0000000000000004 RDI: ffff88013178db00
[  250.734151] RBP: ffff8801317b9aa8 R08: 0000000000989680 R09: ffffffff81fcba40
[  250.734166] R10: ffffffff8104960a R11: 0000000000000000 R12: ffff8801406936c0
[  250.734178] R13: 0000000000000004 R14: ffff88013178daf0 R15: ffff88013178db00
[  250.734196] FS:  00007fb82ee14700(0000) GS:ffff880140680000(0000) knlGS:0000000000000000
[  250.734202] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[  250.734209] CR2: fffffffffffffff8 CR3: 0000000001e0c000 CR4: 0000000000002660
[  250.734222] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  250.734235] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  250.734249] Process kworker/4:0 (pid: 23, threadinfo ffff8801317b8000, task ffff88013178db00)
[  250.734266] Stack:
[  250.734271]  ffffffff810698e0 ffff8801317b9aa8 ffff88013178ded8 ffff8801317b9b18
[  250.734292]  ffffffff8184ae02 ffff8801317b9fd8 ffff88013178db00 ffff8801317b9fd8
[  250.734313]  ffff8801317b9fd8 ffff8801334796c0 ffff88013178db00 ffff8801317b9ae8
[  250.734979] Call Trace:
[  250.735572]  [<ffffffff810698e0>] ? wq_worker_sleeping+0x10/0xa0
[  250.736179]  [<ffffffff8184ae02>] __schedule+0x592/0x7d0
[  250.736783]  [<ffffffff8184b164>] schedule+0x24/0x70
[  250.737373]  [<ffffffff81051592>] do_exit+0x5b2/0x910
[  250.737937]  [<ffffffff8183ea1e>] ? printk+0x48/0x4a
[  250.738498]  [<ffffffff8100ace2>] ? check_events+0x12/0x20
[  250.739053]  [<ffffffff81017581>] oops_end+0x71/0xa0
[  250.739596]  [<ffffffff810176f3>] die+0x53/0x80
[  250.740134]  [<ffffffff810143f8>] do_trap+0xb8/0x160
[  250.740668]  [<ffffffff810146f3>] do_invalid_op+0xa3/0xb0
[  250.741203]  [<ffffffff81447f95>] ? balloon_process+0x385/0x3a0
[  250.741737]  [<ffffffff81085f52>] ? load_balance+0xd2/0x800
[  250.742267]  [<ffffffff81006276>] ? xen_flush_tlb+0xd6/0x2a0
[  250.742803]  [<ffffffff8108117d>] ? cpuacct_charge+0x6d/0xb0
[  250.743332]  [<ffffffff8184e25b>] invalid_op+0x1b/0x20
[  250.743855]  [<ffffffff81447f95>] ? balloon_process+0x385/0x3a0
[  250.744374]  [<ffffffff8106753b>] process_one_work+0x12b/0x450
[  250.744897]  [<ffffffff81447c10>] ? decrease_reservation+0x320/0x320
[  250.745426]  [<ffffffff810688be>] worker_thread+0x12e/0x2d0
[  250.745942]  [<ffffffff81068790>] ? manage_workers.isra.26+0x1f0/0x1f0
[  250.746457]  [<ffffffff8106db7e>] kthread+0x8e/0xa0
[  250.746969]  [<ffffffff8184e3e4>] kernel_thread_helper+0x4/0x10
[  250.747480]  [<ffffffff8184c87c>] ? retint_restore_args+0x5/0x6
[  250.747990]  [<ffffffff8184e3e0>] ? gs_change+0x13/0x13
[  250.748487] Code: e0 ff ff 01 48 8b 80 38 e0 ff ff a8 08 0f 84 3d ff ff ff e8 57 d0 7d 00 e9 33 ff ff ff 66 90 48 8b 87 80 03 00 00 55 48 89 e5 5d <48> 8b 40 f8 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55
[  250.749575] RIP  [<ffffffff8106e08c>] kthread_data+0xc/0x20
[  250.750103]  RSP <ffff8801317b9a90>
[  250.750627] CR2: fffffffffffffff8
[  250.751151] ---[ end trace a5e5187e8ed6c200 ]---
[  250.751152] Fixing recursive fault but reboot is needed!
[  311.042233] INFO: rcu_preempt detected stalls on CPUs/tasks: { 4} (detected by 7, t=60011 jiffies)
[  311.042237] INFO: Stall ended before state dump start
[  491.279642] INFO: rcu_preempt detected stalls on CPUs/tasks: { 4} (detected by 7, t=240249 jiffies)
[  491.279646] INFO: Stall ended before state dump start
[  671.670546] INFO: rcu_preempt detected stalls on CPUs/tasks: { 4} (detected by 7, t=420638 jiffies)
[  671.670550] INFO: Stall ended before state dump start
[  763.240862] INFO: rcu_bh detected stalls on CPUs/tasks: { 1 4} (detected by 5, t=63547 jiffies)
[  763.240867] INFO: Stall ended before state dump start
[  853.438186] INFO: rcu_preempt detected stalls on CPUs/tasks: { 4} (detected by 7, t=602410 jiffies)
[  853.438190] INFO: Stall ended before state dump start
[  943.632087] INFO: rcu_bh detected stalls on CPUs/tasks: { 1 4} (detected by 0, t=243935 jiffies)
[  943.632092] INFO: Stall ended before state dump start
[ 1033.828726] INFO: rcu_preempt detected stalls on CPUs/tasks: { 4} (detected by 7, t=782798 jiffies)
[ 1033.828729] INFO: Stall ended before state dump start


Dom0 still responds now, but it is so sluggish that it is mostly unusable...
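
For anyone triaging a trace like the one above: the two fields worth
pulling out first are the BUG location and the faulting RIP symbol. Below
is a rough Python sketch, assuming the log was pasted from mail and may
have been wrapped by the mail client; it glues continuation lines back
onto the preceding "[timestamp]" record before matching. The names
unwrap() and summarize() are invented for this example and are not part
of any kernel tooling.

```python
import re

# A dmesg-style "[  250.700046]" timestamp marks the start of a record.
STAMP = re.compile(r"^\[\s*\d+\.\d+\]")
# "kernel BUG at <file>:<line>!"
BUG = re.compile(r"kernel BUG at (\S+?):(\d+)!")
# "RIP: <seg>:[<addr>]  [<addr>] <symbol+off/len>" (the register-dump form)
RIP = re.compile(r"RIP: \S+\s+\[<[0-9a-f]+>\]\s+(\S+)")


def unwrap(text):
    """Re-join physical lines so that each record starts with a timestamp.

    Heuristic: continuation lines are appended as-is, which relies on the
    wrapped line keeping its trailing space when the break fell at a space.
    """
    records = []
    for line in text.splitlines():
        if STAMP.match(line) or not records:
            records.append(line)
        else:
            records[-1] += line  # continuation produced by mail wrapping
    return records


def summarize(text):
    """Return ([(file, line), ...], [rip_symbol, ...]) found in the log."""
    bugs, rips = [], []
    for rec in unwrap(text):
        m = BUG.search(rec)
        if m:
            bugs.append((m.group(1), int(m.group(2))))
        m = RIP.search(rec)
        if m:
            rips.append(m.group(1))
    return bugs, rips


# Three records copied from the trace, two of them wrapped as in the mail.
SAMPLE = (
    "[  250.700046] kernel BUG at drivers/xen/balloon.c:359!\n"
    "[  250.700139] RIP: e030:[<ffffffff81447f95>]  [<ffffffff81447f95>] \n"
    "balloon_process+0x385/0x3a0\n"
    "[  250.734095] RIP: e030:[<ffffffff8106e08c>]  [<ffffffff8106e08c>] \n"
    "kthread_data+0xc/0x20\n"
)

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

On the sample this reports drivers/xen/balloon.c line 359 as the BUG site
and balloon_process, then kthread_data, as the faulting symbols, i.e. the
second oops is a follow-on of the first. Syslog-prefixed logs ("Sep  6
... kernel:") would need the prefix stripped first.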

On 05.09.2012 20:54, Konrad Rzeszutek Wilk wrote:
>> > > > And it's due to a patch I added in v3.4
>> > > > (cd9db80e5257682a7f7ab245a2459648b3c8d268)
>> > > > - which did not work properly in v3.4, but with v3.5 got it working
>> > > > (977f857ca566a1e68045fcbb7cfc9c4acb077cf0) which causes v3.5 to
>> > > > not work anymore.
>> > > >
>> > > > Anyhow, for right now just revert
>> > > > cd9db80e5257682a7f7ab245a2459648b3c8d268
>> > > > and it should work for you.
>> > > >
>> Confirmed, after reverting that commit, VT-d will work fine.
>> Will you fix this and push it to upstream Linux, Konrad?
>>
>> > > Also, our team reported a VT-d bug 2 months ago.
>> > > http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1824
>> >
>
> Can either one of you test this patch, please:
>
>
> diff --git a/drivers/xen/xen-pciback/pci_stub.c
> b/drivers/xen/xen-pciback/pci_stub.c
> index 097e536..425bd0b 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -4,6 +4,8 @@
>   * Ryan Wilson <hap9@epoch.ncsc.mil>
>   * Chris Bookholt <hap10@epoch.ncsc.mil>
>   */
> +#define DEBUG 1
> +
>  #include <linux/module.h>
>  #include <linux/init.h>
>  #include <linux/rwsem.h>
> @@ -97,13 +99,15 @@ static void pcistub_device_release(struct kref 
> *kref)
>  	/* Call the reset function which does not take lock as this
>  	 * is called from "unbind" which takes a device_lock mutex.
>  	 */
> +	dev_dbg(&psdev->dev->dev, "FLR locked..\n");
>  	__pci_reset_function_locked(psdev->dev);
>  	if (pci_load_and_free_saved_state(psdev->dev,
>  					  &dev_data->pci_saved_state)) {
>  		dev_dbg(&psdev->dev->dev, "Could not reload PCI state\n");
> -	} else
> +	} else {
> +		dev_dbg(&psdev->dev->dev, "Reloading PCI state..\n");
>  		pci_restore_state(psdev->dev);
> -
> +	}
>  	/* Disable the device */
>  	xen_pcibk_reset_device(psdev->dev);
>
> @@ -353,16 +357,16 @@ static int __devinit pcistub_init_device(struct
> pci_dev *dev)
>  	if (err)
>  		goto config_release;
>
> -	dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n");
> -	__pci_reset_function_locked(dev);
> -
>  	/* We need the device active to save the state. */
>  	dev_dbg(&dev->dev, "save state of device\n");
>  	pci_save_state(dev);
>  	dev_data->pci_saved_state = pci_store_saved_state(dev);
>  	if (!dev_data->pci_saved_state)
>  		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
> -
> +	else {
> +		dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n");
> +		__pci_reset_function_locked(dev);
> +	}
>  	/* Now disable the device (this also ensures some private device
>  	 * data is setup before we export)
>  	 */

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-09-05 18:54 ` Konrad Rzeszutek Wilk
  2012-09-06 11:28   ` Tobias Geiger
@ 2012-09-06 11:32   ` Tobias Geiger
  2012-09-06 11:46   ` Tobias Geiger
  2 siblings, 0 replies; 20+ messages in thread
From: Tobias Geiger @ 2012-09-06 11:32 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Konrad Rzeszutek Wilk, Ren, Yongjie, xen-devel

FYI:

This Dom0 crash only happens when I shut down the DomU from within the
DomU, meaning when I choose "Start - Shutdown" within Windows 7.
The crash does NOT happen when I do "xl shutdown domu" ... ?! :)

Greetings
Tobias

On 05.09.2012 20:54, Konrad Rzeszutek Wilk wrote:
>> > > > And it's due to a patch I added in v3.4
>> > > > (cd9db80e5257682a7f7ab245a2459648b3c8d268)
>> > > > - which did not work properly in v3.4, but with v3.5 got it working
>> > > > (977f857ca566a1e68045fcbb7cfc9c4acb077cf0) which causes v3.5 to
>> > > > not work anymore.
>> > > >
>> > > > Anyhow, for right now just revert
>> > > > cd9db80e5257682a7f7ab245a2459648b3c8d268
>> > > > and it should work for you.
>> > > >
>> Confirmed, after reverting that commit, VT-d will work fine.
>> Will you fix this and push it to upstream Linux, Konrad?
>>
>> > > Also, our team reported a VT-d bug 2 months ago.
>> > > http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1824
>> >
>
> Can either one of you test this patch, please:
>
>
> diff --git a/drivers/xen/xen-pciback/pci_stub.c
> b/drivers/xen/xen-pciback/pci_stub.c
> index 097e536..425bd0b 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -4,6 +4,8 @@
>   * Ryan Wilson <hap9@epoch.ncsc.mil>
>   * Chris Bookholt <hap10@epoch.ncsc.mil>
>   */
> +#define DEBUG 1
> +
>  #include <linux/module.h>
>  #include <linux/init.h>
>  #include <linux/rwsem.h>
> @@ -97,13 +99,15 @@ static void pcistub_device_release(struct kref 
> *kref)
>  	/* Call the reset function which does not take lock as this
>  	 * is called from "unbind" which takes a device_lock mutex.
>  	 */
> +	dev_dbg(&psdev->dev->dev, "FLR locked..\n");
>  	__pci_reset_function_locked(psdev->dev);
>  	if (pci_load_and_free_saved_state(psdev->dev,
>  					  &dev_data->pci_saved_state)) {
>  		dev_dbg(&psdev->dev->dev, "Could not reload PCI state\n");
> -	} else
> +	} else {
> +		dev_dbg(&psdev->dev->dev, "Reloading PCI state..\n");
>  		pci_restore_state(psdev->dev);
> -
> +	}
>  	/* Disable the device */
>  	xen_pcibk_reset_device(psdev->dev);
>
> @@ -353,16 +357,16 @@ static int __devinit pcistub_init_device(struct
> pci_dev *dev)
>  	if (err)
>  		goto config_release;
>
> -	dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n");
> -	__pci_reset_function_locked(dev);
> -
>  	/* We need the device active to save the state. */
>  	dev_dbg(&dev->dev, "save state of device\n");
>  	pci_save_state(dev);
>  	dev_data->pci_saved_state = pci_store_saved_state(dev);
>  	if (!dev_data->pci_saved_state)
>  		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
> -
> +	else {
> +		dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n");
> +		__pci_reset_function_locked(dev);
> +	}
>  	/* Now disable the device (this also ensures some private device
>  	 * data is setup before we export)
>  	 */

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-09-05 18:54 ` Konrad Rzeszutek Wilk
@ 2012-09-06 11:28   ` Tobias Geiger
  2012-09-06 13:05     ` Konrad Rzeszutek Wilk
  2012-09-07  2:08     ` Ren, Yongjie
  2012-09-06 11:32   ` Tobias Geiger
  2012-09-06 11:46   ` Tobias Geiger
  2 siblings, 2 replies; 20+ messages in thread
From: Tobias Geiger @ 2012-09-06 11:28 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Konrad Rzeszutek Wilk, Ren, Yongjie, xen-devel

Hello Konrad,

The patch helps with the USB PCI controller passthrough - this
now works in DomU.

But I still get the Dom0 crash when shutting down DomU:

Sep  6 13:26:19 pc kernel: [  361.011514] xen-blkback:backend/vbd/1/832: prepare for reconnect
Sep  6 13:26:20 pc kernel: [  361.876395] xen-blkback:backend/vbd/1/768: prepare for reconnect
Sep  6 13:26:21 pc kernel: [  362.682152] br0: port 3(vif1.0) entered disabled state
Sep  6 13:26:21 pc kernel: [  362.682267] br0: port 3(vif1.0) entered disabled state
Sep  6 13:26:24 pc kernel: [  365.541386] ------------[ cut here ]------------
Sep  6 13:26:24 pc kernel: [  365.541411] invalid opcode: 0000 [#1] PREEMPT SMP
Sep  6 13:26:24 pc kernel: [  365.541423] CPU 2
Sep  6 13:26:24 pc kernel: [  365.541427] Modules linked in: uvcvideo snd_usb_audio snd_usbmidi_lib snd_seq_midi snd_hwdep snd_rawmidi videobuf2_vmalloc videobuf2_memops videobuf2_core videodev gpio_ich joydev hid_generic [last unloaded: scsi_wait_scan]
Sep  6 13:26:24 pc kernel: [  365.541474]
Sep  6 13:26:24 pc kernel: [  365.541477] Pid: 1208, comm: kworker/2:1 Not tainted 3.5.0 #3                  /DX58SO
Sep  6 13:26:24 pc kernel: [  365.541491] RIP: e030:[<ffffffff81447f95>]  [<ffffffff81447f95>] balloon_process+0x385/0x3a0
Sep  6 13:26:24 pc kernel: [  365.541507] RSP: e02b:ffff88012e7abdc0  EFLAGS: 00010213
Sep  6 13:26:24 pc kernel: [  365.541515] RAX: 0000000220be7000 RBX: 0000000000000000 RCX: 0000000000000008
Sep  6 13:26:24 pc kernel: [  365.541523] RDX: ffff88010d99a000 RSI: 00000000000001df RDI: 000000000020efdf
Sep  6 13:26:24 pc kernel: [  365.541532] RBP: ffff88012e7abe20 R08: ffff88014064e140 R09: 00000000fffffffe
Sep  6 13:26:24 pc kernel: [  365.541540] R10: 0000000000000001 R11: 0000000000000000 R12: 0000160000000000
Sep  6 13:26:24 pc kernel: [  365.541548] R13: 0000000000000001 R14: 000000000020efdf R15: ffffea00083bf7c0
Sep  6 13:26:24 pc kernel: [  365.541561] FS:  00007f79d32ce700(0000) GS:ffff880140640000(0000) knlGS:0000000000000000
Sep  6 13:26:24 pc kernel: [  365.541571] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Sep  6 13:26:24 pc kernel: [  365.541578] CR2: 00007f79d2d6ce02 CR3: 0000000001e0c000 CR4: 0000000000002660
Sep  6 13:26:24 pc kernel: [  365.541587] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep  6 13:26:24 pc kernel: [  365.541596] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Sep  6 13:26:24 pc kernel: [  365.541604] Process kworker/2:1 (pid: 1208, threadinfo ffff88012e7aa000, task ffff88013101c440)
Sep  6 13:26:24 pc kernel: [  365.541613] Stack:
Sep  6 13:26:24 pc kernel: [  365.541618]  000000000006877b 0000000000000001 ffffffff8200ea80 0000000000000001
Sep  6 13:26:24 pc kernel: [  365.541649]  0000000000000000 0000000000007ff0 ffff88012e7abe00 ffff8801302eee00
Sep  6 13:26:24 pc kernel: [  365.541664]  ffff880140657000 ffff88014064e140 0000000000000000 ffffffff81e587c0
Sep  6 13:26:24 pc kernel: [  365.541679] Call Trace:
Sep  6 13:26:24 pc kernel: [  365.541688]  [<ffffffff8106753b>] process_one_work+0x12b/0x450
Sep  6 13:26:24 pc kernel: [  365.541697]  [<ffffffff81447c10>] ? decrease_reservation+0x320/0x320
Sep  6 13:26:24 pc kernel: [  365.541706]  [<ffffffff810688be>] worker_thread+0x12e/0x2d0
Sep  6 13:26:24 pc kernel: [  365.541715]  [<ffffffff81068790>] ? manage_workers.isra.26+0x1f0/0x1f0
Sep  6 13:26:24 pc kernel: [  365.541725]  [<ffffffff8106db7e>] kthread+0x8e/0xa0
Sep  6 13:26:24 pc kernel: [  365.541735]  [<ffffffff8184e3e4>] kernel_thread_helper+0x4/0x10
Sep  6 13:26:24 pc kernel: [  365.541745]  [<ffffffff8184c87c>] ? retint_restore_args+0x5/0x6
Sep  6 13:26:24 pc kernel: [  365.541754]  [<ffffffff8184e3e0>] ? gs_change+0x13/0x13
Sep  6 13:26:24 pc kernel: [  365.541760] Code: 01 15 f0 6a bc 00 48 29 d0 48 89 05 ee 6a bc 00 e9 31 fd ff ff 0f 0b 0f 0b 4c 89 f7 e8 85 34 bc ff 48 83 f8 ff 0f 84 2b fe ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 48 83 c1 01 e9 c2 fd ff ff 0f
Sep  6 13:26:24 pc kernel: [  365.541898]  RSP <ffff88012e7abdc0>
Sep  6 13:26:24 pc kernel: [  365.565054] ---[ end trace 25eb9ce0cc61c3a1 ]---
Sep  6 13:26:24 pc kernel: [  365.565101] PGD 1e0e067 PUD 1e0f067 PMD 0
Sep  6 13:26:24 pc kernel: [  365.565108] Oops: 0000 [#2] PREEMPT SMP
Sep  6 13:26:24 pc kernel: [  365.565115] CPU 2
Sep  6 13:26:24 pc kernel: [  365.565118] Modules linked in: uvcvideo snd_usb_audio snd_usbmidi_lib snd_seq_midi snd_hwdep snd_rawmidi videobuf2_vmalloc videobuf2_memops videobuf2_core videodev gpio_ich joydev hid_generic [last unloaded: scsi_wait_scan]
Sep  6 13:26:24 pc kernel: [  365.565153]
Sep  6 13:26:24 pc kernel: [  365.565156] Pid: 1208, comm: kworker/2:1 Tainted: G      D      3.5.0 #3                  /DX58SO
Sep  6 13:26:24 pc kernel: [  365.565176] RIP: e030:[<ffffffff8106e08c>]  [<ffffffff8106e08c>] kthread_data+0xc/0x20
Sep  6 13:26:24 pc kernel: [  365.565194] RSP: e02b:ffff88012e7aba90  EFLAGS: 00010092
Sep  6 13:26:24 pc kernel: [  365.565205] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000002
Sep  6 13:26:24 pc kernel: [  365.565219] RDX: ffffffff81fcba40 RSI: 0000000000000002 RDI: ffff88013101c440
Sep  6 13:26:24 pc kernel: [  365.565233] RBP: ffff88012e7abaa8 R08: 0000000000989680 R09: ffffffff81fcba40
Sep  6 13:26:24 pc kernel: [  365.565248] R10: ffffffff813b0c00 R11: 0000000000000000 R12: ffff8801406536c0
Sep  6 13:26:24 pc kernel: [  365.565262] R13: 0000000000000002 R14: ffff88013101c430 R15: ffff88013101c440
Sep  6 13:26:24 pc kernel: [  365.565280] FS:  00007f79d32ce700(0000) GS:ffff880140640000(0000) knlGS:0000000000000000
Sep  6 13:26:24 pc kernel: [  365.565293] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Sep  6 13:26:24 pc kernel: [  365.565303] CR2: fffffffffffffff8 CR3: 0000000001e0c000 CR4: 0000000000002660
Sep  6 13:26:24 pc kernel: [  365.565318] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep  6 13:26:24 pc kernel: [  365.565332] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Sep  6 13:26:24 pc kernel: [  365.565349] Process kworker/2:1 (pid: 1208, threadinfo ffff88012e7aa000, task ffff88013101c440)
Sep  6 13:26:24 pc kernel: [  365.565362] Stack:
Sep  6 13:26:24 pc kernel: [  365.565367]  ffffffff810698e0 ffff88012e7abaa8 ffff88013101c818 ffff88012e7abb18
Sep  6 13:26:24 pc kernel: [  365.565389]  ffffffff8184ae02 ffff88012e7abfd8 ffff88013101c440 ffff88012e7abfd8
Sep  6 13:26:24 pc kernel: [  365.565410]  ffff88012e7abfd8 ffff88012d8840c0 ffff88013101c440 ffff88013101ca30



Perhaps this stacktrace helps...

Thanks!

On 05.09.2012 20:54, Konrad Rzeszutek Wilk wrote:
>> > > > And it's due to a patch I added in v3.4
>> > > > (cd9db80e5257682a7f7ab245a2459648b3c8d268)
>> > > > - which did not work properly in v3.4, but with v3.5 got it working
>> > > > (977f857ca566a1e68045fcbb7cfc9c4acb077cf0) which causes v3.5 to
>> > > > not work anymore.
>> > > >
>> > > > Anyhow, for right now just revert
>> > > > cd9db80e5257682a7f7ab245a2459648b3c8d268
>> > > > and it should work for you.
>> > > >
>> Confirmed, after reverting that commit, VT-d will work fine.
>> Will you fix this and push it to upstream Linux, Konrad?
>>
>> > > Also, our team reported a VT-d bug 2 months ago.
>> > > http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1824
>> >
>
> Can either one of you test this patch, please:
>
>
> diff --git a/drivers/xen/xen-pciback/pci_stub.c
> b/drivers/xen/xen-pciback/pci_stub.c
> index 097e536..425bd0b 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -4,6 +4,8 @@
>   * Ryan Wilson <hap9@epoch.ncsc.mil>
>   * Chris Bookholt <hap10@epoch.ncsc.mil>
>   */
> +#define DEBUG 1
> +
>  #include <linux/module.h>
>  #include <linux/init.h>
>  #include <linux/rwsem.h>
> @@ -97,13 +99,15 @@ static void pcistub_device_release(struct kref 
> *kref)
>  	/* Call the reset function which does not take lock as this
>  	 * is called from "unbind" which takes a device_lock mutex.
>  	 */
> +	dev_dbg(&psdev->dev->dev, "FLR locked..\n");
>  	__pci_reset_function_locked(psdev->dev);
>  	if (pci_load_and_free_saved_state(psdev->dev,
>  					  &dev_data->pci_saved_state)) {
>  		dev_dbg(&psdev->dev->dev, "Could not reload PCI state\n");
> -	} else
> +	} else {
> +		dev_dbg(&psdev->dev->dev, "Reloading PCI state..\n");
>  		pci_restore_state(psdev->dev);
> -
> +	}
>  	/* Disable the device */
>  	xen_pcibk_reset_device(psdev->dev);
>
> @@ -353,16 +357,16 @@ static int __devinit pcistub_init_device(struct
> pci_dev *dev)
>  	if (err)
>  		goto config_release;
>
> -	dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n");
> -	__pci_reset_function_locked(dev);
> -
>  	/* We need the device active to save the state. */
>  	dev_dbg(&dev->dev, "save state of device\n");
>  	pci_save_state(dev);
>  	dev_data->pci_saved_state = pci_store_saved_state(dev);
>  	if (!dev_data->pci_saved_state)
>  		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
> -
> +	else {
> +		dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n");
> +		__pci_reset_function_locked(dev);
> +	}
>  	/* Now disable the device (this also ensures some private device
>  	 * data is setup before we export)
>  	 */

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-08-28  8:25 Ren, Yongjie
  2012-08-28 13:19 ` Konrad Rzeszutek Wilk
@ 2012-09-05 18:54 ` Konrad Rzeszutek Wilk
  2012-09-06 11:28   ` Tobias Geiger
                     ` (2 more replies)
  1 sibling, 3 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-09-05 18:54 UTC (permalink / raw)
  To: Ren, Yongjie; +Cc: Konrad Rzeszutek Wilk, Tobias Geiger, xen-devel

> > > > And it's due to a patch I added in v3.4
> > > > (cd9db80e5257682a7f7ab245a2459648b3c8d268)
> > > > - which did not work properly in v3.4, but with v3.5 got it working
> > > > (977f857ca566a1e68045fcbb7cfc9c4acb077cf0) which causes v3.5 to
> > > > not work anymore.
> > > >
> > > > Anyhow, for right now just revert
> > > > cd9db80e5257682a7f7ab245a2459648b3c8d268
> > > > and it should work for you.
> > > >
> Confirmed, after reverting that commit, VT-d will work fine.
> Will you fix this and push it to upstream Linux, Konrad?
> 
> > > Also, our team reported a VT-d bug 2 months ago.
> > > http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1824
> >

Can either one of you test this patch, please:


diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 097e536..425bd0b 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -4,6 +4,8 @@
  * Ryan Wilson <hap9@epoch.ncsc.mil>
  * Chris Bookholt <hap10@epoch.ncsc.mil>
  */
+#define DEBUG 1
+
 #include <linux/module.h>
 #include <linux/init.h>
 #include <linux/rwsem.h>
@@ -97,13 +99,15 @@ static void pcistub_device_release(struct kref *kref)
 	/* Call the reset function which does not take lock as this
 	 * is called from "unbind" which takes a device_lock mutex.
 	 */
+	dev_dbg(&psdev->dev->dev, "FLR locked..\n");
 	__pci_reset_function_locked(psdev->dev);
 	if (pci_load_and_free_saved_state(psdev->dev,
 					  &dev_data->pci_saved_state)) {
 		dev_dbg(&psdev->dev->dev, "Could not reload PCI state\n");
-	} else
+	} else {
+		dev_dbg(&psdev->dev->dev, "Reloading PCI state..\n");
 		pci_restore_state(psdev->dev);
-
+	}
 	/* Disable the device */
 	xen_pcibk_reset_device(psdev->dev);
 
@@ -353,16 +357,16 @@ static int __devinit pcistub_init_device(struct pci_dev *dev)
 	if (err)
 		goto config_release;
 
-	dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n");
-	__pci_reset_function_locked(dev);
-
 	/* We need the device active to save the state. */
 	dev_dbg(&dev->dev, "save state of device\n");
 	pci_save_state(dev);
 	dev_data->pci_saved_state = pci_store_saved_state(dev);
 	if (!dev_data->pci_saved_state)
 		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
-
+	else {
+		dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n");
+		__pci_reset_function_locked(dev);
+	}
 	/* Now disable the device (this also ensures some private device
 	 * data is setup before we export)
 	 */
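
The second hunk changes only the order of two steps in
pcistub_init_device(): the state that pcistub_device_release() later
restores is now captured before the reset instead of after it. The
following is a toy Python model of that ordering, not kernel code. The
class ToyDevice, its single "command" register and the value 0x0006 are
invented for illustration, and the model assumes that a function-level
reset clears the register. It also ignores the detail that the patch
skips the reset when the state could not be stored, and it makes no
claim about why the guest-side symptom went away.

```python
class ToyDevice:
    """Hypothetical stand-in for a PCI function with one config register."""

    def __init__(self, command):
        self.command = command
        self.saved = None

    def save_state(self):
        # Plays the role of pci_save_state() + pci_store_saved_state().
        self.saved = self.command

    def reset(self):
        # Assumption: the reset returns the register to its default (0).
        self.command = 0

    def restore_state(self):
        # Plays the role of pci_load_and_free_saved_state() + pci_restore_state().
        self.command = self.saved


def init_then_release(dev, save_before_reset):
    """Run the init path in the chosen order, then the release path."""
    if save_before_reset:      # order after the patch
        dev.save_state()
        dev.reset()
    else:                      # order before the patch
        dev.reset()
        dev.save_state()
    # Release path as in pcistub_device_release(): reset, then restore.
    dev.reset()
    dev.restore_state()
    return dev.command


old_order = init_then_release(ToyDevice(0x0006), save_before_reset=False)
new_order = init_then_release(ToyDevice(0x0006), save_before_reset=True)
print(hex(old_order), hex(new_order))
```

In this model the old order hands back a device whose register is still
in its reset state, while the new order hands back the original value.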

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-08-28  8:25 Ren, Yongjie
@ 2012-08-28 13:19 ` Konrad Rzeszutek Wilk
  2012-09-05 18:54 ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-08-28 13:19 UTC (permalink / raw)
  To: Ren, Yongjie; +Cc: Konrad Rzeszutek Wilk, Tobias Geiger, xen-devel

> > > > Anyhow, for right now just revert
> > > > cd9db80e5257682a7f7ab245a2459648b3c8d268
> > > > and it should work for you.
> > > >
> Confirmed, after reverting that commit, VT-d will work fine.
> Will you fix this and push it to upstream Linux, Konrad?

Yes, I plan to fix it - though I am not sure exactly how. The
reset functionality works - (too well, one could say) - perhaps
what I also need is to enable the device after the reset.

> 
> > > Also, our team reported a VT-d bug 2 months ago.
> > > http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1824
> >

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
@ 2012-08-28  8:25 Ren, Yongjie
  2012-08-28 13:19 ` Konrad Rzeszutek Wilk
  2012-09-05 18:54 ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 20+ messages in thread
From: Ren, Yongjie @ 2012-08-28  8:25 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Tobias Geiger, xen-devel, Konrad Rzeszutek Wilk

> -----Original Message-----
> From: Konrad Rzeszutek [mailto:ketuzsezr@gmail.com] On Behalf Of
> Konrad Rzeszutek Wilk
> Sent: Tuesday, August 21, 2012 10:23 PM
> To: Ren, Yongjie
> Cc: Konrad Rzeszutek Wilk; Tobias Geiger; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Regression in kernel 3.5 as Dom0 regarding PCI
> Passthrough?!
> 
> On Tue, Aug 21, 2012 at 02:41:36AM +0000, Ren, Yongjie wrote:
> > > -----Original Message-----
> > > From: xen-devel-bounces@lists.xen.org
> > > [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Konrad
> Rzeszutek
> > > Wilk
> > > Sent: Tuesday, August 21, 2012 7:30 AM
> > > To: Tobias Geiger
> > > Cc: xen-devel@lists.xen.org
> > > Subject: Re: [Xen-devel] Regression in kernel 3.5 as Dom0 regarding PCI
> > > Passthrough?!
> > >
> > > On Mon, Aug 06, 2012 at 12:16:33PM -0400, Konrad Rzeszutek Wilk
> wrote:
> > > > On Wed, Jul 25, 2012 at 09:43:57AM -0400, Konrad Rzeszutek Wilk
> > > wrote:
> > > > > On Wed, Jul 25, 2012 at 02:30:00PM +0200, Tobias Geiger wrote:
> > > > > > Hi!
> > > > > >
> > > > > > i notice a serious regression with 3.5 as Dom0 kernel (3.4 was
> rock
> > > > > > stable):
> > > > > >
> > > > > > 1st: only the GPU PCI Passthrough works, the PCI USB Controller
> is
> > > > > > not recognized within the DomU (HVM Win7 64)
> > > > > > Dom0 cmdline is:
> > > > > > ro root=LABEL=dom0root
> > >
> xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7)
> > > > > > security=apparmor noirqdebug nouveau.msi=1
> > > > > >
> > > > > > Only 8:00.0 and 8:00.1 get passed through without problems, all
> the
> > > > > > USB Controller IDs are not correctly passed through and get a
> > > > > > exclamation mark within the win7 device manager ("could not be
> > > > > > started").
> > > > >
> > > > > Ok, but they do get passed in though? As in, QEMU sees them.
> > > > > If you boot a Live Ubuntu/Fedora CD within the guest with the PCI
> > > > > passed in devices do you see them? Meaning lspci shows them?
> > > > >
> > > > >
> > > > > Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?
> > > > >
> > > > > >
> > > > > >
> > > > > > 2nd: After DomU shutdown , Dom0 panics (100% reproducable) -
> > > sorry
> > > > > > that i have no full stacktrace, all i have is a "screenshot" which i
> > > > > > uploaded here:
> > > > > >
> > > http://imageshack.us/photo/my-images/52/img20120724235921.jpg/
> > > > >
> > > > > Ugh, that looks like somebody removed a large chunk of a
> pagetable.
> > > > >
> > > > > Hmm. Are you using dom0_mem=max parameter? If not, can you
> try
> > > > > that and also disable ballooning in the xm/xl config file pls?
> > > > >
> > > > > >
> > > > > >
> > > > > > With 3.4 both issues were not there - everything worked perfectly.
> > > > > > Tell me which debugging info you need, i may be able to re-install
> > > > > > my netconsole to get the full stacktrace (but i had not much luck
> > > > > > with netconsole regarding kernel panics - rarely this info gets sent
> > > > > > before the "panic"...)
> > > >
> > > > So I am able to reproduce this with a Windows 7 with an ATI 4870 and
> > > > an Intel 82574L NIC. The video card still works, but the NIC stopped
> > > > working. Same version of hypervisor/toolstack/etc, only change is the
> > > > kernel (v3.4.6->v3.5).
> > > >
> > > > Time to get my hands greasy with this..
> > >
> > > And it's due to a patch I added in v3.4
> > > (cd9db80e5257682a7f7ab245a2459648b3c8d268)
> > > - which did not work properly in v3.4, but with v3.5 got it working
> > > (977f857ca566a1e68045fcbb7cfc9c4acb077cf0) which causes v3.5 to
> > > not work anymore.
> > >
> > > Anyhow, for right now just revert
> > > cd9db80e5257682a7f7ab245a2459648b3c8d268
> > > and it should work for you.
> > >
Confirmed: after reverting that commit, VT-d works fine.
Will you fix this and push it to upstream Linux, Konrad?

> > Also, our team reported a VT-d bug 2 months ago.
> > http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1824
>


* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-08-06 16:16   ` Konrad Rzeszutek Wilk
@ 2012-08-20 23:30     ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-08-20 23:30 UTC (permalink / raw)
  To: Tobias Geiger; +Cc: xen-devel

On Mon, Aug 06, 2012 at 12:16:33PM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 25, 2012 at 09:43:57AM -0400, Konrad Rzeszutek Wilk wrote:
> > On Wed, Jul 25, 2012 at 02:30:00PM +0200, Tobias Geiger wrote:
> > > Hi!
> > > 
> > > i notice a serious regression with 3.5 as Dom0 kernel (3.4 was rock
> > > stable):
> > > 
> > > 1st: only the GPU PCI Passthrough works, the PCI USB Controller is
> > > not recognized within the DomU (HVM Win7 64)
> > > Dom0 cmdline is:
> > > ro root=LABEL=dom0root xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7)
> > > security=apparmor noirqdebug nouveau.msi=1
> > > 
> > > Only 8:00.0 and 8:00.1 get passed through without problems, all the
> > > USB Controller IDs are not correctly passed through and get a
> > > exclamation mark within the win7 device manager ("could not be
> > > started").
> > 
> > Ok, but they do get passed in though? As in, QEMU sees them.
> > If you boot a Live Ubuntu/Fedora CD within the guest with the PCI
> > passed in devices do you see them? Meaning lspci shows them?
> > 
> > 
> > Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?
> > 
> > > 
> > > 
> > > > 2nd: After DomU shutdown, Dom0 panics (100% reproducible) - sorry
> > > that i have no full stacktrace, all i have is a "screenshot" which i
> > > uploaded here:
> > > http://imageshack.us/photo/my-images/52/img20120724235921.jpg/
> > 
> > Ugh, that looks like somebody removed a large chunk of a pagetable.
> > 
> > Hmm. Are you using dom0_mem=max parameter? If not, can you try
> > that and also disable ballooning in the xm/xl config file pls?
> > 
> > > 
> > > 
> > > With 3.4 both issues were not there - everything worked perfectly.
> > > Tell me which debugging info you need, i may be able to re-install
> > > my netconsole to get the full stacktrace (but i had not much luck
> > > with netconsole regarding kernel panics - rarely this info gets sent
> > > before the "panic"...)
> 
> So I am able to reproduce this with a Windows 7 with an ATI 4870 and
> an Intel 82574L NIC. The video card still works, but the NIC stopped
> working. Same version of hypervisor/toolstack/etc, only change is the
> kernel (v3.4.6->v3.5).
> 
> Time to get my hands greasy with this..

And it's due to a patch I added in v3.4 (cd9db80e5257682a7f7ab245a2459648b3c8d268)
- which did not work properly in v3.4, but with v3.5 got it working
(977f857ca566a1e68045fcbb7cfc9c4acb077cf0), which causes v3.5 to not work
anymore.

Anyhow, for right now just revert cd9db80e5257682a7f7ab245a2459648b3c8d268
and it should work for you.
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-07-25 13:43 ` Konrad Rzeszutek Wilk
  2012-07-25 14:20   ` Tobias Geiger
  2012-07-25 17:59   ` Tobias Geiger
@ 2012-08-06 16:16   ` Konrad Rzeszutek Wilk
  2012-08-20 23:30     ` Konrad Rzeszutek Wilk
  2 siblings, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-08-06 16:16 UTC (permalink / raw)
  To: Tobias Geiger; +Cc: xen-devel

On Wed, Jul 25, 2012 at 09:43:57AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 25, 2012 at 02:30:00PM +0200, Tobias Geiger wrote:
> > Hi!
> > 
> > i notice a serious regression with 3.5 as Dom0 kernel (3.4 was rock
> > stable):
> > 
> > 1st: only the GPU PCI Passthrough works, the PCI USB Controller is
> > not recognized within the DomU (HVM Win7 64)
> > Dom0 cmdline is:
> > ro root=LABEL=dom0root xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7)
> > security=apparmor noirqdebug nouveau.msi=1
> > 
> > Only 8:00.0 and 8:00.1 get passed through without problems, all the
> > USB Controller IDs are not correctly passed through and get a
> > exclamation mark within the win7 device manager ("could not be
> > started").
> 
> Ok, but they do get passed in though? As in, QEMU sees them.
> If you boot a Live Ubuntu/Fedora CD within the guest with the PCI
> passed in devices do you see them? Meaning lspci shows them?
> 
> 
> Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?
> 
> > 
> > 
> > > 2nd: After DomU shutdown, Dom0 panics (100% reproducible) - sorry
> > that i have no full stacktrace, all i have is a "screenshot" which i
> > uploaded here:
> > http://imageshack.us/photo/my-images/52/img20120724235921.jpg/
> 
> Ugh, that looks like somebody removed a large chunk of a pagetable.
> 
> Hmm. Are you using dom0_mem=max parameter? If not, can you try
> that and also disable ballooning in the xm/xl config file pls?
> 
> > 
> > 
> > With 3.4 both issues were not there - everything worked perfectly.
> > Tell me which debugging info you need, i may be able to re-install
> > my netconsole to get the full stacktrace (but i had not much luck
> > with netconsole regarding kernel panics - rarely this info gets sent
> > before the "panic"...)

So I am able to reproduce this with a Windows 7 guest with an ATI 4870 and
an Intel 82574L NIC. The video card still works, but the NIC stopped
working. Same version of hypervisor/toolstack/etc, only change is the
kernel (v3.4.6->v3.5).

Time to get my hands greasy with this..


* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-07-25 17:59   ` Tobias Geiger
@ 2012-07-25 18:09     ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-07-25 18:09 UTC (permalink / raw)
  To: Tobias Geiger; +Cc: xen-devel

On Wed, Jul 25, 2012 at 07:59:33PM +0200, Tobias Geiger wrote:
> The dom0 panic when shutting down domu also happens with 
> 
> dom0_mem=4096M
> and also with
> dom0_mem=4096M,max:4096M
> 
> and both times:
> pc:~# cat /etc/xen/xl.conf   | grep autoballoon
> autoballoon=0
> 
> 
> :(

OK, so the balloon driver is still being activated somehow.
> 
> > Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?

^^^ anything there?


* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-07-25 13:43 ` Konrad Rzeszutek Wilk
  2012-07-25 14:20   ` Tobias Geiger
@ 2012-07-25 17:59   ` Tobias Geiger
  2012-07-25 18:09     ` Konrad Rzeszutek Wilk
  2012-08-06 16:16   ` Konrad Rzeszutek Wilk
  2 siblings, 1 reply; 20+ messages in thread
From: Tobias Geiger @ 2012-07-25 17:59 UTC (permalink / raw)
  To: xen-devel; +Cc: Konrad Rzeszutek Wilk

The dom0 panic when shutting down domu also happens with 

dom0_mem=4096M
and also with
dom0_mem=4096M,max:4096M

and both times:
pc:~# cat /etc/xen/xl.conf   | grep autoballoon
autoballoon=0


:(
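
To keep the dom0_mem variants tried so far apart, here is a minimal Python sketch that splits a dom0_mem value into its initial and max: parts and reports how much room is left to balloon up. It only handles the forms that appear in this thread (sizes with an explicit K/M/G suffix, optional min:/max: items); the function names are made up for illustration, and anything beyond those forms is rejected rather than guessed at.

```python
# Sketch only: covers the dom0_mem forms seen in this thread, nothing more.
_UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def _size(text):
    """Convert e.g. '4096M' to bytes; an explicit K/M/G suffix is required."""
    unit = text[-1:].upper()
    number = text[:-1]
    if unit not in _UNITS or not number.isdigit():
        raise ValueError("unsupported size: %r" % text)
    return int(number) * _UNITS[unit]


def parse_dom0_mem(value):
    """Return the initial/min/max sizes in bytes (None where not given)."""
    result = {"initial": None, "min": None, "max": None}
    for item in value.split(","):
        key, sep, amount = item.partition(":")
        if sep:
            if key not in ("min", "max"):
                raise ValueError("unknown item: %r" % item)
            result[key] = _size(amount)
        else:
            result["initial"] = _size(item)
    return result


def balloon_headroom(value):
    """Bytes between the initial size and max, or None if either is missing."""
    parsed = parse_dom0_mem(value)
    if parsed["initial"] is None or parsed["max"] is None:
        return None
    return parsed["max"] - parsed["initial"]


if __name__ == "__main__":
    for setting in ("4096M,max:7680M", "4096M", "4096M,max:4096M"):
        print(setting, balloon_headroom(setting))
```

Only the first of the three settings leaves an explicit gap between the initial size and the maximum; the sketch says nothing about why the balloon worker still runs with the other two.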

FWIW here is the diff between the dom0 kernel configs of 3.4 and 3.5:

pc:~# diff /tmp/3.4.config /usr/src/3.5/linux-3.5/.config
3c3
< # Linux/x86_64 3.4.0 Kernel Configuration
---
> # Linux/x86_64 3.5.0 Kernel Configuration
12,16d11
< CONFIG_GENERIC_CMOS_UPDATE=y
< CONFIG_CLOCKSOURCE_WATCHDOG=y
< CONFIG_GENERIC_CLOCKEVENTS=y
< CONFIG_ARCH_CLOCKSOURCE_DATA=y
< CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
31d25
< CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
33d26
< CONFIG_GENERIC_TIME_VSYSCALL=y
51d43
< # CONFIG_KTIME_SCALAR is not set
52a45
> CONFIG_ARCH_SUPPORTS_UPROBES=y
55a49
> CONFIG_BUILDTIME_EXTABLE_SORT=y
104a99,113
> CONFIG_CLOCKSOURCE_WATCHDOG=y
> CONFIG_ARCH_CLOCKSOURCE_DATA=y
> CONFIG_GENERIC_TIME_VSYSCALL=y
> CONFIG_GENERIC_CLOCKEVENTS=y
> CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
> CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
> CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
> CONFIG_GENERIC_CMOS_UPDATE=y
> 
> #
> # Timers subsystem
> #
> CONFIG_TICK_ONESHOT=y
> CONFIG_NO_HZ=y
> CONFIG_HIGH_RES_TIMERS=y
111a121
> CONFIG_RCU_FANOUT_LEAF=16
145d154
< CONFIG_USER_NS=y
188d196
< # CONFIG_PERF_COUNTERS is not set
210a219
> CONFIG_GENERIC_SMP_IDLE_THREAD=y
222a232,233
> CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
> CONFIG_SECCOMP_FILTER=y
322,326d332
< CONFIG_TICK_ONESHOT=y
< CONFIG_NO_HZ=y
< CONFIG_HIGH_RES_TIMERS=y
< CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
< CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
444a451
> CONFIG_CROSS_MEMORY_ATTACH=y
445a453
> CONFIG_FRONTSWAP=y
486a495,496
> # CONFIG_PM_AUTOSLEEP is not set
> # CONFIG_PM_WAKELOCKS is not set
504c514
< CONFIG_ACPI_PROCESSOR_AGGREGATOR=m
---
> # CONFIG_ACPI_PROCESSOR_AGGREGATOR is not set
540,543c550,553
< CONFIG_X86_PCC_CPUFREQ=m
< CONFIG_X86_ACPI_CPUFREQ=m
< CONFIG_X86_POWERNOW_K8=m
< CONFIG_X86_SPEEDSTEP_CENTRINO=m
---
> # CONFIG_X86_PCC_CPUFREQ is not set
> # CONFIG_X86_ACPI_CPUFREQ is not set
> # CONFIG_X86_POWERNOW_K8 is not set
> # CONFIG_X86_SPEEDSTEP_CENTRINO is not set
621a632
> CONFIG_X86_DEV_DMA_OPS=y
630a642
> CONFIG_XFRM_ALGO=y
773a786
> CONFIG_NETFILTER_XT_TARGET_HMARK=m
891d903
< CONFIG_IP6_NF_QUEUE=m
952d963
< # CONFIG_ECONET is not set
977a989,990
> CONFIG_NET_SCH_CODEL=m
> CONFIG_NET_SCH_FQ_CODEL=m
1018a1032
> CONFIG_BATMAN_ADV_BLA=y
1026d1039
< CONFIG_HAVE_BPF_JIT=y
1082a1096,1097
> CONFIG_NFC_HCI=m
> # CONFIG_NFC_SHDLC is not set
1090a1106
> CONFIG_HAVE_BPF_JIT=y
1253d1268
< # CONFIG_MTD_UBI_DEBUG is not set
1331c1346,1347
< CONFIG_BMP085=m
---
> CONFIG_BMP085=y
> CONFIG_BMP085_I2C=m
1358a1375
> CONFIG_INTEL_MEI=m
1535a1553
> CONFIG_SBP_TARGET=m
1558a1577
> CONFIG_NET_TEAM_MODE_LOADBALANCE=m
1695c1714
< CONFIG_STMMAC_PLATFORM=m
---
> CONFIG_STMMAC_PLATFORM=y
1712a1732,1734
> CONFIG_NET_VENDOR_WIZNET=y
> # CONFIG_WIZNET_W5100 is not set
> # CONFIG_WIZNET_W5300 is not set
1753d1774
< # CONFIG_TR is not set
1809a1831
> CONFIG_INPUT_MATRIXKMAP=m
1837a1860
> CONFIG_KEYBOARD_LM8333=m
1900a1924
> # CONFIG_INPUT_MC13783_PWRBUTTON is not set
2015d2038
< CONFIG_RAMOOPS=m
2139a2163
> CONFIG_GPIO_ICH=m
2321a2346
> CONFIG_SENSORS_INA2XX=m
2340a2366
> CONFIG_SENSORS_MC13783_ADC=m
2370a2397
> CONFIG_IE6XX_WDT=m
2421c2448
< CONFIG_MFD_CORE=m
---
> CONFIG_MFD_CORE=y
2427a2455
> CONFIG_MFD_LM3533=m
2442a2471
> # CONFIG_MFD_MAX77693 is not set
2447c2476
< CONFIG_MFD_WM8400=m
---
> # CONFIG_MFD_WM8400 is not set
2453a2483,2485
> CONFIG_MFD_MC13783=m
> CONFIG_MFD_MC13XXX=m
> CONFIG_MFD_MC13XXX_I2C=m
2457a2490
> CONFIG_LPC_ICH=y
2464a2498
> # CONFIG_MFD_PALMAS is not set
2472a2507,2509
> CONFIG_REGULATOR_MC13XXX_CORE=m
> CONFIG_REGULATOR_MC13783=m
> CONFIG_REGULATOR_MC13892=m
2485d2521
< CONFIG_REGULATOR_WM8400=m
2683c2719
< CONFIG_VIDEO_EM28XX_RC=y
---
> CONFIG_VIDEO_EM28XX_RC=m
2690d2725
< CONFIG_USB_ET61X251=m
2782a2818,2820
> CONFIG_DRM_AST=m
> # CONFIG_DRM_MGAG200 is not set
> CONFIG_DRM_CIRRUS_QEMU=m
2853a2892,2894
> CONFIG_FB_AUO_K190X=m
> CONFIG_FB_AUO_K1900=m
> CONFIG_FB_AUO_K1901=m
2859a2901
> CONFIG_BACKLIGHT_LM3533=m
2986d3027
< # CONFIG_SND_HDA_ENABLE_REALTEK_QUIRKS is not set
3050a3092
> CONFIG_SND_SOC_CS42L52=m
3058a3101
> CONFIG_SND_SOC_LM49453=m
3079d3121
< CONFIG_SND_SOC_WM8400=m
3117a3160,3161
> CONFIG_SND_SOC_MC13783=m
> CONFIG_SND_SOC_ML26124=m
3118a3163
> CONFIG_SND_SIMPLE_CARD=m
3137,3140d3181
< CONFIG_HID_SUPPORT=y
< CONFIG_HID=y
< # CONFIG_HID_BATTERY_STRENGTH is not set
< # CONFIG_HIDRAW is not set
3143c3184
< # USB Input Devices
---
> # HID support
3145,3147c3186,3189
< CONFIG_USB_HID=y
< # CONFIG_HID_PID is not set
< # CONFIG_USB_HIDDEV is not set
---
> CONFIG_HID=y
> # CONFIG_HID_BATTERY_STRENGTH is not set
> # CONFIG_HIDRAW is not set
> CONFIG_HID_GENERIC=m
3155a3198
> CONFIG_HID_AUREAL=m
3216a3260,3266
> 
> #
> # USB HID support
> #
> CONFIG_USB_HID=y
> # CONFIG_HID_PID is not set
> # CONFIG_USB_HIDDEV is not set
3230,3231d3279
< # CONFIG_USB_DEVICEFS is not set
< # CONFIG_USB_DEVICE_CLASS is not set
3269a3318,3321
> CONFIG_USB_CHIPIDEA=m
> # CONFIG_USB_CHIPIDEA_UDC is not set
> # CONFIG_USB_CHIPIDEA_HOST is not set
> # CONFIG_USB_CHIPIDEA_DEBUG is not set
3382a3435
> CONFIG_USB_SERIAL_QT2=m
3408a3462,3466
> 
> #
> # USB Physical Layer drivers
> #
> CONFIG_USB_ISP1301=m
3414a3473,3476
> 
> #
> # USB Peripheral Controller
> #
3419d3480
< CONFIG_USB_CI13XXX_PCI=m
3443a3505
> CONFIG_USB_GADGET_TARGET=m
3500a3563
> CONFIG_LEDS_LM3533=m
3514a3578
> CONFIG_LEDS_MC13783=m
3530a3595
> CONFIG_LEDS_TRIGGER_TRANSIENT=m
3618a3684
> CONFIG_RTC_DRV_MC13XXX=m
3654a3721
> # CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES is not set
3699,3700d3765
< # CONFIG_USB_SERIAL_QUATECH_USB2 is not set
< # CONFIG_VME_BUS is not set
3702d3766
< # CONFIG_IIO is not set
3737d3800
< CONFIG_INTEL_MEI=m
3760a3824,3833
> CONFIG_IPACK_BUS=m
> CONFIG_BOARD_TPCI200=m
> CONFIG_SERIAL_IPOCTAL=m
> CONFIG_WIMAX_GDM72XX=m
> # CONFIG_WIMAX_GDM72XX_QOS is not set
> # CONFIG_WIMAX_GDM72XX_K_MODE is not set
> # CONFIG_WIMAX_GDM72XX_WIMAX2 is not set
> CONFIG_WIMAX_GDM72XX_USB=y
> # CONFIG_WIMAX_GDM72XX_SDIO is not set
> # CONFIG_WIMAX_GDM72XX_USB_PM is not set
3787c3860
< CONFIG_INTEL_MENLOW=m
---
> # CONFIG_INTEL_MENLOW is not set
3843a3917,3925
> CONFIG_EXTCON=m
> 
> #
> # Extcon Device Drivers
> #
> CONFIG_EXTCON_GPIO=m
> # CONFIG_MEMORY is not set
> # CONFIG_IIO is not set
> # CONFIG_VME_BUS is not set
3989d4070
< # CONFIG_UBIFS_FS_XATTR is not set
3993d4073
< # CONFIG_UBIFS_FS_DEBUG is not set
4015a4096
> CONFIG_NFS_V2=y
4087a4169,4179
> CONFIG_NLS_MAC_ROMAN=m
> CONFIG_NLS_MAC_CELTIC=m
> CONFIG_NLS_MAC_CENTEURO=m
> CONFIG_NLS_MAC_CROATIAN=m
> CONFIG_NLS_MAC_CYRILLIC=m
> CONFIG_NLS_MAC_GAELIC=m
> CONFIG_NLS_MAC_GREEK=m
> CONFIG_NLS_MAC_ICELAND=m
> CONFIG_NLS_MAC_INUIT=m
> CONFIG_NLS_MAC_ROMANIAN=m
> CONFIG_NLS_MAC_TURKISH=m
4102a4195
> # CONFIG_READABLE_ASM is not set
4110a4204,4205
> # CONFIG_PANIC_ON_OOPS is not set
> CONFIG_PANIC_ON_OOPS_VALUE=0
4188a4284,4285
> # CONFIG_UPROBE_EVENT is not set
> # CONFIG_PROBE_EVENTS is not set
4393a4491
> CONFIG_HAVE_KVM_MSI=y
4406a4505,4506
> CONFIG_GENERIC_STRNCPY_FROM_USER=y
> CONFIG_GENERIC_STRNLEN_USER=y
4460a4561
> # CONFIG_DDR is not set
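
A good part of the diff above is options that only moved within the file (CONFIG_TICK_ONESHOT, CONFIG_NO_HZ, CONFIG_HID and so on). A minimal Python sketch, assuming the usual .config line forms ("CONFIG_X=value" and "# CONFIG_X is not set"), compares the two files by option name so that only real changes remain. The function names are hypothetical and the sample input is a tiny excerpt, not the full configs.

```python
import re

# "CONFIG_FOO=y" style lines and "# CONFIG_FOO is not set" style lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {option: value}; explicitly unset options map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def config_changes(old_text, new_text):
    """List (option, old, new) for every option whose value differs.

    An option missing from one file is reported as None on that side,
    so an option that merely moved produces no entry.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    changes = []
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changes.append((name, old.get(name), new.get(name)))
    return changes


if __name__ == "__main__":
    old_cfg = "CONFIG_TICK_ONESHOT=y\nCONFIG_X86_ACPI_CPUFREQ=m\nCONFIG_USER_NS=y\n"
    new_cfg = (
        "CONFIG_X86_DEV_DMA_OPS=y\n"
        "# CONFIG_X86_ACPI_CPUFREQ is not set\n"
        "CONFIG_TICK_ONESHOT=y\n"
    )
    for name, before, after in config_changes(old_cfg, new_cfg):
        print(name, before, "->", after)
```

Run against the real 3.4 and 3.5 configs, this should leave a much shorter list to look through, for example the cpufreq drivers that changed from =m to "not set".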





On Wednesday, 25 July 2012, 15:43:57, Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 25, 2012 at 02:30:00PM +0200, Tobias Geiger wrote:
> > Hi!
> > 
> > i notice a serious regression with 3.5 as Dom0 kernel (3.4 was rock
> > stable):
> > 
> > 1st: only the GPU PCI Passthrough works, the PCI USB Controller is
> > not recognized within the DomU (HVM Win7 64)
> > Dom0 cmdline is:
> > ro root=LABEL=dom0root
> > xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7)
> > security=apparmor noirqdebug nouveau.msi=1
> > 
> > Only 8:00.0 and 8:00.1 get passed through without problems, all the
> > USB Controller IDs are not correctly passed through and get a
> > exclamation mark within the win7 device manager ("could not be
> > started").
> 
> Ok, but they do get passed in though? As in, QEMU sees them.
> If you boot a Live Ubuntu/Fedora CD within the guest with the PCI
> passed in devices do you see them? Meaning lspci shows them?
> 
> 
> Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?
> 
> > 2nd: After DomU shutdown, Dom0 panics (100% reproducible) - sorry
> > that i have no full stacktrace, all i have is a "screenshot" which i
> > uploaded here:
> > http://imageshack.us/photo/my-images/52/img20120724235921.jpg/
> 
> Ugh, that looks like somebody removed a large chunk of a pagetable.
> 
> Hmm. Are you using dom0_mem=max parameter? If not, can you try
> that and also disable ballooning in the xm/xl config file pls?
> 
> > With 3.4 both issues were not there - everything worked perfectly.
> > Tell me which debugging info you need, i may be able to re-install
> > my netconsole to get the full stacktrace (but i had not much luck
> > with netconsole regarding kernel panics - rarely this info gets sent
> > before the "panic"...)
> > 
> > Greetings
> > Tobias
> > 


* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-07-25 14:20   ` Tobias Geiger
@ 2012-07-25 14:32     ` Tobias Geiger
  0 siblings, 0 replies; 20+ messages in thread
From: Tobias Geiger @ 2012-07-25 14:32 UTC (permalink / raw)
  To: xen-devel

It will take some time for me to re-test with "dom0_mem=4096M" (i.e.
w/o a "max" range), because i forgot a "panic=X" option on the Dom0
cmdline, so right now the machine is waiting for me to press the
reset-button ... :(

I'll post my results asap.

Greetings

On 25.07.2012 16:20, Tobias Geiger wrote:
> On 25.07.2012 15:43, Konrad Rzeszutek Wilk wrote:
>> On Wed, Jul 25, 2012 at 02:30:00PM +0200, Tobias Geiger wrote:
>>> Hi!
>>>
>>> i notice a serious regression with 3.5 as Dom0 kernel (3.4 was rock
>>> stable):
>>>
>>> 1st: only the GPU PCI Passthrough works, the PCI USB Controller is
>>> not recognized within the DomU (HVM Win7 64)
>>> Dom0 cmdline is:
>>> ro root=LABEL=dom0root 
>>> xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7)
>>> security=apparmor noirqdebug nouveau.msi=1
>>>
>>> Only 8:00.0 and 8:00.1 get passed through without problems, all the
>>> USB Controller IDs are not correctly passed through and get a
>>> exclamation mark within the win7 device manager ("could not be
>>> started").
>>
>> Ok, but they do get passed in though? As in, QEMU sees them.
>> If you boot a Live Ubuntu/Fedora CD within the guest with the PCI
>> passed in devices do you see them? Meaning lspci shows them?
>>
>
> Yes, they get passed through:
>
> pc:~# xl pci-list win
> Vdev Device
> 05.0 0000:08:00.0
> 06.0 0000:08:00.1
> 07.0 0000:00:1d.0
> 08.0 0000:00:1d.1
> 09.0 0000:00:1d.2
> 0a.0 0000:00:1d.7
>
> but *:1d.* gets an exclamation mark within win7...
>
> sorry i have no linux hvm at hand right now to do a lspci.
>
>>
>> Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?
>>
>>>
>>>
>>> 2nd: After DomU shutdown, Dom0 panics (100% reproducible) - sorry
>>> that i have no full stacktrace, all i have is a "screenshot" which i
>>> uploaded here:
>>> http://imageshack.us/photo/my-images/52/img20120724235921.jpg/
>>
>> Ugh, that looks like somebody removed a large chunk of a pagetable.
>>
>> Hmm. Are you using dom0_mem=max parameter? If not, can you try
>> that and also disable ballooning in the xm/xl config file pls?
>
> i already have/had:
> xen_commandline        : watchdog dom0_mem=4096M,max:7680M dom0_vcpus_pin
>
> but autoballooning was on in xl.conf, i disabled it:
>
> but still i get a panic as soon as domu is shut down:
> (luckily i happened to press "enter" on the dmesg command exactly at
> the right time to get the full stacktrace just before my ssh
> connection died...)
>
> pc:~# dmesg
> [  206.553547] xen-blkback:backend/vbd/1/832: prepare for reconnect
> [  207.421690] xen-blkback:backend/vbd/1/768: prepare for reconnect
> [  208.248271] vif vif-1-0: 2 reading script
> [  208.252882] br0: port 3(vif1.0) entered disabled state
> [  208.253584] br0: port 3(vif1.0) entered disabled state
> [  213.115052] ------------[ cut here ]------------
> [  213.115071] kernel BUG at drivers/xen/balloon.c:359!
> [  213.115079] invalid opcode: 0000 [#1] PREEMPT SMP
> [  213.115091] CPU 4
> [  213.115094] Modules linked in: uvcvideo snd_seq_midi snd_usb_audio
> snd_usbmidi_lib snd_hwdep snd_rawmidi videobuf2_vmalloc
> videobuf2_memops videobuf2_core videodev joydev hid_generic
> gpio_ich [last unloaded: scsi_wait_scan]
> [  213.115124]
> [  213.115126] Pid: 1191, comm: kworker/4:1 Not tainted 3.5.0 #2
>           /DX58SO
> [  213.115135] RIP: e030:[<ffffffff81448105>]  [<ffffffff81448105>]
> balloon_process+0x385/0x3a0
> [  213.115146] RSP: e02b:ffff88012e7f7dc0  EFLAGS: 00010213
> [  213.115150] RAX: 0000000220be8000 RBX: 0000000000000000 RCX:
> 0000000000000008
> [  213.115158] RDX: ffff88010bb02000 RSI: 00000000000001cb RDI:
> 000000000020efcb
> [  213.115164] RBP: ffff88012e7f7e20 R08: ffff88014068e140 R09:
> 0000000000000001
> [  213.115169] R10: 0000000000000001 R11: 0000000000000000 R12:
> 0000160000000000
> [  213.115175] R13: 0000000000000001 R14: 000000000020efcb R15:
> ffffea00083bf2c0
> [  213.115183] FS:  00007f31ea7f7700(0000) GS:ffff880140680000(0000)
> knlGS:0000000000000000
> [  213.115189] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  213.115193] CR2: 00007f31ea193986 CR3: 0000000001e0c000 CR4:
> 0000000000002660
> [  213.115199] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [  213.115204] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [  213.115210] Process kworker/4:1 (pid: 1191, threadinfo
> ffff88012e7f6000, task ffff88012ec65b00)
> [  213.115216] Stack:
> [  213.115218]  000000000008a6ba 0000000000000001 ffffffff8200ea80
> 0000000000000001
> [  213.115331]  0000000000000000 0000000000007ff0 ffff88012e7f7e00
> ffff8801312fb100
> [  213.115341]  ffff880140697000 ffff88014068e140 0000000000000000
> ffffffff81e587c0
> [  213.115350] Call Trace:
> [  213.115356]  [<ffffffff8106752b>] process_one_work+0x12b/0x450
> [  213.115362]  [<ffffffff81447d80>] ? 
> decrease_reservation+0x320/0x320
> [  213.115368]  [<ffffffff810688ae>] worker_thread+0x12e/0x2d0
> [  213.115374]  [<ffffffff81068780>] ? 
> manage_workers.isra.26+0x1f0/0x1f0
> [  213.115380]  [<ffffffff8106db6e>] kthread+0x8e/0xa0
> [  213.115386]  [<ffffffff8184e324>] kernel_thread_helper+0x4/0x10
> [  213.115394]  [<ffffffff8184c7bc>] ? retint_restore_args+0x5/0x6
> [  213.115400]  [<ffffffff8184e320>] ? gs_change+0x13/0x13
> [  213.115406] Code: 01 15 80 69 bc 00 48 29 d0 48 89 05 7e 69 bc 00
> e9 31 fd ff ff 0f 0b 0f 0b 4c 89 f7 e8 35 33 bc ff
> 48 83 f8 ff 0f 84 2b fe ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 48
> 83 c1 01 e9 c2 fd ff ff 0f
> [  213.115509] RIP  [<ffffffff81448105>] balloon_process+0x385/0x3a0
> [  213.115521]  RSP <ffff88012e7f7dc0>
> [  213.126036] ---[ end trace 38b78364333593e7 ]---
> [  213.126061] BUG: unable to handle kernel paging request at
> fffffffffffffff8
> [  213.126072] IP: [<ffffffff8106e07c>] kthread_data+0xc/0x20
> [  213.126079] PGD 1e0e067 PUD 1e0f067 PMD 0
> [  213.126087] Oops: 0000 [#2] PREEMPT SMP
> [  213.126094] CPU 4
> [  213.126097] Modules linked in: uvcvideo snd_seq_midi snd_usb_audio
> snd_usbmidi_lib snd_hwdep snd_rawmidi videobuf2_vmalloc
> videobuf2_memops videobuf2_core videodev joydev hid_generic
> gpio_ich [last unloaded: scsi_wait_scan]
> [  213.126151]
> [  213.126154] Pid: 1191, comm: kworker/4:1 Tainted: G      D
> 3.5.0 #2                  /DX58SO
> [  213.126175] RIP: e030:[<ffffffff8106e07c>]  [<ffffffff8106e07c>]
> kthread_data+0xc/0x20
> [  213.126192] RSP: e02b:ffff88012e7f7a90  EFLAGS: 00010092
> [  213.126203] RAX: 0000000000000000 RBX: 0000000000000004 RCX:
> 0000000000000004
> [  213.126212] RDX: ffffffff81fcba40 RSI: 0000000000000004 RDI:
> ffff88012ec65b00
> [  213.126225] RBP: ffff88012e7f7aa8 R08: 0000000000989680 R09:
> ffffffff81fcba40
> [  213.126239] R10: ffffffff813b0d60 R11: 0000000000000000 R12:
> ffff8801406936c0
> [  213.126254] R13: 0000000000000004 R14: ffff88012ec65af0 R15:
> ffff88012ec65b00
> [  213.126270] FS:  00007f31ea7f7700(0000) GS:ffff880140680000(0000)
> knlGS:0000000000000000
> [  213.126284] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  213.126296] CR2: fffffffffffffff8 CR3: 0000000001e0c000 CR4:
> 0000000000002660
> [  213.126310] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [  213.126325] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [  213.126337] Process kworker/4:1 (pid: 1191, threadinfo
> ffff88012e7f6000, task ffff88012ec65b00)
> [  213.126354] Stack:
> [  213.126360]  ffffffff810698d0 ffff88012e7f7aa8 ffff88012ec65ed8
> ffff88012e7f7b18
> [  213.126381]  ffffffff8184ad32 ffff88012e7f7fd8 ffff88012ec65b00
> ffff88012e7f7fd8
> [  213.126403]  ffff88012e7f7fd8 ffff8801312f94e0 ffff88012ec65b00
> ffff88012ec660f0
> [  213.126422] Call Trace:
> [  213.126427]  [<ffffffff810698d0>] ? wq_worker_sleeping+0x10/0xa0
> [  213.126435]  [<ffffffff8184ad32>] __schedule+0x592/0x7d0
> [  213.126443]  [<ffffffff8184b094>] schedule+0x24/0x70
> [  213.126449]  [<ffffffff81051582>] do_exit+0x5b2/0x910
> [  213.126457]  [<ffffffff8183e941>] ? printk+0x48/0x4a
> [  213.126464]  [<ffffffff8100ad02>] ? check_events+0x12/0x20
> [  213.126472]  [<ffffffff810175a1>] oops_end+0x71/0xa0
> [  213.126478]  [<ffffffff81017713>] die+0x53/0x80
> [  213.126484]  [<ffffffff81014418>] do_trap+0xb8/0x160
> [  213.126490]  [<ffffffff81014713>] do_invalid_op+0xa3/0xb0
> [  213.126499]  [<ffffffff81448105>] ? balloon_process+0x385/0x3a0
> [  213.127254]  [<ffffffff81085f52>] ? load_balance+0xd2/0x800
> [  213.127940]  [<ffffffff8108116d>] ? cpuacct_charge+0x6d/0xb0
> [  213.128621]  [<ffffffff8184e19b>] invalid_op+0x1b/0x20
> [  213.129304]  [<ffffffff81448105>] ? balloon_process+0x385/0x3a0
> [  213.129962]  [<ffffffff8106752b>] process_one_work+0x12b/0x450
> [  213.130590]  [<ffffffff81447d80>] ? 
> decrease_reservation+0x320/0x320
> [  213.131226]  [<ffffffff810688ae>] worker_thread+0x12e/0x2d0
> [  213.131856]  [<ffffffff81068780>] ? 
> manage_workers.isra.26+0x1f0/0x1f0
> [  213.132482]  [<ffffffff8106db6e>] kthread+0x8e/0xa0
> [  213.133099]  [<ffffffff8184e324>] kernel_thread_helper+0x4/0x10
> [  213.133718]  [<ffffffff8184c7bc>] ? retint_restore_args+0x5/0x6
> [  213.134338]  [<ffffffff8184e320>] ? gs_change+0x13/0x13
> [  213.134954] Code: e0 ff ff 01 48 8b 80 38 e0 ff ff a8 08 0f 84 3d
> ff ff ff e8 97 cf 7d 00 e9 33 ff ff ff 66 90 48 8b
> 87 80 03 00 00 55 48 89 e5 5d <48> 8b 40 f8 c3 66 66 66 66 66 66 2e
> 0f 1f 84 00 00 00 00 00 55
> [  213.135647] RIP  [<ffffffff8106e07c>] kthread_data+0xc/0x20
> [  213.136320]  RSP <ffff88012e7f7a90>
> [  213.136967] CR2: fffffffffffffff8
> [  213.137610] ---[ end trace 38b78364333593e8 ]---
> [  213.137611] Fixing recursive fault but reboot is needed!
>
> seems like a ballooning thing - i will try with only a "max"
> setting, not a range ... stay tuned ;)
>
>
>>
>>>
>>>
>>> With 3.4 both issues were not there - everything worked perfectly.
>>> Tell me which debugging info you need, i may be able to re-install
>>> my netconsole to get the full stacktrace (but i had not much luck
>>> with netconsole regarding kernel panics - rarely this info gets sent
>>> before the "panic"...)
>>>
>>> Greetings
>>> Tobias
>>>


* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-07-25 13:43 ` Konrad Rzeszutek Wilk
@ 2012-07-25 14:20   ` Tobias Geiger
  2012-07-25 14:32     ` Tobias Geiger
  2012-07-25 17:59   ` Tobias Geiger
  2012-08-06 16:16   ` Konrad Rzeszutek Wilk
  2 siblings, 1 reply; 20+ messages in thread
From: Tobias Geiger @ 2012-07-25 14:20 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On 25.07.2012 15:43, Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 25, 2012 at 02:30:00PM +0200, Tobias Geiger wrote:
>> Hi!
>>
>> i notice a serious regression with 3.5 as Dom0 kernel (3.4 was rock
>> stable):
>>
>> 1st: only the GPU PCI Passthrough works, the PCI USB Controller is
>> not recognized within the DomU (HVM Win7 64)
>> Dom0 cmdline is:
>> ro root=LABEL=dom0root 
>> xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7)
>> security=apparmor noirqdebug nouveau.msi=1
>>
>> Only 8:00.0 and 8:00.1 get passed through without problems, all the
>> USB Controller IDs are not correctly passed through and get a
>> exclamation mark within the win7 device manager ("could not be
>> started").
>
> Ok, but they do get passed in though? As in, QEMU sees them.
> If you boot a Live Ubuntu/Fedora CD within the guest with the PCI
> passed in devices do you see them? Meaning lspci shows them?
>

Yes, they get passed through:

pc:~# xl pci-list win
Vdev Device
05.0 0000:08:00.0
06.0 0000:08:00.1
07.0 0000:00:1d.0
08.0 0000:00:1d.1
09.0 0000:00:1d.2
0a.0 0000:00:1d.7

but *:1d.* gets an exclamation mark within win7...

sorry i have no linux hvm at hand right now to do a lspci.
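
To double-check that every device hidden on the Dom0 cmdline really shows up in the guest's assignment list, a minimal Python sketch can compare the xen-pciback.hide= list with the "xl pci-list" output. It assumes the two-column "Vdev Device" layout shown above (other xl versions may print something different), and the helper names are made up for illustration.

```python
import re

# PCI address "bus:dev.fn" with an optional four-digit "dddd:" domain prefix.
_BDF = re.compile(
    r"(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{1,2}):([0-9a-fA-F]{2})\.([0-7])"
)


def normalize(bdf):
    """Return the address in full 'dddd:bb:dd.f' form, lower case."""
    match = _BDF.fullmatch(bdf.strip())
    if not match:
        raise ValueError("not a PCI address: %r" % bdf)
    domain, bus, dev, fn = match.groups()
    return "%04x:%02x:%02x.%d" % (
        int(domain or "0", 16), int(bus, 16), int(dev, 16), int(fn)
    )


def hidden_devices(cmdline):
    """Devices listed in xen-pciback.hide=(..)(..) on a kernel cmdline."""
    match = re.search(r"xen-pciback\.hide=((?:\([^)]*\))+)", cmdline)
    if not match:
        return []
    return [normalize(item) for item in re.findall(r"\(([^)]*)\)", match.group(1))]


def assigned_devices(pci_list_output):
    """Devices from `xl pci-list` output in the two-column form shown above."""
    devices = []
    for line in pci_list_output.splitlines():
        fields = line.split()
        if len(fields) == 2 and _BDF.fullmatch(fields[1]):
            devices.append(normalize(fields[1]))
    return devices


def missing_assignments(cmdline, pci_list_output):
    """Hidden devices that do not appear in the guest's pci-list."""
    assigned = set(assigned_devices(pci_list_output))
    return sorted(d for d in hidden_devices(cmdline) if d not in assigned)


if __name__ == "__main__":
    cmdline = (
        "ro root=LABEL=dom0root "
        "xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7) "
        "security=apparmor noirqdebug nouveau.msi=1"
    )
    pci_list = """Vdev Device
05.0 0000:08:00.0
06.0 0000:08:00.1
07.0 0000:00:1d.0
08.0 0000:00:1d.1
09.0 0000:00:1d.2
0a.0 0000:00:1d.7
"""
    print(missing_assignments(cmdline, pci_list))
```

For the data in this mail nothing is missing, so the check only confirms that the assignment itself went through; it cannot tell whether the device then starts inside the guest.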

>
> Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?
>
>>
>>
>> 2nd: After DomU shutdown, Dom0 panics (100% reproducible) - sorry
>> that i have no full stacktrace, all i have is a "screenshot" which i
>> uploaded here:
>> http://imageshack.us/photo/my-images/52/img20120724235921.jpg/
>
> Ugh, that looks like somebody removed a large chunk of a pagetable.
>
> Hmm. Are you using dom0_mem=max parameter? If not, can you try
> that and also disable ballooning in the xm/xl config file pls?

i already have/had:
xen_commandline        : watchdog dom0_mem=4096M,max:7680M dom0_vcpus_pin

but autoballooning was on in xl.conf, i disabled it:

but still i get a panic as soon as domu is shut down:
(luckily i happened to press "enter" on the dmesg command exactly at the
right time to get the full stacktrace just before my ssh connection
died...)

pc:~# dmesg
[  206.553547] xen-blkback:backend/vbd/1/832: prepare for reconnect
[  207.421690] xen-blkback:backend/vbd/1/768: prepare for reconnect
[  208.248271] vif vif-1-0: 2 reading script
[  208.252882] br0: port 3(vif1.0) entered disabled state
[  208.253584] br0: port 3(vif1.0) entered disabled state
[  213.115052] ------------[ cut here ]------------
[  213.115071] kernel BUG at drivers/xen/balloon.c:359!
[  213.115079] invalid opcode: 0000 [#1] PREEMPT SMP
[  213.115091] CPU 4
[  213.115094] Modules linked in: uvcvideo snd_seq_midi snd_usb_audio 
snd_usbmidi_lib snd_hwdep snd_rawmidi videobuf2_vm
alloc videobuf2_memops videobuf2_core videodev joydev hid_generic 
gpio_ich [last unloaded: scsi_wait_scan]
[  213.115124]
[  213.115126] Pid: 1191, comm: kworker/4:1 Not tainted 3.5.0 #2        
          /DX58SO
[  213.115135] RIP: e030:[<ffffffff81448105>]  [<ffffffff81448105>] 
balloon_process+0x385/0x3a0
[  213.115146] RSP: e02b:ffff88012e7f7dc0  EFLAGS: 00010213
[  213.115150] RAX: 0000000220be8000 RBX: 0000000000000000 RCX: 
0000000000000008
[  213.115158] RDX: ffff88010bb02000 RSI: 00000000000001cb RDI: 
000000000020efcb
[  213.115164] RBP: ffff88012e7f7e20 R08: ffff88014068e140 R09: 
0000000000000001
[  213.115169] R10: 0000000000000001 R11: 0000000000000000 R12: 
0000160000000000
[  213.115175] R13: 0000000000000001 R14: 000000000020efcb R15: 
ffffea00083bf2c0
[  213.115183] FS:  00007f31ea7f7700(0000) GS:ffff880140680000(0000) 
knlGS:0000000000000000
[  213.115189] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[  213.115193] CR2: 00007f31ea193986 CR3: 0000000001e0c000 CR4: 
0000000000002660
[  213.115199] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[  213.115204] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
0000000000000400
[  213.115210] Process kworker/4:1 (pid: 1191, threadinfo 
ffff88012e7f6000, task ffff88012ec65b00)
[  213.115216] Stack:
[  213.115218]  000000000008a6ba 0000000000000001 ffffffff8200ea80 
0000000000000001
[  213.115331]  0000000000000000 0000000000007ff0 ffff88012e7f7e00 
ffff8801312fb100
[  213.115341]  ffff880140697000 ffff88014068e140 0000000000000000 
ffffffff81e587c0
[  213.115350] Call Trace:
[  213.115356]  [<ffffffff8106752b>] process_one_work+0x12b/0x450
[  213.115362]  [<ffffffff81447d80>] ? decrease_reservation+0x320/0x320
[  213.115368]  [<ffffffff810688ae>] worker_thread+0x12e/0x2d0
[  213.115374]  [<ffffffff81068780>] ? 
manage_workers.isra.26+0x1f0/0x1f0
[  213.115380]  [<ffffffff8106db6e>] kthread+0x8e/0xa0
[  213.115386]  [<ffffffff8184e324>] kernel_thread_helper+0x4/0x10
[  213.115394]  [<ffffffff8184c7bc>] ? retint_restore_args+0x5/0x6
[  213.115400]  [<ffffffff8184e320>] ? gs_change+0x13/0x13
[  213.115406] Code: 01 15 80 69 bc 00 48 29 d0 48 89 05 7e 69 bc 00 e9 
31 fd ff ff 0f 0b 0f 0b 4c 89 f7 e8 35 33 bc ff
48 83 f8 ff 0f 84 2b fe ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 48 83 
c1 01 e9 c2 fd ff ff 0f
[  213.115509] RIP  [<ffffffff81448105>] balloon_process+0x385/0x3a0
[  213.115521]  RSP <ffff88012e7f7dc0>
[  213.126036] ---[ end trace 38b78364333593e7 ]---
[  213.126061] BUG: unable to handle kernel paging request at 
fffffffffffffff8
[  213.126072] IP: [<ffffffff8106e07c>] kthread_data+0xc/0x20
[  213.126079] PGD 1e0e067 PUD 1e0f067 PMD 0
[  213.126087] Oops: 0000 [#2] PREEMPT SMP
[  213.126094] CPU 4
[  213.126097] Modules linked in: uvcvideo snd_seq_midi snd_usb_audio 
snd_usbmidi_lib snd_hwdep snd_rawmidi videobuf2_vmalloc
videobuf2_memops videobuf2_core videodev joydev hid_generic 
gpio_ich [last unloaded: scsi_wait_scan]
[  213.126151]
[  213.126154] Pid: 1191, comm: kworker/4:1 Tainted: G      D      
3.5.0 #2                  /DX58SO
[  213.126175] RIP: e030:[<ffffffff8106e07c>]  [<ffffffff8106e07c>] 
kthread_data+0xc/0x20
[  213.126192] RSP: e02b:ffff88012e7f7a90  EFLAGS: 00010092
[  213.126203] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 
0000000000000004
[  213.126212] RDX: ffffffff81fcba40 RSI: 0000000000000004 RDI: 
ffff88012ec65b00
[  213.126225] RBP: ffff88012e7f7aa8 R08: 0000000000989680 R09: 
ffffffff81fcba40
[  213.126239] R10: ffffffff813b0d60 R11: 0000000000000000 R12: 
ffff8801406936c0
[  213.126254] R13: 0000000000000004 R14: ffff88012ec65af0 R15: 
ffff88012ec65b00
[  213.126270] FS:  00007f31ea7f7700(0000) GS:ffff880140680000(0000) 
knlGS:0000000000000000
[  213.126284] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[  213.126296] CR2: fffffffffffffff8 CR3: 0000000001e0c000 CR4: 
0000000000002660
[  213.126310] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[  213.126325] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
0000000000000400
[  213.126337] Process kworker/4:1 (pid: 1191, threadinfo 
ffff88012e7f6000, task ffff88012ec65b00)
[  213.126354] Stack:
[  213.126360]  ffffffff810698d0 ffff88012e7f7aa8 ffff88012ec65ed8 
ffff88012e7f7b18
[  213.126381]  ffffffff8184ad32 ffff88012e7f7fd8 ffff88012ec65b00 
ffff88012e7f7fd8
[  213.126403]  ffff88012e7f7fd8 ffff8801312f94e0 ffff88012ec65b00 
ffff88012ec660f0
[  213.126422] Call Trace:
[  213.126427]  [<ffffffff810698d0>] ? wq_worker_sleeping+0x10/0xa0
[  213.126435]  [<ffffffff8184ad32>] __schedule+0x592/0x7d0
[  213.126443]  [<ffffffff8184b094>] schedule+0x24/0x70
[  213.126449]  [<ffffffff81051582>] do_exit+0x5b2/0x910
[  213.126457]  [<ffffffff8183e941>] ? printk+0x48/0x4a
[  213.126464]  [<ffffffff8100ad02>] ? check_events+0x12/0x20
[  213.126472]  [<ffffffff810175a1>] oops_end+0x71/0xa0
[  213.126478]  [<ffffffff81017713>] die+0x53/0x80
[  213.126484]  [<ffffffff81014418>] do_trap+0xb8/0x160
[  213.126490]  [<ffffffff81014713>] do_invalid_op+0xa3/0xb0
[  213.126499]  [<ffffffff81448105>] ? balloon_process+0x385/0x3a0
[  213.127254]  [<ffffffff81085f52>] ? load_balance+0xd2/0x800
[  213.127940]  [<ffffffff8108116d>] ? cpuacct_charge+0x6d/0xb0
[  213.128621]  [<ffffffff8184e19b>] invalid_op+0x1b/0x20
[  213.129304]  [<ffffffff81448105>] ? balloon_process+0x385/0x3a0
[  213.129962]  [<ffffffff8106752b>] process_one_work+0x12b/0x450
[  213.130590]  [<ffffffff81447d80>] ? decrease_reservation+0x320/0x320
[  213.131226]  [<ffffffff810688ae>] worker_thread+0x12e/0x2d0
[  213.131856]  [<ffffffff81068780>] ? 
manage_workers.isra.26+0x1f0/0x1f0
[  213.132482]  [<ffffffff8106db6e>] kthread+0x8e/0xa0
[  213.133099]  [<ffffffff8184e324>] kernel_thread_helper+0x4/0x10
[  213.133718]  [<ffffffff8184c7bc>] ? retint_restore_args+0x5/0x6
[  213.134338]  [<ffffffff8184e320>] ? gs_change+0x13/0x13
[  213.134954] Code: e0 ff ff 01 48 8b 80 38 e0 ff ff a8 08 0f 84 3d ff 
ff ff e8 97 cf 7d 00 e9 33 ff ff ff 66 90 48 8b
87 80 03 00 00 55 48 89 e5 5d <48> 8b 40 f8 c3 66 66 66 66 66 66 2e 0f 
1f 84 00 00 00 00 00 55
[  213.135647] RIP  [<ffffffff8106e07c>] kthread_data+0xc/0x20
[  213.136320]  RSP <ffff88012e7f7a90>
[  213.136967] CR2: fffffffffffffff8
[  213.137610] ---[ end trace 38b78364333593e8 ]---
[  213.137611] Fixing recursive fault but reboot is needed!

seems like a ballooning thing - i will try with only a "max" 
setting, not a range ... stay tuned ;)
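
For triage, the two facts that matter most in a trace like the one above are the BUG location and the faulting symbol. A minimal Python sketch that pulls them out of a saved log, assuming the log format shown above; the function name summarize_oops and the regular expressions are illustrative and not part of any kernel tool:

```python
import re

# Patterns written against the log format shown in this thread (an assumption).
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")
RIP_RE = re.compile(r"RIP\s+\[<[0-9a-f]+>\]\s+(?P<sym>[\w.]+)\+(?P<off>0x[0-9a-f]+)/")


def summarize_oops(log: str) -> dict:
    """Return the first BUG location and faulting symbol found in a dmesg dump."""
    summary = {}
    bug = BUG_RE.search(log)
    if bug:
        summary["bug_file"] = bug["file"]
        summary["bug_line"] = int(bug["line"])
    rip = RIP_RE.search(log)
    if rip:
        summary["rip_symbol"] = rip["sym"]
        summary["rip_offset"] = int(rip["off"], 16)
    return summary


# Two lines copied from the trace above.
SAMPLE = (
    "[  213.115071] kernel BUG at drivers/xen/balloon.c:359!\n"
    "[  213.115509] RIP  [<ffffffff81448105>] balloon_process+0x385/0x3a0\n"
)

if __name__ == "__main__":
    print(summarize_oops(SAMPLE))
```

Only the first match of each pattern is reported, so in a log with a follow-on oops (like the second one in kthread_data here) the summary describes the original fault.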


>
>>
>>
>> With 3.4 both issues were not there - everything worked perfectly.
>> Tell me which debugging info you need, i may be able to re-install
>> my netconsole to get the full stacktrace (but i had not much luck
>> with netconsole regarding kernel panics - rarely this info gets sent
>> before the "panic"...)
>>
>> Greetings
>> Tobias
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
  2012-07-25 12:30 Tobias Geiger
@ 2012-07-25 13:43 ` Konrad Rzeszutek Wilk
  2012-07-25 14:20   ` Tobias Geiger
                     ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-07-25 13:43 UTC (permalink / raw)
  To: Tobias Geiger; +Cc: xen-devel

On Wed, Jul 25, 2012 at 02:30:00PM +0200, Tobias Geiger wrote:
> Hi!
> 
> i notice a serious regression with 3.5 as Dom0 kernel (3.4 was rock
> stable):
> 
> 1st: only the GPU PCI Passthrough works, the PCI USB Controller is
> not recognized within the DomU (HVM Win7 64)
> Dom0 cmdline is:
> ro root=LABEL=dom0root xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7)
> security=apparmor noirqdebug nouveau.msi=1
> 
> Only 08:00.0 and 08:00.1 get passed through without problems; none of
> the USB controllers are passed through correctly, and each gets an
> exclamation mark within the Win7 device manager ("could not be
> started").

Ok, but they do get passed in though? As in, QEMU sees them.
If you boot a Live Ubuntu/Fedora CD within the guest with the PCI
passed in devices do you see them? Meaning lspci shows them?


Is the lspci -vvv output in dom0 different from 3.4 vs 3.5?
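
One way to do that comparison is to save the lspci -vvv output under each kernel and diff the two files. A minimal Python sketch, assuming the outputs have already been captured as text; the function name diff_lspci, the labels, and the sample device lines are invented for illustration and are not real output from this machine:

```python
import difflib


def diff_lspci(old_text: str, new_text: str,
               old_label: str = "lspci-3.4", new_label: str = "lspci-3.5") -> list:
    """Return unified-diff lines between two saved `lspci -vvv` dumps."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
    ))


# Hypothetical sample data, NOT taken from the reporter's system.
OLD = "00:1d.0 USB controller: (hypothetical device)\n\tControl: I/O+ Mem+ BusMaster+\n"
NEW = "00:1d.0 USB controller: (hypothetical device)\n\tControl: I/O+ Mem+ BusMaster-\n"

if __name__ == "__main__":
    for line in diff_lspci(OLD, NEW):
        print(line)
```

An empty result means the two dumps are identical, which would point away from a difference in how dom0 sets up the devices.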

> 
> 
> 2nd: After DomU shutdown, Dom0 panics (100% reproducible) - sorry
> that i have no full stacktrace, all i have is a "screenshot" which i
> uploaded here:
> http://imageshack.us/photo/my-images/52/img20120724235921.jpg/

Ugh, that looks like somebody removed a large chunk of a pagetable.

Hmm. Are you using dom0_mem=max parameter? If not, can you try
that and also disable ballooning in the xm/xl config file pls?

> 
> 
> With 3.4 both issues were not there - everything worked perfectly.
> Tell me which debugging info you need, i may be able to re-install
> my netconsole to get the full stacktrace (but i had not much luck
> with netconsole regarding kernel panics - rarely this info gets sent
> before the "panic"...)
> 
> Greetings
> Tobias


* Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?!
@ 2012-07-25 12:30 Tobias Geiger
  2012-07-25 13:43 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 20+ messages in thread
From: Tobias Geiger @ 2012-07-25 12:30 UTC (permalink / raw)
  To: xen-devel

Hi!

i notice a serious regression with 3.5 as Dom0 kernel (3.4 was rock 
stable):

1st: only the GPU PCI Passthrough works, the PCI USB Controller is not 
recognized within the DomU (HVM Win7 64)
Dom0 cmdline is:
ro root=LABEL=dom0root 
xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7) 
security=apparmor noirqdebug nouveau.msi=1

Only 08:00.0 and 08:00.1 get passed through without problems; none of the 
USB controllers are passed through correctly, and each gets an exclamation 
mark within the Win7 device manager ("could not be started").
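
To make the split between working and failing devices explicit, the hide list can be pulled out of the cmdline and grouped by bus. A minimal Python sketch, assuming the parameter format shown above; the function name hidden_devices is hypothetical, and the grouping only restates what this report describes:

```python
import re


def hidden_devices(cmdline: str) -> list:
    """Return the device addresses listed in xen-pciback.hide=(...)(...)."""
    for token in cmdline.split():
        if token.startswith("xen-pciback.hide="):
            value = token.split("=", 1)[1]
            # Each device is wrapped in its own pair of parentheses.
            return re.findall(r"\(([^)]+)\)", value)
    return []


# Dom0 cmdline as quoted in this message.
CMDLINE = (
    "ro root=LABEL=dom0root "
    "xen-pciback.hide=(08:00.0)(08:00.1)(00:1d.0)(00:1d.1)(00:1d.2)(00:1d.7) "
    "security=apparmor noirqdebug nouveau.msi=1"
)

if __name__ == "__main__":
    devices = hidden_devices(CMDLINE)
    # Per the report: bus 08 functions work, the 00:1d.x USB controllers do not.
    working = [d for d in devices if d.startswith("08:")]
    failing = [d for d in devices if d.startswith("00:1d.")]
    print("working:", working)
    print("failing:", failing)
```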


2nd: After DomU shutdown, Dom0 panics (100% reproducible) - sorry that 
i have no full stacktrace, all i have is a "screenshot" which i uploaded 
here:
http://imageshack.us/photo/my-images/52/img20120724235921.jpg/


With 3.4 both issues were not there - everything worked perfectly.
Tell me which debugging info you need, i may be able to re-install my 
netconsole to get the full stacktrace (but i had not much luck with 
netconsole regarding kernel panics - rarely this info gets sent before 
the "panic"...)

Greetings
Tobias

Thread overview: 20+ messages
2012-08-21  2:41 Regression in kernel 3.5 as Dom0 regarding PCI Passthrough?! Ren, Yongjie
2012-08-21 14:23 ` Konrad Rzeszutek Wilk
  -- strict thread matches above, loose matches on Subject: below --
2012-08-28  8:25 Ren, Yongjie
2012-08-28 13:19 ` Konrad Rzeszutek Wilk
2012-09-05 18:54 ` Konrad Rzeszutek Wilk
2012-09-06 11:28   ` Tobias Geiger
2012-09-06 13:05     ` Konrad Rzeszutek Wilk
2012-09-06 13:24       ` Tobias Geiger
2012-09-07  2:08     ` Ren, Yongjie
2012-09-07 10:37       ` Tobias Geiger
2012-09-06 11:32   ` Tobias Geiger
2012-09-06 11:46   ` Tobias Geiger
2012-07-25 12:30 Tobias Geiger
2012-07-25 13:43 ` Konrad Rzeszutek Wilk
2012-07-25 14:20   ` Tobias Geiger
2012-07-25 14:32     ` Tobias Geiger
2012-07-25 17:59   ` Tobias Geiger
2012-07-25 18:09     ` Konrad Rzeszutek Wilk
2012-08-06 16:16   ` Konrad Rzeszutek Wilk
2012-08-20 23:30     ` Konrad Rzeszutek Wilk
