All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] usb-storage assertions
@ 2016-01-09 17:34 Andrey Korolyov
  2016-01-12 14:17 ` Gerd Hoffmann
  0 siblings, 1 reply; 15+ messages in thread
From: Andrey Korolyov @ 2016-01-09 17:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: hdegoede, Gerd Hoffmann

Hello,

during regular operations within linux guest with USB EHCI frontend I
am seeing process crashes with an assert during regular operations
like dpkg install:

hw/usb/dev-storage.c:334: usb_msd_handle_reset: Assertion `s->req ==
((void *)0)' failed.

This does happen when real block backend is an USB-attached device
itself, so we are hitting some subtle race right there, because a
single change of a backend to a raw file sitting on SSD has never
triggered this block with the linux guest (but xBSD pre-reboot flush
would trigger assertion from time to time in the usb_cancel_packet()
without regarding physical backend type:

hw/usb/core.c:508: usb_cancel_packet: Assertion
`usb_packet_is_inflight(p)' failed.
)

Since the 2.2.0 which I am running is not exactly freshest one, I
could re-check with master today or tomorrow, but log for hw/usb/ has
no related commits for intentional fixes over one or another issue.

Argument set to check:
-device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device
ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6
-device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1
-device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2
-drive file=/dev/sdb,if=none,id=drive-usb-disk0,format=raw -device
usb-storage,drive=drive-usb-disk0,id=usb-disk0,removable=off

Thanks for looking into this!

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] usb-storage assertions
  2016-01-09 17:34 [Qemu-devel] usb-storage assertions Andrey Korolyov
@ 2016-01-12 14:17 ` Gerd Hoffmann
  2016-01-12 14:36   ` Kevin Wolf
  0 siblings, 1 reply; 15+ messages in thread
From: Gerd Hoffmann @ 2016-01-12 14:17 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: Kevin Wolf, hdegoede, qemu-devel

On Sa, 2016-01-09 at 20:34 +0300, Andrey Korolyov wrote:
> Hello,
> 
> during regular operations within linux guest with USB EHCI frontend I
> am seeing process crashes with an assert during regular operations
> like dpkg install:
> 
> hw/usb/dev-storage.c:334: usb_msd_handle_reset: Assertion `s->req ==
> ((void *)0)' failed.
> 
> This does happen when real block backend is an USB-attached device
> itself, so we are hitting some subtle race right there, because a
> single change of a backend to a raw file sitting on SSD has never
> triggered this block with the linux guest (but xBSD pre-reboot flush
> would trigger assertion from time to time in the usb_cancel_packet()
> without regarding physical backend type:
> 
> hw/usb/core.c:508: usb_cancel_packet: Assertion
> `usb_packet_is_inflight(p)' failed.
> )
> 
> Since the 2.2.0 which I am running is not exactly freshest one, I
> could re-check with master today or tomorrow, but log for hw/usb/ has
> no related commits for intentional fixes over one or another issue.

Checking would be nice, but I likewise don't expect things being
different on 2.5.

Most likely these are bugs in rarely taken code paths, probably because
the guest cancels requests and/or resets device due to timeouts (which
in turn are triggered by slow backing storage on the host).

Any chance you can rebuild qemu with DEBUG_MSD enabled (in
hw/usb/dev-storage.c), then trigger the asserts again, so we hopefully
get a useful trail of the events triggering the bugs?

Kevin, is is possible to limit transfer rates for block devices, to
simplify debugging stuff like this?  Preferably in a way supported by
libvirt?

thanks,
  Gerd

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] usb-storage assertions
  2016-01-12 14:17 ` Gerd Hoffmann
@ 2016-01-12 14:36   ` Kevin Wolf
  2016-01-12 14:56     ` Daniel P. Berrange
  0 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2016-01-12 14:36 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: hdegoede, Andrey Korolyov, qemu-devel

Am 12.01.2016 um 15:17 hat Gerd Hoffmann geschrieben:
> On Sa, 2016-01-09 at 20:34 +0300, Andrey Korolyov wrote:
> > Hello,
> > 
> > during regular operations within linux guest with USB EHCI frontend I
> > am seeing process crashes with an assert during regular operations
> > like dpkg install:
> > 
> > hw/usb/dev-storage.c:334: usb_msd_handle_reset: Assertion `s->req ==
> > ((void *)0)' failed.
> > 
> > This does happen when real block backend is an USB-attached device
> > itself, so we are hitting some subtle race right there, because a
> > single change of a backend to a raw file sitting on SSD has never
> > triggered this block with the linux guest (but xBSD pre-reboot flush
> > would trigger assertion from time to time in the usb_cancel_packet()
> > without regarding physical backend type:
> > 
> > hw/usb/core.c:508: usb_cancel_packet: Assertion
> > `usb_packet_is_inflight(p)' failed.
> > )
> > 
> > Since the 2.2.0 which I am running is not exactly freshest one, I
> > could re-check with master today or tomorrow, but log for hw/usb/ has
> > no related commits for intentional fixes over one or another issue.
> 
> Checking would be nice, but I likewise don't expect things being
> different on 2.5.
> 
> Most likely these are bugs in rarely taken code paths, probably because
> the guest cancels requests and/or resets device due to timeouts (which
> in turn are triggered by slow backing storage on the host).
> 
> Any chance you can rebuild qemu with DEBUG_MSD enabled (in
> hw/usb/dev-storage.c), then trigger the asserts again, so we hopefully
> get a useful trail of the events triggering the bugs?
> 
> Kevin, is is possible to limit transfer rates for block devices, to
> simplify debugging stuff like this?  Preferably in a way supported by
> libvirt?

Yes. In the qemu command line, you can limit the rate with -drive
throttling.bps_total=... and similar options (you can instead throttle
based on iops instead, and you can have separate read/write limits - I
don't think you need any of these, though).

I have never used it in libvirt, but libvirt.org seems to suggest that
it looks like this (seems to be for Xen, but you get the idea):

    <disk type='file' snapshot='external'>
      <driver name="tap" type="aio" cache="default"/>
      <source file='/var/lib/xen/images/fv0' startupPolicy='optional'>
        <seclabel relabel='no'/>
      </source>
      <target dev='hda' bus='ide'/>
      <iotune>
        <total_bytes_sec>10000000</total_bytes_sec>
        <read_iops_sec>400000</read_iops_sec>
        <write_iops_sec>100000</write_iops_sec>
      </iotune>
    </disk>

Kevin

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] usb-storage assertions
  2016-01-12 14:36   ` Kevin Wolf
@ 2016-01-12 14:56     ` Daniel P. Berrange
  2016-01-13 16:13       ` Gerd Hoffmann
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel P. Berrange @ 2016-01-12 14:56 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: hdegoede, Andrey Korolyov, Gerd Hoffmann, qemu-devel

On Tue, Jan 12, 2016 at 03:36:40PM +0100, Kevin Wolf wrote:
> Am 12.01.2016 um 15:17 hat Gerd Hoffmann geschrieben:
> > On Sa, 2016-01-09 at 20:34 +0300, Andrey Korolyov wrote:
> > > Hello,
> > > 
> > > during regular operations within linux guest with USB EHCI frontend I
> > > am seeing process crashes with an assert during regular operations
> > > like dpkg install:
> > > 
> > > hw/usb/dev-storage.c:334: usb_msd_handle_reset: Assertion `s->req ==
> > > ((void *)0)' failed.
> > > 
> > > This does happen when real block backend is an USB-attached device
> > > itself, so we are hitting some subtle race right there, because a
> > > single change of a backend to a raw file sitting on SSD has never
> > > triggered this block with the linux guest (but xBSD pre-reboot flush
> > > would trigger assertion from time to time in the usb_cancel_packet()
> > > without regarding physical backend type:
> > > 
> > > hw/usb/core.c:508: usb_cancel_packet: Assertion
> > > `usb_packet_is_inflight(p)' failed.
> > > )
> > > 
> > > Since the 2.2.0 which I am running is not exactly freshest one, I
> > > could re-check with master today or tomorrow, but log for hw/usb/ has
> > > no related commits for intentional fixes over one or another issue.
> > 
> > Checking would be nice, but I likewise don't expect things being
> > different on 2.5.
> > 
> > Most likely these are bugs in rarely taken code paths, probably because
> > the guest cancels requests and/or resets device due to timeouts (which
> > in turn are triggered by slow backing storage on the host).
> > 
> > Any chance you can rebuild qemu with DEBUG_MSD enabled (in
> > hw/usb/dev-storage.c), then trigger the asserts again, so we hopefully
> > get a useful trail of the events triggering the bugs?
> > 
> > Kevin, is is possible to limit transfer rates for block devices, to
> > simplify debugging stuff like this?  Preferably in a way supported by
> > libvirt?
> 
> Yes. In the qemu command line, you can limit the rate with -drive
> throttling.bps_total=... and similar options (you can instead throttle
> based on iops instead, and you can have separate read/write limits - I
> don't think you need any of these, though).
> 
> I have never used it in libvirt, but libvirt.org seems to suggest that
> it looks like this (seems to be for Xen, but you get the idea):
> 
>     <disk type='file' snapshot='external'>
>       <driver name="tap" type="aio" cache="default"/>
>       <source file='/var/lib/xen/images/fv0' startupPolicy='optional'>
>         <seclabel relabel='no'/>
>       </source>
>       <target dev='hda' bus='ide'/>
>       <iotune>
>         <total_bytes_sec>10000000</total_bytes_sec>
>         <read_iops_sec>400000</read_iops_sec>
>         <write_iops_sec>100000</write_iops_sec>
>       </iotune>
>     </disk>

Yes, that works with QEMU and will map through to the -drive I/O
tuning parameters.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] usb-storage assertions
  2016-01-12 14:56     ` Daniel P. Berrange
@ 2016-01-13 16:13       ` Gerd Hoffmann
  2016-01-13 16:28         ` Andrey Korolyov
  0 siblings, 1 reply; 15+ messages in thread
From: Gerd Hoffmann @ 2016-01-13 16:13 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: Kevin Wolf, hdegoede, Andrey Korolyov, qemu-devel

On Di, 2016-01-12 at 14:56 +0000, Daniel P. Berrange wrote:
> On Tue, Jan 12, 2016 at 03:36:40PM +0100, Kevin Wolf wrote:
> > Am 12.01.2016 um 15:17 hat Gerd Hoffmann geschrieben:
> > > On Sa, 2016-01-09 at 20:34 +0300, Andrey Korolyov wrote:
> > > > Hello,
> > > > 
> > > > during regular operations within linux guest with USB EHCI frontend I
> > > > am seeing process crashes with an assert during regular operations
> > > > like dpkg install:
> > > > 
> > > > hw/usb/dev-storage.c:334: usb_msd_handle_reset: Assertion `s->req ==
> > > > ((void *)0)' failed.

> >       <iotune>
> >         <total_bytes_sec>10000000</total_bytes_sec>
> >         <read_iops_sec>400000</read_iops_sec>
> >         <write_iops_sec>100000</write_iops_sec>
> >       </iotune>

> Yes, that works with QEMU and will map through to the -drive I/O
> tuning parameters.

Works, successfully slowing down things to the point that a git clone of
a linux kernel source tree from another machine in the same lan takes
hours instead of minutes.

That doesn't seem to be enough to trigger the asserts though :(
Things are very slow but rock solid ...

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] usb-storage assertions
  2016-01-13 16:13       ` Gerd Hoffmann
@ 2016-01-13 16:28         ` Andrey Korolyov
  2016-01-15 18:08           ` Andrey Korolyov
  0 siblings, 1 reply; 15+ messages in thread
From: Andrey Korolyov @ 2016-01-13 16:28 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Kevin Wolf, hdegoede, qemu-devel

On Wed, Jan 13, 2016 at 7:13 PM, Gerd Hoffmann <kraxel@redhat.com> wrote:
> On Di, 2016-01-12 at 14:56 +0000, Daniel P. Berrange wrote:
>> On Tue, Jan 12, 2016 at 03:36:40PM +0100, Kevin Wolf wrote:
>> > Am 12.01.2016 um 15:17 hat Gerd Hoffmann geschrieben:
>> > > On Sa, 2016-01-09 at 20:34 +0300, Andrey Korolyov wrote:
>> > > > Hello,
>> > > >
>> > > > during regular operations within linux guest with USB EHCI frontend I
>> > > > am seeing process crashes with an assert during regular operations
>> > > > like dpkg install:
>> > > >
>> > > > hw/usb/dev-storage.c:334: usb_msd_handle_reset: Assertion `s->req ==
>> > > > ((void *)0)' failed.
>
>> >       <iotune>
>> >         <total_bytes_sec>10000000</total_bytes_sec>
>> >         <read_iops_sec>400000</read_iops_sec>
>> >         <write_iops_sec>100000</write_iops_sec>
>> >       </iotune>
>
>> Yes, that works with QEMU and will map through to the -drive I/O
>> tuning parameters.
>
> Works, successfully slowing down things to the point that a git clone of
> a linux kernel source tree from another machine in the same lan takes
> hours instead of minutes.
>
> That doesn't seem to be enough to trigger the asserts though :(
> Things are very slow but rock solid ...
>
> cheers,
>   Gerd
>

Gerd, could you please check against NetBSD/FreeBSD guest? Both of
them are complaining about interrupt storm upon last cleanups/syncs
before reboot when using USB frontend as a disk and second assert is
triggered very frequently. I`ll re-try with 2.5 in a couple of days
for both cases.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] usb-storage assertions
  2016-01-13 16:28         ` Andrey Korolyov
@ 2016-01-15 18:08           ` Andrey Korolyov
  2016-01-18  9:38             ` Gerd Hoffmann
  0 siblings, 1 reply; 15+ messages in thread
From: Andrey Korolyov @ 2016-01-15 18:08 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Kevin Wolf, hdegoede, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 425 bytes --]

Just checked, Linux usb driver decided to lose a disk during a
'stress-test' over unpacking linux source instead of triggering an
assertion in 2.5 (and to irreparably damage its ext4 as well), NetBSD
7.0 reboot action hangs on USB_RESET and NetBSD 5.1 triggers second of
mentioned asserts. Backend itself is a regular USB2-SATA adapter with
Intel S3500 SSD, hence it should not trigger first timing-related
assertion at all.

[-- Attachment #2: qemu-2.5.0-usb-assert-netbsd-5.1.txt --]
[-- Type: text/plain, Size: 2786 bytes --]

usb-msd: Data out 16384/16384
usb-msd: Command complete 0 tag 0x9f5
usb-msd: Command status 0 tag 0x9f5, len 13
usb-msd: Command on LUN 0
usb-msd: Command tag 0x9f6 flags 00000000 len 10 data 16384
[usb-disk0.0 id=0] WRITE_10 0x00 0x00 0x00 0x00 0x7f 0x00 0x00 0x20 0x00 - to-dev len=16384
usb-msd: Data out 16384/16384
usb-msd: Command complete 0 tag 0x9f6
usb-msd: Command status 0 tag 0x9f6, len 13
usb-msd: Command on LUN 0
usb-msd: Command tag 0x9f7 flags 00000000 len 10 data 16384
[usb-disk0.0 id=0] WRITE_10 0x00 0x00 0x00 0x00 0x9f 0x00 0x00 0x20 0x00 - to-dev len=16384
usb-msd: Data out 16384/16384
usb-msd: Command complete 0 tag 0x9f7
usb-msd: Command status 0 tag 0x9f7, len 13
usb-msd: Command on LUN 0
usb-msd: Command tag 0x9f8 flags 00000000 len 10 data 16384
[usb-disk0.0 id=0] WRITE_10 0x00 0x00 0x00 0x00 0xbf 0x00 0x00 0x20 0x00 - to-dev len=16384
usb-msd: Data out 16384/16384
usb-msd: Command complete 0 tag 0x9f8
usb-msd: Command status 0 tag 0x9f8, len 13
usb-msd: Command on LUN 0
usb-msd: Command tag 0x9f9 flags 00000000 len 10 data 4096
[usb-disk0.0 id=0] WRITE_10 0x00 0x00 0x02 0xbb 0x5f 0x00 0x00 0x08 0x00 - to-dev len=4096
usb-msd: Data out 4096/4096
usb-msd: Command complete 0 tag 0x9f9
usb-msd: Command status 0 tag 0x9f9, len 13
usb-msd: Command on LUN 0
usb-msd: Command tag 0x9fa flags 00000000 len 10 data 2048
[usb-disk0.0 id=0] WRITE_10 0x00 0x00 0x30 0x4c 0xbb 0x00 0x00 0x04 0x00 - to-dev len=2048
usb-msd: Data out 2048/2048
usb-msd: Command complete 0 tag 0x9fa
usb-msd: Command status 0 tag 0x9fa, len 13
usb-msd: Command on LUN 0
usb-msd: Command tag 0x9fb flags 00000000 len 10 data 2048
[usb-disk0.0 id=0] WRITE_10 0x00 0x00 0x00 0x00 0x4f 0x00 0x00 0x04 0x00 - to-dev len=2048
usb-msd: Data out 2048/2048
usb-msd: Command complete 0 tag 0x9fb
usb-msd: Command status 0 tag 0x9fb, len 13
usb-msd: Command on LUN 0
usb-msd: Command tag 0x9fc flags 00000000 len 10 data 2048
[usb-disk0.0 id=0] WRITE_10 0x00 0x00 0x00 0x17 0x1f 0x00 0x00 0x04 0x00 - to-dev len=2048
usb-msd: Data out 2048/2048
usb-msd: Command complete 0 tag 0x9fc
usb-msd: Command status 0 tag 0x9fc, len 13
usb-msd: Command on LUN 0
usb-msd: Command tag 0x9fd flags 00000000 len 10 data 2048
[usb-disk0.0 id=0] WRITE_10 0x00 0x00 0x00 0x00 0x4f 0x00 0x00 0x04 0x00 - to-dev len=2048
usb-msd: Data out 2048/2048
usb-msd: Command complete 0 tag 0x9fd
usb-msd: Command status 0 tag 0x9fd, len 13
usb-msd: Command on LUN 0
usb-msd: Command tag 0x9fe flags 00000000 len 10 data 0
[usb-disk0.0 id=0] SYNCHRONIZE_CACHE 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 - none
usb-msd: Deferring packet 0x7f335af66630 [wait status]
qemu-system-x86_64: hw/usb/core.c:491: usb_cancel_packet: Assertion `usb_packet_is_inflight(p)' failed.
2016-01-15 17:53:11.443+0000: shutting down

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] usb-storage assertions
  2016-01-15 18:08           ` Andrey Korolyov
@ 2016-01-18  9:38             ` Gerd Hoffmann
  2016-01-18  9:50               ` Andrey Korolyov
  0 siblings, 1 reply; 15+ messages in thread
From: Gerd Hoffmann @ 2016-01-18  9:38 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: Kevin Wolf, hdegoede, qemu-devel

On Fr, 2016-01-15 at 21:08 +0300, Andrey Korolyov wrote:
> Just checked, Linux usb driver decided to lose a disk during a
> 'stress-test' over unpacking linux source instead of triggering an
> assertion in 2.5 (and to irreparably damage its ext4 as well),

Ok, so behavior changed here from 2.2 -> 2.5.  I don't see disk
corruption in my tests, but linux guest resets the usb-storage device
now and then.  Seems to be able to resume operations though, there are
no disk errors in the logs.

> NetBSD
> 7.0 reboot action hangs on USB_RESET and NetBSD 5.1 triggers second of
> mentioned asserts. Backend itself is a regular USB2-SATA adapter with
> Intel S3500 SSD, hence it should not trigger first timing-related
> assertion at all.

ok.  Had no trouble with freebsd, will go fetch netbsd images.  What
arch is this?  i386?  x86_64?

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] usb-storage assertions
  2016-01-18  9:38             ` Gerd Hoffmann
@ 2016-01-18  9:50               ` Andrey Korolyov
  2016-01-18 13:55                 ` Gerd Hoffmann
  0 siblings, 1 reply; 15+ messages in thread
From: Andrey Korolyov @ 2016-01-18  9:50 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Kevin Wolf, hdegoede, qemu-devel

On Mon, Jan 18, 2016 at 12:38 PM, Gerd Hoffmann <kraxel@redhat.com> wrote:
> On Fr, 2016-01-15 at 21:08 +0300, Andrey Korolyov wrote:
>> Just checked, Linux usb driver decided to lose a disk during a
>> 'stress-test' over unpacking linux source instead of triggering an
>> assertion in 2.5 (and to irreparably damage its ext4 as well),
>
> Ok, so behavior changed here from 2.2 -> 2.5.  I don't see disk
> corruption in my tests, but linux guest resets the usb-storage device
> now and then.  Seems to be able to resume operations though, there are
> no disk errors in the logs.

Thanks for checking, I`ve seen a couple of bus resets as well and
suddenly guest froze with 'fs remounted as ro' message splat over
physical console. Of course I didn`t set any external kernel logging
or pstore at a time and I could only assume that same timeouts was a
general cause of a corruption itself (more likely some inflights was
lucky enough to finish while others did not made it, but I don`t see a
way to corrupt a superblock *that bad*).

>
>> NetBSD
>> 7.0 reboot action hangs on USB_RESET and NetBSD 5.1 triggers second of
>> mentioned asserts. Backend itself is a regular USB2-SATA adapter with
>> Intel S3500 SSD, hence it should not trigger first timing-related
>> assertion at all.
>
> ok.  Had no trouble with freebsd, will go fetch netbsd images.  What
> arch is this?  i386?  x86_64?

i386 7.0 for the reference, but I`m sure that this wouldn`t matter in
any way. Just to mention - ancient 2.6, like 2.6.18, are actually
doing things quite faster using same frontend+backend combination, may
be due to lack of proper timeout checks... Actually there is a very
small chance that the real performance regression was introduced
during further development, so I instead believe in improper
interaction of a newer guest EHCI driver and qemu frontend. Please let
me know if any countable measurements like fio could be a matter of
interest - I don`t think that many people are concerned about USB/USB2
frontend performance at all, since they are bringing in a ton of
unwelcomed wakeups and the one thing which could be a matter of
concern in that case is an emulated xHCI, IMHO.

>
> cheers,
>   Gerd
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] usb-storage assertions
  2016-01-18  9:50               ` Andrey Korolyov
@ 2016-01-18 13:55                 ` Gerd Hoffmann
  2016-01-18 23:49                   ` Andrey Korolyov
  0 siblings, 1 reply; 15+ messages in thread
From: Gerd Hoffmann @ 2016-01-18 13:55 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: Kevin Wolf, hdegoede, qemu-devel

  Hi,

> > ok.  Had no trouble with freebsd, will go fetch netbsd images.  What
> > arch is this?  i386?  x86_64?
> 
> i386 7.0 for the reference, but I`m sure that this wouldn`t matter in
> any way.

7.0 trace:

[ ... ]
12417@1453119296.506252:usb_ehci_opreg_write wr mmio 0020 [USBCMD] = 0
12417@1453119296.506284:usb_ehci_queue_action q 0x7f94f7a98bb0: free
[ ... ]
12417@1453119296.507336:usb_ehci_state periodic schedule INACTIVE
12417@1453119296.507340:usb_ehci_usbsts usbsts PSS 0
12417@1453119296.507344:usb_ehci_queue_action q 0x7f94f7a98cd0: free
12417@1453119296.507349:usb_ehci_queue_action q 0x7f94f7a98c40: free
12417@1453119296.507353:usb_ehci_queue_action q 0x7f94f749f040: free
12417@1453119296.507357:usb_ehci_queue_action q 0x7f94f749efb0: free
12417@1453119296.507361:usb_ehci_state async schedule INACTIVE
12417@1453119296.507365:usb_ehci_usbsts usbsts ASS 0
12417@1453119296.507369:usb_ehci_usbsts usbsts HALT 1
12417@1453119296.507402:usb_ehci_opreg_write wr mmio 0020 [USBCMD] = 2
12417@1453119296.507413:usb_ehci_reset === RESET ===
12417@1453119296.507418:usb_ehci_port_detach detach port #0, owner ehci
12417@1453119296.507423:usb_ehci_irq level 1, frindex 0x2958, sts
0x1004, mask 0x37
12417@1453119296.507435:usb_ehci_port_attach attach port #0, owner comp,
device QEMU USB MSD
12417@1453119296.507444:usb_ehci_opreg_change ch mmio 0020 [USBCMD] =
80000 (old: 0)
12417@1453119296.507466:usb_ehci_opreg_read rd mmio 0024 [USBSTS] = 1000
12417@1453119296.507510:usb_ehci_opreg_read rd mmio 0024 [USBSTS] = 1000
[ ... ]

So, to shutdown ehci netbsd clears the cmd register, then sets the reset
bit in the cmd register.  Fine.  

Then it goes read the status register, in a loop, forever.  No idea why,
and I also can't spot then place in the source code.  Hmm ...

>  Just to mention - ancient 2.6, like 2.6.18, are actually
> doing things quite faster using same frontend+backend combination, may
> be due to lack of proper timeout checks... Actually there is a very
> small chance that the real performance regression was introduced
> during further development, so I instead believe in improper
> interaction of a newer guest EHCI driver and qemu frontend.

Could also be older ehci guest drivers took a shortcut which turned out
to not be correct and not working reliable ...

> Please let
> me know if any countable measurements like fio could be a matter of
> interest - I don`t think that many people are concerned about USB/USB2
> frontend performance at all, since they are bringing in a ton of
> unwelcomed wakeups and the one thing which could be a matter of
> concern in that case is an emulated xHCI, IMHO.

Yes, xhci is clearly the best choice when it comes to performance.

Reasons to use ehci instead basically boils down to (a) lacking/broken
guest drivers (old windows versions, also early linux xhci driver
versions had endian issues, firmware needs xhci support too to boot from
usb) and (b) historical (ehci emulation was there first, so support in
the management stack tends to be better for that, also people are used
to it).

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] usb-storage assertions
  2016-01-18 13:55                 ` Gerd Hoffmann
@ 2016-01-18 23:49                   ` Andrey Korolyov
  2016-01-19  7:13                     ` Gerd Hoffmann
  2016-01-19 14:19                     ` Gerd Hoffmann
  0 siblings, 2 replies; 15+ messages in thread
From: Andrey Korolyov @ 2016-01-18 23:49 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Kevin Wolf, hdegoede, qemu-devel

On Mon, Jan 18, 2016 at 4:55 PM, Gerd Hoffmann <kraxel@redhat.com> wrote:
>   Hi,
>
>> > ok.  Had no trouble with freebsd, will go fetch netbsd images.  What
>> > arch is this?  i386?  x86_64?
>>
>> i386 7.0 for the reference, but I`m sure that this wouldn`t matter in
>> any way.
>
> 7.0 trace:

Whoops, sorry, should be 5.1/i386. On a 7.0 I observe same endless
loop as you do.

>
> [ ... ]
> 12417@1453119296.506252:usb_ehci_opreg_write wr mmio 0020 [USBCMD] = 0
> 12417@1453119296.506284:usb_ehci_queue_action q 0x7f94f7a98bb0: free
> [ ... ]
> 12417@1453119296.507336:usb_ehci_state periodic schedule INACTIVE
> 12417@1453119296.507340:usb_ehci_usbsts usbsts PSS 0
> 12417@1453119296.507344:usb_ehci_queue_action q 0x7f94f7a98cd0: free
> 12417@1453119296.507349:usb_ehci_queue_action q 0x7f94f7a98c40: free
> 12417@1453119296.507353:usb_ehci_queue_action q 0x7f94f749f040: free
> 12417@1453119296.507357:usb_ehci_queue_action q 0x7f94f749efb0: free
> 12417@1453119296.507361:usb_ehci_state async schedule INACTIVE
> 12417@1453119296.507365:usb_ehci_usbsts usbsts ASS 0
> 12417@1453119296.507369:usb_ehci_usbsts usbsts HALT 1
> 12417@1453119296.507402:usb_ehci_opreg_write wr mmio 0020 [USBCMD] = 2
> 12417@1453119296.507413:usb_ehci_reset === RESET ===
> 12417@1453119296.507418:usb_ehci_port_detach detach port #0, owner ehci
> 12417@1453119296.507423:usb_ehci_irq level 1, frindex 0x2958, sts
> 0x1004, mask 0x37
> 12417@1453119296.507435:usb_ehci_port_attach attach port #0, owner comp,
> device QEMU USB MSD
> 12417@1453119296.507444:usb_ehci_opreg_change ch mmio 0020 [USBCMD] =
> 80000 (old: 0)
> 12417@1453119296.507466:usb_ehci_opreg_read rd mmio 0024 [USBSTS] = 1000
> 12417@1453119296.507510:usb_ehci_opreg_read rd mmio 0024 [USBSTS] = 1000
> [ ... ]
>
> So, to shutdown ehci netbsd clears the cmd register, then sets the reset
> bit in the cmd register.  Fine.
>
> Then it goes read the status register, in a loop, forever.  No idea why,
> and I also can't spot then place in the source code.  Hmm ...
>
>>  Just to mention - ancient 2.6, like 2.6.18, are actually
>> doing things quite faster using same frontend+backend combination, may
>> be due to lack of proper timeout checks... Actually there is a very
>> small chance that the real performance regression was introduced
>> during further development, so I instead believe in improper
>> interaction of a newer guest EHCI driver and qemu frontend.
>
> Could also be older ehci guest drivers took a shortcut which turned out
> to not be correct and not working reliable ...
>
>> Please let
>> me know if any countable measurements like fio could be a matter of
>> interest - I don`t think that many people are concerned about USB/USB2
>> frontend performance at all, since they are bringing in a ton of
>> unwelcomed wakeups and the one thing which could be a matter of
>> concern in that case is an emulated xHCI, IMHO.
>
> Yes, xhci is clearly the best choice when it comes to performance.
>
> Reasons to use ehci instead basically boils down to (a) lacking/broken
> guest drivers (old windows versions, also early linux xhci driver
> versions had endian issues, firmware needs xhci support too to boot from
> usb) and (b) historical (ehci emulation was there first, so support in
> the management stack tends to be better for that, also people are used
> to it).
>
> cheers,
>   Gerd
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] usb-storage assertions
  2016-01-18 23:49                   ` Andrey Korolyov
@ 2016-01-19  7:13                     ` Gerd Hoffmann
  2016-01-19 10:59                       ` Andrey Korolyov
  2016-01-19 14:19                     ` Gerd Hoffmann
  1 sibling, 1 reply; 15+ messages in thread
From: Gerd Hoffmann @ 2016-01-19  7:13 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: Kevin Wolf, hdegoede, qemu-devel

On Di, 2016-01-19 at 02:49 +0300, Andrey Korolyov wrote:
> On Mon, Jan 18, 2016 at 4:55 PM, Gerd Hoffmann <kraxel@redhat.com> wrote:
> >   Hi,
> >
> >> > ok.  Had no trouble with freebsd, will go fetch netbsd images.  What
> >> > arch is this?  i386?  x86_64?
> >>
> >> i386 7.0 for the reference, but I`m sure that this wouldn`t matter in
> >> any way.
> >
> > 7.0 trace:
> 
> Whoops, sorry, should be 5.1/i386.

I'll check that too ...

>  On a 7.0 I observe same endless
> loop as you do.

... just wanted to start with that one.

[ trace snipped ]

> > So, to shutdown ehci netbsd clears the cmd register, then sets the reset
> > bit in the cmd register.  Fine.
> >
> > Then it goes read the status register, in a loop, forever.  No idea why,
> > and I also can't spot then place in the source code.  Hmm ...

/me was hoping anyone has an idea what is going on here.
Are you familiar with the netbsd kernel sources?

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] usb-storage assertions
  2016-01-19  7:13                     ` Gerd Hoffmann
@ 2016-01-19 10:59                       ` Andrey Korolyov
  2016-01-19 11:44                         ` Gerd Hoffmann
  0 siblings, 1 reply; 15+ messages in thread
From: Andrey Korolyov @ 2016-01-19 10:59 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Kevin Wolf, hdegoede, qemu-devel

On Tue, Jan 19, 2016 at 10:13 AM, Gerd Hoffmann <kraxel@redhat.com> wrote:
> On Di, 2016-01-19 at 02:49 +0300, Andrey Korolyov wrote:
>> On Mon, Jan 18, 2016 at 4:55 PM, Gerd Hoffmann <kraxel@redhat.com> wrote:
>> >   Hi,
>> >
>> >> > ok.  Had no trouble with freebsd, will go fetch netbsd images.  What
>> >> > arch is this?  i386?  x86_64?
>> >>
>> >> i386 7.0 for the reference, but I`m sure that this wouldn`t matter in
>> >> any way.
>> >
>> > 7.0 trace:
>>
>> Whoops, sorry, should be 5.1/i386.
>
> I'll check that too ...
>
>>  On a 7.0 I observe same endless
>> loop as you do.
>
> ... just wanted to start with that one.
>
> [ trace snipped ]
>
>> > So, to shutdown ehci netbsd clears the cmd register, then sets the reset
>> > bit in the cmd register.  Fine.
>> >
>> > Then it goes read the status register, in a loop, forever.  No idea why,
>> > and I also can't spot then place in the source code.  Hmm ...
>
> /me was hoping anyone has an idea what is going on here.
> Are you familiar with the netbsd kernel sources?
>

Probably not enough with driver subsystem to point even at the obvious
issue in the EHCI driver. I`d start with slowing down an emulated CPU
10...100 times via its thread cg, leaving emulator code hanging with
enough CPU cycles and check if the issue is still here. If roots of
the crash or endless loop are timing-related, they either would change
appearance significanly or disappear completely (or vice versa, slow
an emulator thread). If you don`t have enough time for such blind
testing, I may check it in a next few days. Since I`ve seen interrupt
storm complaint on FreeBSD within same conditions, I strongly prefer
the idea of a race-driven behavior.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] usb-storage assertions
  2016-01-19 10:59                       ` Andrey Korolyov
@ 2016-01-19 11:44                         ` Gerd Hoffmann
  0 siblings, 0 replies; 15+ messages in thread
From: Gerd Hoffmann @ 2016-01-19 11:44 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: Kevin Wolf, hdegoede, qemu-devel

  Hi,

> Probably not enough with driver subsystem to point even at the obvious
> issue in the EHCI driver. I`d start with slowing down an emulated CPU
> 10...100 times via its thread cg, leaving emulator code hanging with
> enough CPU cycles and check if the issue is still here. If roots of
> the crash or endless loop are timing-related, they either would change
> appearance significanly or disappear completely (or vice versa, slow
> an emulator thread). If you don`t have enough time for such blind
> testing, I may check it in a next few days. Since I`ve seen interrupt
> storm complaint on FreeBSD within same conditions, I strongly prefer
> the idea of a race-driven behavior.

Ha!  That nailed it.  /me was looking for a loop in the code, waiting
for the device having finished reset or something like that.  But it
turned out to be a interrupt storm indeed.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] usb-storage assertions
  2016-01-18 23:49                   ` Andrey Korolyov
  2016-01-19  7:13                     ` Gerd Hoffmann
@ 2016-01-19 14:19                     ` Gerd Hoffmann
  1 sibling, 0 replies; 15+ messages in thread
From: Gerd Hoffmann @ 2016-01-19 14:19 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: Kevin Wolf, hdegoede, qemu-devel

  Hi,

> Whoops, sorry, should be 5.1/i386. On a 7.0 I observe same endless
> loop as you do.

Any chance the irq storm fix helps on 5.1 guests too?

I've installed 5.1.6 on a (virtual) usb-stick hooked up via ehci.
Works fine here, rebooted multiple times without any problems.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2016-01-19 14:19 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-09 17:34 [Qemu-devel] usb-storage assertions Andrey Korolyov
2016-01-12 14:17 ` Gerd Hoffmann
2016-01-12 14:36   ` Kevin Wolf
2016-01-12 14:56     ` Daniel P. Berrange
2016-01-13 16:13       ` Gerd Hoffmann
2016-01-13 16:28         ` Andrey Korolyov
2016-01-15 18:08           ` Andrey Korolyov
2016-01-18  9:38             ` Gerd Hoffmann
2016-01-18  9:50               ` Andrey Korolyov
2016-01-18 13:55                 ` Gerd Hoffmann
2016-01-18 23:49                   ` Andrey Korolyov
2016-01-19  7:13                     ` Gerd Hoffmann
2016-01-19 10:59                       ` Andrey Korolyov
2016-01-19 11:44                         ` Gerd Hoffmann
2016-01-19 14:19                     ` Gerd Hoffmann

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.