* staging: usbip: all I/O dies, how to debug?
@ 2011-08-29 15:23 Alexander Thomas
From: Alexander Thomas @ 2011-08-29 15:23 UTC
  To: linux-kernel

Hello,

I am experimenting with the usbip project that is currently in the
staging drivers tree. I have a particularly nasty problem with it.

Everything works, but at random moments all I/O dies system-wide.
In most cases this manifests itself as X11 freezing entirely, with a
hardware reset as the only way out. Sometimes certain programs like
xclock or top in a terminal will still show activity, but the machine
will not react to anything: no keyboard input, no ping, no ACPI
shutdown. The logs contain nothing useful except, sometimes, a message
about (S)ATA being reset. If I am at the console when it happens, it
shows the same (S)ATA reset, then attempts to remount the filesystem
read-only, and eventually it keeps repeating things like "Buffer I/O
error", "lost page write" and "unhandled error code
Result=DID_BAD_TARGET driverbyte=DRIVER_OK" every few dozen seconds.

After many experiments I have found that there is one condition that
must be met to trigger the crash: there must be simultaneous USB
traffic from the remote device and a local USB device (moving the
mouse will do). There are also a few other conditions that increase
the probability of it happening:
1. There is other heavy I/O traffic on the client, e.g. disk activity.
Compiling something is a good way to trigger the freeze.
2. The traffic is incoming, i.e. flowing from the remote device to the
client. Although I did manage to get a crash while playing sound to a
remote USB sound card, it took much longer than when recording sound.
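
Condition 1 can be approximated with a small load generator instead of a
compile job. The following Python sketch is illustrative only and was not
part of the experiments described here; the function name
`generate_disk_load` and its parameters are hypothetical. It assumes the
chosen directory lives on a real disk (not a tmpfs), and it only produces
the disk side of the load: the remote and local USB traffic still have to
be generated separately.

```python
import os
import tempfile
import time


def generate_disk_load(directory, duration_s=60.0, chunk_bytes=1 << 20):
    """Write and fsync chunks to a scratch file for about duration_s seconds.

    Returns the total number of bytes written. The scratch file is
    removed before the function returns.
    """
    chunk = os.urandom(chunk_bytes)
    total = 0
    deadline = time.monotonic() + duration_s
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        while True:
            total += os.write(fd, chunk)
            # Force the data out to the device instead of leaving it
            # in the page cache, so the disk really sees the traffic.
            os.fsync(fd)
            # Rewind so the scratch file stays about one chunk in size.
            os.lseek(fd, 0, os.SEEK_SET)
            if time.monotonic() >= deadline:
                break
    finally:
        os.close(fd)
        os.unlink(path)
    return total
```

Calling `generate_disk_load("/var/tmp", duration_s=600)` on the client while
recording from the remote USB device and moving a local USB mouse would
combine the mandatory condition with both aggravating ones. Whether this
triggers the freeze as readily as a compile job is untested.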

I have tested this with kernels ranging from 2.6.30 to 2.6.38 on two
different physical machines and inside a virtual machine. A possibly
important note is that I first had this exact same problem with a
commercial USB/IP product. I contacted the vendor but they say they
are unable to reproduce the problem. My disappointment was great when
I finally got the open source usbip working on older kernels only to
discover that it kills my system in the same way.

I have tried to debug in a virtual machine but this is problematic because:
a. it involves using a serial connection, which also dies together
with the rest.
b. there is no way to predict when the crash will happen. This is one
of those annoying, completely random bugs.
c. I have no experience in debugging at kernel/module level.

Is there anyone who can give pointers as to how to debug a problem
like this, and/or where to look for the cause?

Alexander
