linux-kernel.vger.kernel.org archive mirror
* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
       [not found]       ` <6pLll-5iq-15@gated-at.bofh.it>
@ 2006-06-21  0:07         ` Bodo Eggert
  2006-06-21 10:53           ` Alan Cox
  0 siblings, 1 reply; 18+ messages in thread
From: Bodo Eggert @ 2006-06-21  0:07 UTC (permalink / raw)
  To: Alan Cox, andi, Andrew Morton, gregkh, linux-kernel,
	linux-usb-devel, hal

Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> On Tue, 2006-06-20 at 11:05 +0200, Andreas Mohr wrote:

>> But how would HAL safely determine whether a (IDE/USB) drive is busy?
>> As my test app demonstrates (without HAL running), the *very first* open()
>> happening during an ongoing burning operation will kill it instantly, in the
>> USB case.
>> Are there any options left for HAL at all? Still seems to strongly point
>> towards a kernel issue so far.
> 
> In the IDE space O_EXCL has the needed semantics. At least it does on
> Fedora, and I don't think that's a Fedora patch. I'm not sure whether this
> is the case for the USB side of things.

This does not work, since O_EXCL does not work:
http://lkml.org/lkml/2006/2/5/137

Instead, I'd (try to) use mandatory locking and prevent open() etc. from
causing the bad commands to be sent.
-- 
I thank GMX for sabotaging the use of my addresses by means of lies
spread via SPF.

http://david.woodhou.se/why-not-spf.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-21  0:07         ` [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets) Bodo Eggert
@ 2006-06-21 10:53           ` Alan Cox
  2006-06-21 16:16             ` Bodo Eggert
  0 siblings, 1 reply; 18+ messages in thread
From: Alan Cox @ 2006-06-21 10:53 UTC (permalink / raw)
  To: 7eggert; +Cc: andi, Andrew Morton, gregkh, linux-kernel, linux-usb-devel, hal

On Wed, 2006-06-21 at 02:07 +0200, Bodo Eggert wrote:
> This does not work, since O_EXCL does not work:
> http://lkml.org/lkml/2006/2/5/137

It works fine. It's an advisory exclusive locking scheme, which is
precisely what is needed and precisely how some vendors implement their
solution.

There are good reasons for not having absolute locks, one of which is
that you might want to force a reset or a hot unplug of an interface
knowing you'll lose the CD it's burning (e.g. because your flight is about
to leave).

Alan
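
A minimal sketch of the cooperative O_EXCL check discussed in this message,
assuming a Linux block device node such as /dev/sr0. The helper name
device_is_free() is hypothetical, and the scheme only helps if the burning
program also opens the device with O_EXCL; an opener that omits the flag is
not stopped by it.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to claim the device exclusively.
 * Returns 1 if the exclusive open succeeded (the claim is released again),
 * 0 if another opener already holds the device (EBUSY),
 * -1 on any other error (e.g. the node does not exist). */
int device_is_free(const char *path)
{
    /* O_NONBLOCK lets the open succeed even without a medium in the drive. */
    int fd = open(path, O_RDONLY | O_NONBLOCK | O_EXCL);

    if (fd < 0)
        return errno == EBUSY ? 0 : -1;

    close(fd);
    return 1;
}
```

A polling daemon would call device_is_free("/dev/sr0") and skip its poll
unless the result is 1, so that it never touches a drive that a cooperating
burner has claimed.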


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-21 10:53           ` Alan Cox
@ 2006-06-21 16:16             ` Bodo Eggert
  2006-06-21 16:34               ` Andreas Mohr
  0 siblings, 1 reply; 18+ messages in thread
From: Bodo Eggert @ 2006-06-21 16:16 UTC (permalink / raw)
  To: Alan Cox
  Cc: 7eggert, andi, Andrew Morton, gregkh, linux-kernel, linux-usb-devel, hal

On Wed, 21 Jun 2006, Alan Cox wrote:
> On Wed, 2006-06-21 at 02:07 +0200, Bodo Eggert wrote:

> > This does not work, since O_EXCL does not work:
> > http://lkml.org/lkml/2006/2/5/137
> 
> It works fine. It's an advisory exclusive locking scheme which is
> precisely what is needed and precisely how some vendors implement their
> solution.

This will be as effective as "/var/lock/please-don't-touch-the-burner",
and the lock is more portable ...

> There are good reasons for not having absolute locks, one of which is
> that you might want to force a reset or a hot unplug of an interface
> knowing you'll lose the CD it's burning (e.g. because your flight is about
> to leave)

Killing cdrecord should take care of that lock.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-21 16:16             ` Bodo Eggert
@ 2006-06-21 16:34               ` Andreas Mohr
  2006-06-21 19:02                 ` Alan Stern
  0 siblings, 1 reply; 18+ messages in thread
From: Andreas Mohr @ 2006-06-21 16:34 UTC (permalink / raw)
  To: Bodo Eggert
  Cc: Alan Cox, andi, Andrew Morton, gregkh, linux-kernel,
	linux-usb-devel, hal

Hi,

On Wed, Jun 21, 2006 at 06:16:03PM +0200, Bodo Eggert wrote:
> On Wed, 21 Jun 2006, Alan Cox wrote:
> > On Wed, 2006-06-21 at 02:07 +0200, Bodo Eggert wrote:
> 
> > > This does not work, since O_EXCL does not work:
> > > http://lkml.org/lkml/2006/2/5/137
> > 
> > > It works fine. It's an advisory exclusive locking scheme which is
> > precisely what is needed and precisely how some vendors implement their
> > solution.
> 
> This will be as effective as "/var/lock/please-don't-touch-the-burner",
> and the lock is more portable ...

Indeed, until all(!) relevant apps specify the cooperative O_EXCL flag,
there will always be some trouble left somewhere...
And of course don't even dare trying to do a simple shell cat on the raw
I/O device during an ongoing burning operation, will you!?

Maybe it's better to (additionally?) go down the route of fixing up
low-level communication weaknesses (since it's been semi-confirmed that it's
a USB communication issue, see other thread part).
IMHO this is a severe user experience issue that shouldn't be fixed up
("covered", "hidden") by the O_EXCL thingy alone.

Andreas Mohr

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-21 16:34               ` Andreas Mohr
@ 2006-06-21 19:02                 ` Alan Stern
  2006-06-21 19:16                   ` Andreas Mohr
  2006-06-21 20:37                   ` Alan Cox
  0 siblings, 2 replies; 18+ messages in thread
From: Alan Stern @ 2006-06-21 19:02 UTC (permalink / raw)
  To: andi
  Cc: Bodo Eggert, Andrew Morton, linux-usb-devel, gregkh,
	linux-kernel, hal, Alan Cox

On Wed, 21 Jun 2006, Andreas Mohr wrote:

> Maybe it's better to (additionally?) go down the route of fixing up
> low-level communication weaknesses (since it's been semi-confirmed that it's
> a USB communication issue, see other thread part).
> IMHO this is a severe user experience issue that shouldn't be fixed up
> ("covered", "hidden") by the O_EXCL thingy alone.

It's not a USB issue; it's a matter of lack of coordination between the sg 
and sr drivers.  Each is unaware of the actions of the other, even when 
they are speaking to the same device.

Alan Stern


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-21 19:02                 ` Alan Stern
@ 2006-06-21 19:16                   ` Andreas Mohr
  2006-06-21 19:56                     ` Alan Stern
  2006-06-21 20:38                     ` Bodo Eggert
  2006-06-21 20:37                   ` Alan Cox
  1 sibling, 2 replies; 18+ messages in thread
From: Andreas Mohr @ 2006-06-21 19:16 UTC (permalink / raw)
  To: Alan Stern
  Cc: andi, Bodo Eggert, Andrew Morton, linux-usb-devel, gregkh,
	linux-kernel, hal, Alan Cox

Hi,

On Wed, Jun 21, 2006 at 03:02:44PM -0400, Alan Stern wrote:
> On Wed, 21 Jun 2006, Andreas Mohr wrote:
> 
> > Maybe it's better to (additionally?) go down the route of fixing up
> > low-level communication weaknesses (since it's been semi-confirmed that it's
> > a USB communication issue, see other thread part).
> > IMHO this is a severe user experience issue that shouldn't be fixed up
> > ("covered", "hidden") by the O_EXCL thingy alone.
> 
> It's not a USB issue; it's a matter of lack of coordination between the sg 
> and sr drivers.  Each is unaware of the actions of the other, even when 
> they are speaking to the same device.

Right, I could have expressed it much better before, sorry.

Found the relevant code:
sd.c sd_open()

        if (!sdkp->openers++ && sdev->removable) {
                if (scsi_block_when_processing_errors(sdev))
                        scsi_set_medium_removal(sdev, SCSI_REMOVAL_PREVENT);
        }

And the obvious question would be whether the sdkp->openers++ thingy
could somehow be extended to enclose all hardware device users so that
e.g. sr.c wouldn't send ALLOW_MEDIUM_REMOVAL on a device already locked
by e.g. the sd.c driver.
Difficult question, though, since the group of drivers possible to use
with a certain device is not a static set:
it could be via
- sr.c
- sd.c
- IDE (in the case of ATA devices mapped via ide-scsi)
- ???

Is it possible to have such a per-*hardware*-device instance in the kernel
to keep track of various things such as number of device openers?
I'll do some investigation myself, too...

Thanks!

Andreas Mohr

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-21 19:16                   ` Andreas Mohr
@ 2006-06-21 19:56                     ` Alan Stern
  2006-06-21 20:38                     ` Bodo Eggert
  1 sibling, 0 replies; 18+ messages in thread
From: Alan Stern @ 2006-06-21 19:56 UTC (permalink / raw)
  To: andi
  Cc: Bodo Eggert, Andrew Morton, linux-usb-devel, gregkh,
	linux-kernel, hal, Alan Cox

On Wed, 21 Jun 2006, Andreas Mohr wrote:

> > It's not a USB issue; it's a matter of lack of coordination between the sg 
> > and sr drivers.  Each is unaware of the actions of the other, even when 
> > they are speaking to the same device.
> 
> Right, I could have expressed it much better before, sorry.
> 
> Found the relevant code:
> sd.c sd_open()
> 
>         if (!sdkp->openers++ && sdev->removable) {
>                 if (scsi_block_when_processing_errors(sdev))
>                         scsi_set_medium_removal(sdev, SCSI_REMOVAL_PREVENT);
>         }

Um, this isn't the relevant code.  You're interested in sr.c, not sd.c.  
Furthermore, the actual ALLOW MEDIUM REMOVAL command is caused by code in 
drivers/cdrom/cdrom.c:cdrom_release().  This needs to be coordinated (the 
cdi->use_count variable) with the sg driver.

> And the obvious question would be whether the sdkp->openers++ thingy
> could somehow be extended to enclose all hardware device users so that
> e.g. sr.c wouldn't send ALLOW_MEDIUM_REMOVAL on a device already locked
> by e.g. the sd.c driver.
> Difficult question, though, since the group of drivers possible to use
> with a certain device is not a static set:
> it could be via
> - sr.c
> - sd.c
> - IDE (in the case of ATA devices mapped via ide-scsi)
> - ???
> 
> Is it possible to have such a per-*hardware*-device instance in the kernel
> to keep track of various things such as number of device openers?
> I'll do some investigation myself, too...

Look at include/scsi/scsi_device.h.  There's plenty of opportunity for 
adding an additional counter.

Alan Stern
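
A user-space sketch of the shared-counter idea discussed above: every
upper-level driver bumps one per-hardware-device count, PREVENT is sent only
on the very first open, and ALLOW only on the very last release. Everything
here is an assumption made for illustration: struct hw_device, hw_open(),
hw_release() and set_medium_removal() are hypothetical names, not the real
struct scsi_device or any existing kernel interface, and a kernel version
would also need locking around the counter.

```c
#include <stdio.h>

/* Hypothetical shared state; NOT the real struct scsi_device. */
struct hw_device {
    int openers;      /* opens from all upper-level drivers combined */
    int door_locked;  /* 1 after PREVENT, 0 after ALLOW */
    int cmds_sent;    /* medium-removal commands actually issued */
};

/* Stand-in for issuing PREVENT/ALLOW MEDIUM REMOVAL to the drive. */
void set_medium_removal(struct hw_device *dev, int prevent)
{
    dev->door_locked = prevent;
    dev->cmds_sent++;
    printf("%s MEDIUM REMOVAL\n", prevent ? "PREVENT" : "ALLOW");
}

/* Called from every driver's open path (sr, sg, sd, ...). */
void hw_open(struct hw_device *dev)
{
    /* Only the very first opener locks the door. */
    if (dev->openers++ == 0)
        set_medium_removal(dev, 1);
}

/* Called from every driver's release path. */
void hw_release(struct hw_device *dev)
{
    /* Only the very last closer unlocks the door. */
    if (--dev->openers == 0)
        set_medium_removal(dev, 0);
}
```

With this model, a poller that opens and closes the device while a burner
still has it open sends no medium-removal command at all, because the count
never drops to zero in between.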


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-21 19:02                 ` Alan Stern
  2006-06-21 19:16                   ` Andreas Mohr
@ 2006-06-21 20:37                   ` Alan Cox
  1 sibling, 0 replies; 18+ messages in thread
From: Alan Cox @ 2006-06-21 20:37 UTC (permalink / raw)
  To: Alan Stern
  Cc: andi, Bodo Eggert, Andrew Morton, linux-usb-devel, gregkh,
	linux-kernel, hal

On Wed, 2006-06-21 at 15:02 -0400, Alan Stern wrote:
> It's not a USB issue; it's a matter of lack of coordination between the sg 
> and sr drivers.  Each is unaware of the actions of the other, even when 
> they are speaking to the same device.

That's a relevant issue, but sg is irrelevant for CD burning except with
various ancient software setups. Probably sg/sr should share the O_EXCL
locking, but it's not part of the CD burning stuff for modern setups.

Alan


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-21 19:16                   ` Andreas Mohr
  2006-06-21 19:56                     ` Alan Stern
@ 2006-06-21 20:38                     ` Bodo Eggert
  1 sibling, 0 replies; 18+ messages in thread
From: Bodo Eggert @ 2006-06-21 20:38 UTC (permalink / raw)
  To: andi
  Cc: Alan Stern, Bodo Eggert, Andrew Morton, linux-usb-devel, gregkh,
	linux-kernel, hal, Alan Cox

On Wed, 21 Jun 2006, Andreas Mohr wrote:

[...]
> And the obvious question would be whether the sdkp->openers++ thingy
> could somehow be extended to enclose all hardware device users so that
> e.g. sr.c wouldn't send ALLOW_MEDIUM_REMOVAL on a device already locked
> by e.g. the sd.c driver.
> Difficult question, though, since the group of drivers possible to use
> with a certain device is not a static set:
> it could be via
> - sr.c
> - sd.c
> - IDE (in the case of ATA devices mapped via ide-scsi)

> Is it possible to have such a per-*hardware*-device instance in the kernel
> to keep track of various things such as number of device openers?
> I'll do some investigation myself, too...

The sg part should be implemented by each SCSI device, reducing the 
current sg device to a mostly empty shell. Then you can prevent that
empty shell from binding to devices having more specific drivers.
-- 
Top 100 things you don't want the sysadmin to say:
30. And what does it mean 'rm: .o: No such file or directory'?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-21 19:06               ` Alan Stern
@ 2006-06-21 20:52                 ` Alan Cox
  0 siblings, 0 replies; 18+ messages in thread
From: Alan Cox @ 2006-06-21 20:52 UTC (permalink / raw)
  To: Alan Stern
  Cc: andi, Andrew Morton, gregkh, linux-kernel, hal, linux-usb-devel

On Wed, 2006-06-21 at 15:06 -0400, Alan Stern wrote:
> > cdrecord is -dev=0,0,0 (whatever Linux device file this translates into)
> > or a similar device ID as returned by -scanbus.
> 
> That goes through the sg driver.

Use a cdrecord that understands SG_IO and dev=/dev/sr0
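
For illustration, a minimal sketch of what issuing a SCSI command through
SG_IO on the block device node (rather than through /dev/sg*) can look like,
assuming Linux and the sg_io_hdr_t interface from <scsi/sg.h>. The helper
name test_unit_ready() and the 5 second timeout are arbitrary choices made
here. It is not something to run against a drive that is in the middle of a
burn, since opening and closing the node has side effects of its own.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <scsi/sg.h>

/* Hypothetical helper: send TEST UNIT READY through the SG_IO ioctl.
 * Returns -1 if the device cannot be opened or the ioctl fails,
 * otherwise the SCSI status byte (0 means GOOD). */
int test_unit_ready(const char *path)
{
    unsigned char cdb[6] = { 0x00, 0, 0, 0, 0, 0 };  /* TEST UNIT READY */
    unsigned char sense[32];
    sg_io_hdr_t hdr;
    int fd, rc;

    fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        return -1;

    memset(&hdr, 0, sizeof(hdr));
    hdr.interface_id = 'S';               /* required by the sg interface */
    hdr.cmd_len = sizeof(cdb);
    hdr.cmdp = cdb;
    hdr.dxfer_direction = SG_DXFER_NONE;  /* command carries no data */
    hdr.sbp = sense;                      /* buffer for sense data */
    hdr.mx_sb_len = sizeof(sense);
    hdr.timeout = 5000;                   /* milliseconds */

    rc = ioctl(fd, SG_IO, &hdr);
    close(fd);
    return rc < 0 ? -1 : hdr.status;
}
```

Called as test_unit_ready("/dev/sr0"), the command goes through the same
device node that HAL-style pollers use, instead of through a separate sg
node.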


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-21 16:44             ` Andreas Mohr
@ 2006-06-21 19:06               ` Alan Stern
  2006-06-21 20:52                 ` Alan Cox
  0 siblings, 1 reply; 18+ messages in thread
From: Alan Stern @ 2006-06-21 19:06 UTC (permalink / raw)
  To: andi; +Cc: Alan Cox, Andrew Morton, gregkh, linux-kernel, hal, linux-usb-devel

On Wed, 21 Jun 2006, Andreas Mohr wrote:

> > The real problem seems to be that the device is reachable in two different 
> > ways, and they don't implement proper mutual exclusion.  HAL (or your test 
> > program) is undoubtedly using /dev/sr0 or something similar, whereas 
> > cdrecord uses /dev/sg0.  Going through two different drivers, it's no 
> > surprise they wind up interfering with each other.
> 
> HAL is /dev/host0/.../cd

That goes through the sr driver.

> cdrecord is -dev=0,0,0 (whatever Linux device file this translates into)
> or a similar device ID as returned by -scanbus.

That goes through the sg driver.

> Probably (stating the obvious here, I'm afraid) we should only send
> non-ALLOW_MEDIUM_REMOVAL for the *very first* device open,
> and then send ALLOW_MEDIUM_REMOVAL after the *very last* device close only.
> 
> So you think that with sr and sg drivers both talking to the device,
> proper inter-driver device tracking is not doable or quite difficult
> to implement?

Well, it's not being done now.  I suspect it wouldn't be too difficult 
technically.  The hardest part might be to obtain the agreement of the 
SCSI and CDROM developers.  :-)

Alan Stern


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-21 16:15           ` Alan Stern
@ 2006-06-21 16:44             ` Andreas Mohr
  2006-06-21 19:06               ` Alan Stern
  0 siblings, 1 reply; 18+ messages in thread
From: Andreas Mohr @ 2006-06-21 16:44 UTC (permalink / raw)
  To: Alan Stern
  Cc: andi, Alan Cox, Andrew Morton, gregkh, linux-kernel, hal,
	linux-usb-devel

Hi,

On Wed, Jun 21, 2006 at 12:15:08PM -0400, Alan Stern wrote:
> On Wed, 21 Jun 2006, Andreas Mohr wrote:
> > - TEST_UNIT_READY
> > - TEST_UNIT_READY
> > - READ_TOC (failure?)
> 
> I don't know why this failed.  Maybe the disc didn't have a valid Table of 
> Contents.

Ah, silly me, I should have stated that this was a simulation burn on an
otherwise rather blank disc ;)


> > - WRITE_10 (ok!)
> > - ALLOW_MEDIUM_REMOVAL (ok!)
> > - WRITE_10 (*** FAILURE! ***)
> > - going downhill from here...
> > 
> > 
> > So what could be the problem here?
> > READ_TOC might be it, but then it might be fully ok to have it fail
> > (after all it's non-valid data content), so ALLOW_MEDIUM_REMOVAL would be the
> > problem then? (next WRITE_10 FAILS!).
> 
> It sure does look like the ALLOW_MEDIUM_REMOVAL is the cause of the 
> problem.

Yup, already was quite sure of that after having written the previous mail.

I'll try to verify this by simply removing all ALLOW_MEDIUM_REMOVAL calls ;)


> > I could be totally wrong, though, since I don't have much storage debugging
> > experience.
> > 
> > 
> > A good idea would be to further check whether it's the open() or the close()
> > which disrupts burning for me.
> 
> Yep.  The ALLOW_MEDIUM_REMOVAL occurs as part of handling the close().  
> And you can understand a CD drive not wanting to carry out a long write 
> when the door is unlocked.
> 
> The real problem seems to be that the device is reachable in two different 
> ways, and they don't implement proper mutual exclusion.  HAL (or your test 
> program) is undoubtedly using /dev/sr0 or something similar, whereas 
> cdrecord uses /dev/sg0.  Going through two different drivers, it's no 
> surprise they wind up interfering with each other.

HAL is /dev/host0/.../cd
cdrecord is -dev=0,0,0 (whatever Linux device file this translates into)
or a similar device ID as returned by -scanbus.


Probably (stating the obvious here, I'm afraid) we should only send
non-ALLOW_MEDIUM_REMOVAL for the *very first* device open,
and then send ALLOW_MEDIUM_REMOVAL after the *very last* device close only.

So you think that with sr and sg drivers both talking to the device,
proper inter-driver device tracking is not doable or quite difficult
to implement?


> Unfortunately I can't debug this without seeing the start of the oops 
> message.

[OOPS output of a *different* issue]

Right, it's a rather incomplete OOPS. Let me try to get one with a nice
long-line VGA mode soon...

Thanks!

Andreas Mohr

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-21  9:33         ` Andreas Mohr
@ 2006-06-21 16:15           ` Alan Stern
  2006-06-21 16:44             ` Andreas Mohr
  0 siblings, 1 reply; 18+ messages in thread
From: Alan Stern @ 2006-06-21 16:15 UTC (permalink / raw)
  To: andi; +Cc: Alan Cox, Andrew Morton, gregkh, linux-kernel, hal, linux-usb-devel

On Wed, 21 Jun 2006, Andreas Mohr wrote:

> OK, at http://lisas.de/~andi/temp_usb/ there are logs:
> debug.log.gz: burning, then running my open() test app: test app started
> at 22:43:49, disrupted burning process.
> only_device_open.log.gz: device open() *only* (plus close()!),
> no other USB activity such as burning happening (for comparison purposes, to see
> what a usual open()/close() is like)
> 
> only_device_open.log.gz (and close()!) contains:
> - TEST_UNIT_READY (failure?)

Normal failure.  This indicates that the media has been changed since the
last I/O operation.

> - TEST_UNIT_READY
> - TEST_UNIT_READY
> - READ_TOC (failure?)

I don't know why this failed.  Maybe the disc didn't have a valid Table of 
Contents.

> - ALLOW_MEDIUM_REMOVAL
> - (unknown command) !!!!!!!
> - TEST_UNIT_READY
> - READ_TOC (failure?)
> - READ_TOC (failure?)

These failed for the same reason as the earlier READ_TOC.

> - READ_CAPACITY
> - ALLOW_MEDIUM_REMOVAL
> 
> Hmm, multiple failures in there: might be cable issues??

No.  If the cable was a problem then all the commands (not just READ_TOC)  
would have gotten errors (not failures).

> debug.log.gz contains:
> *** ongoing burning: ***
> - lots of WRITE_10 (NO failure!)
> - READ BUFFER CAPACITY
> - lots of WRITE_10 (NO failure!)
> - READ BUFFER CAPACITY
> - a couple WRITE_10 (NO failure!)
> *** [22:43:49] device open(): ***
> - TEST_UNIT_READY (ok!)
> - WRITE_10 (ok!!)
> - TEST_UNIT_READY (ok!)
> - WRITE_10 (ok!)
> - READ_TOC (*** ERROR!! ***)

This was a different sort of error.  The code was "Logical unit not ready, 
long write in progress", which makes sense.

> - WRITE_10 (ok!)
> - ALLOW_MEDIUM_REMOVAL (ok!)
> - WRITE_10 (*** FAILURE! ***)
> - going downhill from here...
> 
> 
> So what could be the problem here?
> READ_TOC might be it, but then it might be fully ok to have it fail
> (after all it's non-valid data content), so ALLOW_MEDIUM_REMOVAL would be the
> problem then? (next WRITE_10 FAILS!).

It sure does look like the ALLOW_MEDIUM_REMOVAL is the cause of the 
problem.

> I could be totally wrong, though, since I don't have much storage debugging
> experience.
> 
> 
> A good idea would be to further check whether it's the open() or the close()
> which disrupts burning for me.

Yep.  The ALLOW_MEDIUM_REMOVAL occurs as part of handling the close().  
And you can understand a CD drive not wanting to carry out a long write 
when the door is unlocked.

The real problem seems to be that the device is reachable in two different 
ways, and they don't implement proper mutual exclusion.  HAL (or your test 
program) is undoubtedly using /dev/sr0 or something similar, whereas 
cdrecord uses /dev/sg0.  Going through two different drivers, it's no 
surprise they wind up interfering with each other.


> Oh, and that burner_switchoff_oops.jpg in the same directory
> is an OOPS that happened when I tried to blank a CDRW,
> then cancelled the operation (2x Ctrl-C on cdrecord),
> but then had HAL device polling daemon and my test app block on I/O wait on the
> device that continued blanking the CDRW. Since I then didn't want to wait
> for the blanking to finish I had to switch off the device: immediate OOPS,
> possibly due to mis-handling the two processes still waiting on a busy device
> which then got switched off completely. Kernel 2.6.17-rc6-mm2.

Unfortunately I can't debug this without seeing the start of the oops 
message.

Alan Stern


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-20 14:22       ` Alan Stern
@ 2006-06-21  9:33         ` Andreas Mohr
  2006-06-21 16:15           ` Alan Stern
  0 siblings, 1 reply; 18+ messages in thread
From: Andreas Mohr @ 2006-06-21  9:33 UTC (permalink / raw)
  To: Alan Stern
  Cc: andi, Alan Cox, Andrew Morton, gregkh, linux-kernel, hal,
	linux-usb-devel

Hi,

On Tue, Jun 20, 2006 at 10:22:57AM -0400, Alan Stern wrote:
> On Tue, 20 Jun 2006, Andreas Mohr wrote:
> 
> > But how would HAL safely determine whether a (IDE/USB) drive is busy?
> > As my test app demonstrates (without HAL running), the *very first* open()
> > happening during an ongoing burning operation will kill it instantly, in the
> > USB case.
> > Are there any options left for HAL at all? Still seems to strongly point
> > towards a kernel issue so far.
> > 
> > One (rather less desirable) way I can make up might be to have HAL
> > keep the device open permanently and do an ioctl query on whether it's "busy"
> > and then quickly close the device again before the newly started
> > burning process gets disrupted (if this even properly works at all).
> 
> The open() call is not in itself the problem.
> 
> I would guess that the problem is sparked by the TEST UNIT READY command
> automatically sent when the device file is opened.  Although a drive
> should have no difficulty handling this command while carrying out a burn,
> apparently yours aborts.  In other words, this is likely to be a firmware 
> problem in the CD drive.

OK, at http://lisas.de/~andi/temp_usb/ there are logs:
debug.log.gz: burning, then running my open() test app: test app started
at 22:43:49, disrupted burning process.
only_device_open.log.gz: device open() *only* (plus close()!),
no other USB activity such as burning happening (for comparison purposes, to see
what a usual open()/close() is like)

only_device_open.log.gz (and close()!) contains:
- TEST_UNIT_READY (failure?)
- TEST_UNIT_READY
- TEST_UNIT_READY
- READ_TOC (failure?)
- ALLOW_MEDIUM_REMOVAL
- (unknown command) !!!!!!!
- TEST_UNIT_READY
- READ_TOC (failure?)
- READ_TOC (failure?)
- READ_CAPACITY
- ALLOW_MEDIUM_REMOVAL

Hmm, multiple failures in there: might be cable issues??


debug.log.gz contains:
*** ongoing burning: ***
- lots of WRITE_10 (NO failure!)
- READ BUFFER CAPACITY
- lots of WRITE_10 (NO failure!)
- READ BUFFER CAPACITY
- a couple WRITE_10 (NO failure!)
*** [22:43:49] device open(): ***
- TEST_UNIT_READY (ok!)
- WRITE_10 (ok!!)
- TEST_UNIT_READY (ok!)
- WRITE_10 (ok!)
- READ_TOC (*** ERROR!! ***)
- WRITE_10 (ok!)
- ALLOW_MEDIUM_REMOVAL (ok!)
- WRITE_10 (*** FAILURE! ***)
- going downhill from here...


So what could be the problem here?
READ_TOC might be it, but then it might be fully ok to have it fail
(after all it's non-valid data content), so ALLOW_MEDIUM_REMOVAL would be the
problem then? (next WRITE_10 FAILS!).

I could be totally wrong, though, since I don't have much storage debugging
experience.


A good idea would be to further check whether it's the open() or the close()
which disrupts burning for me.


> I can't tell what's going on with the USB HDD since you haven't provided 
> any information.

I'd like to, but can't since I don't have device access any more.


Oh, and that burner_switchoff_oops.jpg in the same directory
is an OOPS that happened when I tried to blank a CDRW,
then cancelled the operation (2x Ctrl-C on cdrecord),
but then had HAL device polling daemon and my test app block on I/O wait on the
device that continued blanking the CDRW. Since I then didn't want to wait
for the blanking to finish I had to switch off the device: immediate OOPS,
possibly due to mis-handling the two processes still waiting on a busy device
which then got switched off completely. Kernel 2.6.17-rc6-mm2.

Thanks!

Andreas Mohr

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-20  9:05     ` Andreas Mohr
  2006-06-20 10:18       ` Alan Cox
@ 2006-06-20 14:22       ` Alan Stern
  2006-06-21  9:33         ` Andreas Mohr
  1 sibling, 1 reply; 18+ messages in thread
From: Alan Stern @ 2006-06-20 14:22 UTC (permalink / raw)
  To: andi; +Cc: Alan Cox, Andrew Morton, gregkh, linux-kernel, hal, linux-usb-devel

On Tue, 20 Jun 2006, Andreas Mohr wrote:

> But how would HAL safely determine whether a (IDE/USB) drive is busy?
> As my test app demonstrates (without HAL running), the *very first* open()
> happening during an ongoing burning operation will kill it instantly, in the
> USB case.
> Are there any options left for HAL at all? Still seems to strongly point
> towards a kernel issue so far.
> 
> One (rather less desirable) way I can make up might be to have HAL
> keep the device open permanently and do an ioctl query on whether it's "busy"
> and then quickly close the device again before the newly started
> burning process gets disrupted (if this even properly works at all).

The open() call is not in itself the problem.

I would guess that the problem is sparked by the TEST UNIT READY command
automatically sent when the device file is opened.  Although a drive
should have no difficulty handling this command while carrying out a burn,
apparently yours aborts.  In other words, this is likely to be a firmware 
problem in the CD drive.

I can't tell what's going on with the USB HDD since you haven't provided 
any information.

If you want to find out what's actually happening instead of just 
guessing, turn on CONFIG_USB_STORAGE_DEBUG and see what the kernel log has 
to say for the time when the underrun/reset occurs.

Alan Stern


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-20  9:05     ` Andreas Mohr
@ 2006-06-20 10:18       ` Alan Cox
  2006-06-20 14:22       ` Alan Stern
  1 sibling, 0 replies; 18+ messages in thread
From: Alan Cox @ 2006-06-20 10:18 UTC (permalink / raw)
  To: andi; +Cc: Andrew Morton, gregkh, linux-kernel, linux-usb-devel, hal

On Tue, 2006-06-20 at 11:05 +0200, Andreas Mohr wrote:
> But how would HAL safely determine whether a (IDE/USB) drive is busy?
> As my test app demonstrates (without HAL running), the *very first* open()
> happening during an ongoing burning operation will kill it instantly, in the
> USB case.
> Are there any options left for HAL at all? Still seems to strongly point
> towards a kernel issue so far.

In the IDE space O_EXCL has the needed semantics. At least it does on
Fedora, and I don't think that's a Fedora patch. I'm not sure whether this
is the case for the USB side of things.

> One (rather less desirable) way I can make up might be to have HAL
> keep the device open permanently and do an ioctl query on whether it's "busy"
> and then quickly close the device again before the newly started
> burning process gets disrupted (if this even properly works at all).

O_EXCL used by cdrecord is probably the right thing.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-20  8:37 ` Andrew Morton
@ 2006-06-20  9:06   ` Alan Cox
  2006-06-20  9:05     ` Andreas Mohr
  0 siblings, 1 reply; 18+ messages in thread
From: Alan Cox @ 2006-06-20  9:06 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Andreas Mohr, gregkh, linux-kernel, linux-usb-devel, hal

On Tue, 2006-06-20 at 01:37 -0700, Andrew Morton wrote:
> [hald polling causes cdrecord to go bad on a USB CD drive]
> 
> One possible reason is that we're shooting down the device's pagecache by
> accident as a result of hald activity. 

On IDE hal causes problems with some drives because the additional
commands sent while the drive is busy end up timing out which triggers a
bus reset and breaks everything. Really HAL should have better manners
than to poll a drive that is busy.

Alan


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets)
  2006-06-20  9:06   ` [linux-usb-devel] " Alan Cox
@ 2006-06-20  9:05     ` Andreas Mohr
  2006-06-20 10:18       ` Alan Cox
  2006-06-20 14:22       ` Alan Stern
  0 siblings, 2 replies; 18+ messages in thread
From: Andreas Mohr @ 2006-06-20  9:05 UTC (permalink / raw)
  To: Alan Cox; +Cc: Andrew Morton, gregkh, linux-kernel, linux-usb-devel, hal

Hi,

On Tue, Jun 20, 2006 at 10:06:56AM +0100, Alan Cox wrote:
> On Tue, 2006-06-20 at 01:37 -0700, Andrew Morton wrote:
> > [hald polling causes cdrecord to go bad on a USB CD drive]
> > 
> > One possible reason is that we're shooting down the device's pagecache by
> > accident as a result of hald activity. 
> 
> On IDE hal causes problems with some drives because the additional
> commands sent while the drive is busy end up timing out which triggers a
> bus reset and breaks everything. Really HAL should have better manners
> than to poll a drive that is busy.

But how would HAL safely determine whether a (IDE/USB) drive is busy?
As my test app demonstrates (without HAL running), the *very first* open()
happening during an ongoing burning operation will kill it instantly, in the
USB case.
Are there any options left for HAL at all? Still seems to strongly point
towards a kernel issue so far.

One (rather less desirable) way I can make up might be to have HAL
keep the device open permanently and do an ioctl query on whether it's "busy"
and then quickly close the device again before the newly started
burning process gets disrupted (if this even properly works at all).

Andreas Mohr

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2006-06-21 20:39 UTC | newest]

Thread overview: 18+ messages
     [not found] <6pnj7-32Q-7@gated-at.bofh.it>
     [not found] ` <6pJWg-34g-5@gated-at.bofh.it>
     [not found]   ` <6pKfL-3sx-29@gated-at.bofh.it>
     [not found]     ` <6pKpl-3Sx-23@gated-at.bofh.it>
     [not found]       ` <6pLll-5iq-15@gated-at.bofh.it>
2006-06-21  0:07         ` [linux-usb-devel] USB/hal: USB open() broken? (USB CD burner underruns, USB HDD hard resets) Bodo Eggert
2006-06-21 10:53           ` Alan Cox
2006-06-21 16:16             ` Bodo Eggert
2006-06-21 16:34               ` Andreas Mohr
2006-06-21 19:02                 ` Alan Stern
2006-06-21 19:16                   ` Andreas Mohr
2006-06-21 19:56                     ` Alan Stern
2006-06-21 20:38                     ` Bodo Eggert
2006-06-21 20:37                   ` Alan Cox
2006-06-19  8:21 Andreas Mohr
2006-06-20  8:37 ` Andrew Morton
2006-06-20  9:06   ` [linux-usb-devel] " Alan Cox
2006-06-20  9:05     ` Andreas Mohr
2006-06-20 10:18       ` Alan Cox
2006-06-20 14:22       ` Alan Stern
2006-06-21  9:33         ` Andreas Mohr
2006-06-21 16:15           ` Alan Stern
2006-06-21 16:44             ` Andreas Mohr
2006-06-21 19:06               ` Alan Stern
2006-06-21 20:52                 ` Alan Cox
