linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure)
       [not found] <1067633171.3886.1.camel@m70.net81-64-235.noos.fr>
@ 2003-11-01 15:47 ` Alan Stern
  2003-11-04  7:49   ` Jens Axboe
  0 siblings, 1 reply; 15+ messages in thread
From: Alan Stern @ 2003-11-01 15:47 UTC (permalink / raw)
  To: Nicolas Mailhot; +Cc: USB development list, linux-kernel

On Fri, 31 Oct 2003, Nicolas Mailhot wrote:

> Le ven 31/10/2003 à 19:33, bugme-daemon@osdl.org a écrit :
> > ------- Additional Comments From stern@rowland.harvard.edu  2003-10-31 10:33 -------
> > Created an attachment (id=1303)
> >  --> (http://bugme.osdl.org/attachment.cgi?id=1303&action=view)
> > New diagnostic patch
> > 
> > Try using this diagnostic patch instead of the one sent earlier by Matt Dharm
> > in attachment #14.  It may reveal that the source of the problem is not in the
> > USB system but higher up.
> 
> Done. Do you want to be CCed on this bug ?

There's no need.  The output from that diagnostic is definitive -- this 
isn't a USB problem.  It's got to be a problem higher up, maybe in the 
SCSI layer but more likely in the block layer or the file system code.

Oct 31 21:12:09 rousalka kernel: usb-storage: READ_10: read page 160 pagect 8
Oct 31 21:12:09 rousalka kernel: usb-storage: sddr09_read_data: use_sg = 1, sg[0].page = c18c5b40, .offset = 0, sg_address = 00000000
Oct 31 21:12:09 rousalka kernel: usb-storage: Read 8 pages, from PBA 11 (LBA 5) page 0
Oct 31 21:12:09 rousalka kernel: usb-storage: usb_stor_ctrl_transfer: rq=00 rqtype=41 value=0000 index=00 len=12
Oct 31 21:12:09 rousalka kernel: usb-storage: Status code 0; transferred 12/12
Oct 31 21:12:09 rousalka kernel: usb-storage: -- transfer complete
Oct 31 21:12:09 rousalka kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 4096 bytes
Oct 31 21:12:09 rousalka kernel: usb-storage: Status code 0; transferred 4096/4096
Oct 31 21:12:09 rousalka kernel: usb-storage: -- transfer complete
Oct 31 21:12:09 rousalka kernel: usb-storage: us_copy_to_sgbuf: buffer = f38e6000, len = 4096, content = f7e01e40, *index = 0, *offset = 0, use_sg = 1
Oct 31 21:12:09 rousalka kernel: usb-storage: in loop, ptr = 00000000

The line where it says sg_address = 00000000 indicates that something has
asked us to read data from the device and store it at a virtual address
that isn't mapped to memory!  You can see the same thing in the last line,
where ptr = 00000000.  No wonder you get an oops.

This bug should be reassigned, but I don't know to whom.  Maybe someone on 
the linux-kernel mailing list can help.

Alan Stern


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure)
  2003-11-01 15:47 ` [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure) Alan Stern
@ 2003-11-04  7:49   ` Jens Axboe
  2003-11-04 17:33     ` Alan Stern
  0 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2003-11-04  7:49 UTC (permalink / raw)
  To: Alan Stern; +Cc: Nicolas Mailhot, USB development list, linux-kernel

On Sat, Nov 01 2003, Alan Stern wrote:
> On Fri, 31 Oct 2003, Nicolas Mailhot wrote:
> 
> > Le ven 31/10/2003 à 19:33, bugme-daemon@osdl.org a écrit :
> > > ------- Additional Comments From stern@rowland.harvard.edu  2003-10-31 10:33 -------
> > > Created an attachment (id=1303)
> > >  --> (http://bugme.osdl.org/attachment.cgi?id=1303&action=view)
> > > New diagnostic patch
> > > 
> > > Try using this diagnostic patch instead of the one sent earlier by Matt Dharm
> > > in attachment #14.  It may reveal that the source of the problem is not in the
> > > USB system but higher up.
> > 
> > Done. Do you want to be CCed on this bug ?
> 
> There's no need.  The output from that diagnostic is definitive -- this 
> isn't a USB problem.  It's got to be a problem higher up, maybe in the 
> SCSI layer but more likely in the block layer or the file system code.
> 
> Oct 31 21:12:09 rousalka kernel: usb-storage: READ_10: read page 160 pagect 8
> Oct 31 21:12:09 rousalka kernel: usb-storage: sddr09_read_data: use_sg = 1, sg[0].page = c18c5b40, .offset = 0, sg_address = 00000000
> Oct 31 21:12:09 rousalka kernel: usb-storage: Read 8 pages, from PBA 11 (LBA 5) page 0
> Oct 31 21:12:09 rousalka kernel: usb-storage: usb_stor_ctrl_transfer: rq=00 rqtype=41 value=0000 index=00 len=12
> Oct 31 21:12:09 rousalka kernel: usb-storage: Status code 0; transferred 12/12
> Oct 31 21:12:09 rousalka kernel: usb-storage: -- transfer complete
> Oct 31 21:12:09 rousalka kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 4096 bytes
> Oct 31 21:12:09 rousalka kernel: usb-storage: Status code 0; transferred 4096/4096
> Oct 31 21:12:09 rousalka kernel: usb-storage: -- transfer complete
> Oct 31 21:12:09 rousalka kernel: usb-storage: us_copy_to_sgbuf: buffer = f38e6000, len = 4096, content = f7e01e40, *index = 0, *offset = 0, use_sg = 1
> Oct 31 21:12:09 rousalka kernel: usb-storage: in loop, ptr = 00000000
> 
> The line where it says sg_address = 00000000 indicates that something has
> asked us to read data from the device and store it at a virtual address
> that isn't mapped to memory!  You can see the same thing in the last line,
> where ptr = 00000000.  No wonder you get an oops.

I don't think your analysis is correct. The sg entry has a page element
assigned, it just looks like someone didn't dma map the sg entry yet. I
had a quick peek in sddr09, lots of ugliness in there... The
sg_address() crap needs to go. And kfree() on ptr + offset, irk. And I
hope you are not dma'ing into these pages. Missing checks for kmalloc()
failure. Etc.

Someone needs to read up on DMA-mapping.txt and fix it.  Clean these
bits up and I bet it works.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure)
  2003-11-04  7:49   ` Jens Axboe
@ 2003-11-04 17:33     ` Alan Stern
  2003-11-05  8:40       ` Jens Axboe
  0 siblings, 1 reply; 15+ messages in thread
From: Alan Stern @ 2003-11-04 17:33 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Nicolas Mailhot, USB development list, linux-kernel

On Tue, 4 Nov 2003, Jens Axboe wrote:

> On Sat, Nov 01 2003, Alan Stern wrote:
> > 
> > There's no need.  The output from that diagnostic is definitive -- this 
> > isn't a USB problem.  It's got to be a problem higher up, maybe in the 
> > SCSI layer but more likely in the block layer or the file system code.
> > 
> > Oct 31 21:12:09 rousalka kernel: usb-storage: READ_10: read page 160 pagect 8
> > Oct 31 21:12:09 rousalka kernel: usb-storage: sddr09_read_data: use_sg = 1, sg[0].page = c18c5b40, .offset = 0, sg_address = 00000000
> > Oct 31 21:12:09 rousalka kernel: usb-storage: Read 8 pages, from PBA 11 (LBA 5) page 0
> > Oct 31 21:12:09 rousalka kernel: usb-storage: usb_stor_ctrl_transfer: rq=00 rqtype=41 value=0000 index=00 len=12
> > Oct 31 21:12:09 rousalka kernel: usb-storage: Status code 0; transferred 12/12
> > Oct 31 21:12:09 rousalka kernel: usb-storage: -- transfer complete
> > Oct 31 21:12:09 rousalka kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 4096 bytes
> > Oct 31 21:12:09 rousalka kernel: usb-storage: Status code 0; transferred 4096/4096
> > Oct 31 21:12:09 rousalka kernel: usb-storage: -- transfer complete
> > Oct 31 21:12:09 rousalka kernel: usb-storage: us_copy_to_sgbuf: buffer = f38e6000, len = 4096, content = f7e01e40, *index = 0, *offset = 0, use_sg = 1
> > Oct 31 21:12:09 rousalka kernel: usb-storage: in loop, ptr = 00000000
> > 
> > The line where it says sg_address = 00000000 indicates that something has
> > asked us to read data from the device and store it at a virtual address
> > that isn't mapped to memory!  You can see the same thing in the last line,
> > where ptr = 00000000.  No wonder you get an oops.
> 
> I don't think your analysis is correct. The sg entry has a page element
> assigned, it just looks like someone didn't dma map the sg entry yet. I
> had a quick peek in sddr09, lots of ugliness in there... The
> sg_address() crap needs to go. And kfree() on ptr + offset, irk. And I
> hope you are not dma'ing into these pages. Missing checks for kmalloc()
> failure. Etc.
> 
> Someone needs to read up on DMA-mapping.txt and fix it.  Clean these
> bits up and I bet it works.


I think you're partly right but wrong where it matters.

	1. No, the sg entry was DMA-mapped previously and then unmapped.
The problem Nicholas encountered is an oops that occurs later during a
memcpy, not during a DMA operation.  Apparently the page is not locked in
memory.

	2. Yes, sddr09 is ugly.  I don't mind agreeing with that since I 
didn't write it :-)

	3. What's wrong with sg_adress()?  I notice that bio_data() in 
include/linux/bio.h is practically the same.

	4. What's wrong with kfree() on ptr + offset?  As long as that 
points to the same address as was kmalloc'ed, it should work fine.

	5. We are doing DMA into those pages.  What's wrong with that?  
They were allocated using kmalloc, so there should be no difficulty in 
mapping and using them.

	6. There _are_ checks for kmalloc() failing; they just aren't 
where you would expect to see them.  The check depends on virt_to_page() 
returning 0 when passed an argument of 0.  Maybe that assumption is wrong.
Certainly it wouldn't hurt to do the checking properly.

	7. DMA-mapping isn't the issue.  The problem is that 
sddr09_read_data() -- not sddr09_read_map() as you seem to have assumed -- 
has been passed an sg entry for which page_address() returns 0.

Alan Stern


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure)
  2003-11-04 17:33     ` Alan Stern
@ 2003-11-05  8:40       ` Jens Axboe
  2003-11-05 15:47         ` Alan Stern
  0 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2003-11-05  8:40 UTC (permalink / raw)
  To: Alan Stern; +Cc: Nicolas Mailhot, USB development list, linux-kernel

On Tue, Nov 04 2003, Alan Stern wrote:
> On Tue, 4 Nov 2003, Jens Axboe wrote:
> 
> > On Sat, Nov 01 2003, Alan Stern wrote:
> > > 
> > > There's no need.  The output from that diagnostic is definitive -- this 
> > > isn't a USB problem.  It's got to be a problem higher up, maybe in the 
> > > SCSI layer but more likely in the block layer or the file system code.
> > > 
> > > Oct 31 21:12:09 rousalka kernel: usb-storage: READ_10: read page 160 pagect 8
> > > Oct 31 21:12:09 rousalka kernel: usb-storage: sddr09_read_data: use_sg = 1, sg[0].page = c18c5b40, .offset = 0, sg_address = 00000000
> > > Oct 31 21:12:09 rousalka kernel: usb-storage: Read 8 pages, from PBA 11 (LBA 5) page 0
> > > Oct 31 21:12:09 rousalka kernel: usb-storage: usb_stor_ctrl_transfer: rq=00 rqtype=41 value=0000 index=00 len=12
> > > Oct 31 21:12:09 rousalka kernel: usb-storage: Status code 0; transferred 12/12
> > > Oct 31 21:12:09 rousalka kernel: usb-storage: -- transfer complete
> > > Oct 31 21:12:09 rousalka kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 4096 bytes
> > > Oct 31 21:12:09 rousalka kernel: usb-storage: Status code 0; transferred 4096/4096
> > > Oct 31 21:12:09 rousalka kernel: usb-storage: -- transfer complete
> > > Oct 31 21:12:09 rousalka kernel: usb-storage: us_copy_to_sgbuf: buffer = f38e6000, len = 4096, content = f7e01e40, *index = 0, *offset = 0, use_sg = 1
> > > Oct 31 21:12:09 rousalka kernel: usb-storage: in loop, ptr = 00000000
> > > 
> > > The line where it says sg_address = 00000000 indicates that something has
> > > asked us to read data from the device and store it at a virtual address
> > > that isn't mapped to memory!  You can see the same thing in the last line,
> > > where ptr = 00000000.  No wonder you get an oops.
> > 
> > I don't think your analysis is correct. The sg entry has a page element
> > assigned, it just looks like someone didn't dma map the sg entry yet. I
> > had a quick peek in sddr09, lots of ugliness in there... The
> > sg_address() crap needs to go. And kfree() on ptr + offset, irk. And I
> > hope you are not dma'ing into these pages. Missing checks for kmalloc()
> > failure. Etc.
> > 
> > Someone needs to read up on DMA-mapping.txt and fix it.  Clean these
> > bits up and I bet it works.
> 
> 
> I think you're partly right but wrong where it matters.
> 
> 	1. No, the sg entry was DMA-mapped previously and then unmapped.
> The problem Nicholas encountered is an oops that occurs later during a
> memcpy, not during a DMA operation.  Apparently the page is not locked in
> memory.

Well what's the difference? Still a driver bug. Where does the dma
mapping happen?

> 	2. Yes, sddr09 is ugly.  I don't mind agreeing with that since I 
> didn't write it :-)

:)

> 	3. What's wrong with sg_adress()?  I notice that bio_data() in 
> include/linux/bio.h is practically the same.

Yeah, bio_data() is not supposed to be used either, but it also has a
comment to that effect. sg_address() is worse, because it's readding
something that 2.5 explicitly removed to encourage proper use of the pci
dma mapping api.

> 	4. What's wrong with kfree() on ptr + offset?  As long as that 
> points to the same address as was kmalloc'ed, it should work fine.

Ah you are right, there's nothing wrong with that. It's just the
implementation that looks odd. Why not do away with that kmalloc() and
virt_to_page(), and just alloc_page() explicitly? Kill the 2.4 ifdefs at
the same time.

> 	5. We are doing DMA into those pages.  What's wrong with that?  
> They were allocated using kmalloc, so there should be no difficulty in 
> mapping and using them.

Where are they mapped? Are you flushing buffers appropriately?

> 	6. There _are_ checks for kmalloc() failing; they just aren't 
> where you would expect to see them.  The check depends on virt_to_page() 
> returning 0 when passed an argument of 0.  Maybe that assumption is wrong.
> Certainly it wouldn't hurt to do the checking properly.

Irk that's even more ugly... And it's wrong, too.

> 	7. DMA-mapping isn't the issue.  The problem is that 
> sddr09_read_data() -- not sddr09_read_map() as you seem to have assumed -- 
> has been passed an sg entry for which page_address() returns 0.

But why does that happen, if ->page is set? That's the mystery, and that
heavily points to a driver bug.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure)
  2003-11-05  8:40       ` Jens Axboe
@ 2003-11-05 15:47         ` Alan Stern
  2003-11-07  8:24           ` Jens Axboe
  0 siblings, 1 reply; 15+ messages in thread
From: Alan Stern @ 2003-11-05 15:47 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Nicolas Mailhot, USB development list, linux-kernel

On Wed, 5 Nov 2003, Jens Axboe wrote:

> On Tue, Nov 04 2003, Alan Stern wrote:
> > 
> > 	1. No, the sg entry was DMA-mapped previously and then unmapped.
> > The problem Nicholas encountered is an oops that occurs later during a
> > memcpy, not during a DMA operation.  Apparently the page is not locked in
> > memory.
> 
> Well what's the difference? Still a driver bug. Where does the dma
> mapping happen?

Sorry, my answer above was wrong.  In Nicholas's particular case there was
no DMA transfer at all, only memcpy().  There are other cases that do use
DMA, of course, like the one you were looking at.


> > 	3. What's wrong with sg_adress()?  I notice that bio_data() in 
> > include/linux/bio.h is practically the same.
> 
> Yeah, bio_data() is not supposed to be used either, but it also has a
> comment to that effect. sg_address() is worse, because it's readding
> something that 2.5 explicitly removed to encourage proper use of the pci
> dma mapping api.

Okay.  The real trouble we're facing is that the driver is handed an sg 
list, but sometimes it has to copy data directly to/from the memory area 
(via memcpy() for example) rather than doing I/O -- and even when doing 
I/O it may sometimes have to use PIO rather than DMA.

Anyway, we have to be able to access the underlying memory for the sg 
buffer.  The code currently uses

	page_address(sg[i].page) + sg[i].offset

to do this.  Is this now wrong?  And if so, what should it use instead?
I don't see anything about it in DMA-mapping.txt.


> > 	4. What's wrong with kfree() on ptr + offset?  As long as that 
> > points to the same address as was kmalloc'ed, it should work fine.
> 
> Ah you are right, there's nothing wrong with that. It's just the
> implementation that looks odd. Why not do away with that kmalloc() and
> virt_to_page(), and just alloc_page() explicitly? Kill the 2.4 ifdefs at
> the same time.

Certainly the 2.4 ifdefs can go away now.  We use kmalloc() rather than
alloc_page() because we will have to examine the allocated memory after
the DMA operation is complete.


> > 	5. We are doing DMA into those pages.  What's wrong with that?  
> > They were allocated using kmalloc, so there should be no difficulty in 
> > mapping and using them.
> 
> Where are they mapped? Are you flushing buffers appropriately?

They are mapped in drivers/usb/core/usb.c:usb_buffer_map_sg().  I don't 
know what buffers you're referring to.


> > 	6. There _are_ checks for kmalloc() failing; they just aren't 
> > where you would expect to see them.  The check depends on virt_to_page() 
> > returning 0 when passed an argument of 0.  Maybe that assumption is wrong.
> > Certainly it wouldn't hurt to do the checking properly.
> 
> Irk that's even more ugly... And it's wrong, too.

I'll change it.  But first we better settle the other questions I asked 
above.


> > 	7. DMA-mapping isn't the issue.  The problem is that 
> > sddr09_read_data() -- not sddr09_read_map() as you seem to have assumed -- 
> > has been passed an sg entry for which page_address() returns 0.
> 
> But why does that happen, if ->page is set? That's the mystery, and that
> heavily points to a driver bug.

You mean, why does page_address() return 0 if ->page is set?  I don't
know.  In fact, it's not even certain that ->page _is_ set -- I didn't
think to print it out in the diagnostic patch.

In any case, it quite likely _does_ point to a driver bug.  But since
sddr09_read_data() was handed this sg entry and didn't change it, if there
is such a bug it must lie in a higher-level driver.  Maybe the scsi layer, 
maybe the block layer, maybe the memory-management system, maybe the file 
system.  That was my original point.

Alan Stern


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure)
  2003-11-05 15:47         ` Alan Stern
@ 2003-11-07  8:24           ` Jens Axboe
  2003-11-07  8:50             ` Nicolas Mailhot
                               ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Jens Axboe @ 2003-11-07  8:24 UTC (permalink / raw)
  To: Alan Stern; +Cc: Nicolas Mailhot, USB development list, linux-kernel

On Wed, Nov 05 2003, Alan Stern wrote:
> > > 	3. What's wrong with sg_adress()?  I notice that bio_data() in 
> > > include/linux/bio.h is practically the same.
> > 
> > Yeah, bio_data() is not supposed to be used either, but it also has a
> > comment to that effect. sg_address() is worse, because it's readding
> > something that 2.5 explicitly removed to encourage proper use of the pci
> > dma mapping api.
> 
> Okay.  The real trouble we're facing is that the driver is handed an sg 
> list, but sometimes it has to copy data directly to/from the memory area 
> (via memcpy() for example) rather than doing I/O -- and even when doing 
> I/O it may sometimes have to use PIO rather than DMA.
> 
> Anyway, we have to be able to access the underlying memory for the sg 
> buffer.  The code currently uses
> 
> 	page_address(sg[i].page) + sg[i].offset
> 
> to do this.  Is this now wrong?  And if so, what should it use instead?
> I don't see anything about it in DMA-mapping.txt.

No that looks alright, given you are allocating low memory pages. The
devices can probably do full 32-bit dma I bet, though. Not having looked
too closely, that will likely require invasive changes to support if you
are used to virtually mapped pages.

> > > 	4. What's wrong with kfree() on ptr + offset?  As long as that 
> > > points to the same address as was kmalloc'ed, it should work fine.
> > 
> > Ah you are right, there's nothing wrong with that. It's just the
> > implementation that looks odd. Why not do away with that kmalloc() and
> > virt_to_page(), and just alloc_page() explicitly? Kill the 2.4 ifdefs at
> > the same time.
> 
> Certainly the 2.4 ifdefs can go away now.  We use kmalloc() rather than
> alloc_page() because we will have to examine the allocated memory after
> the DMA operation is complete.

The alloc_page() would be nicer if you supported highmem io as well. It
still not a problem to access it, simply kmap it.

> > > 	5. We are doing DMA into those pages.  What's wrong with that?  
> > > They were allocated using kmalloc, so there should be no difficulty in 
> > > mapping and using them.
> > 
> > Where are they mapped? Are you flushing buffers appropriately?
> 
> They are mapped in drivers/usb/core/usb.c:usb_buffer_map_sg().  I don't 
> know what buffers you're referring to.

The sg buffers. The mapping looks ok to me, no problem there.

> > > 	7. DMA-mapping isn't the issue.  The problem is that 
> > > sddr09_read_data() -- not sddr09_read_map() as you seem to have assumed -- 
> > > has been passed an sg entry for which page_address() returns 0.
> > 
> > But why does that happen, if ->page is set? That's the mystery, and that
> > heavily points to a driver bug.
> 
> You mean, why does page_address() return 0 if ->page is set?  I don't
> know.  In fact, it's not even certain that ->page _is_ set -- I didn't
> think to print it out in the diagnostic patch.

I've lost the original mail in the thread, but I'm quite sure it listed
sg[0].page as being valid. Maybe the driver cleared it somewhere too
early? page_address() will not have returned NULL for a valid page.
Unless it's a highmem page, in that case you have to map it and unmap
after use. For drivers like this that aren't performance critical and
are by no means highmem ready, it's easier just to set your dma mask low
enough that you wont be handed such pages.

> In any case, it quite likely _does_ point to a driver bug.  But since
> sddr09_read_data() was handed this sg entry and didn't change it, if there
> is such a bug it must lie in a higher-level driver.  Maybe the scsi layer, 
> maybe the block layer, maybe the memory-management system, maybe the file 
> system.  That was my original point.

Well, the sg entry looks perfectly valid. And that was my original
point :-). And that is why I said it looks like a driver bug, not in
upper layers. How much memory did the system that crashed have? If the
system has highmem, try testing with scsi_calculate_bounce_limit()
unconditionally returning BLK_BOUNCE_HIGH.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure)
  2003-11-07  8:24           ` Jens Axboe
@ 2003-11-07  8:50             ` Nicolas Mailhot
  2003-11-07  9:09               ` Jens Axboe
  2003-11-07 15:48             ` Alan Stern
  2003-11-07 18:56             ` [linux-usb-devel] " David Brownell
  2 siblings, 1 reply; 15+ messages in thread
From: Nicolas Mailhot @ 2003-11-07  8:50 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Alan Stern, USB development list, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 906 bytes --]

Le ven 07/11/2003 à 09:24, Jens Axboe a écrit :
> On Wed, Nov 05 2003, Alan Stern wrote:

> > In any case, it quite likely _does_ point to a driver bug.  But since
> > sddr09_read_data() was handed this sg entry and didn't change it, if there
> > is such a bug it must lie in a higher-level driver.  Maybe the scsi layer, 
> > maybe the block layer, maybe the memory-management system, maybe the file 
> > system.  That was my original point.
> 
> Well, the sg entry looks perfectly valid. And that was my original
> point :-). And that is why I said it looks like a driver bug, not in
> upper layers. How much memory did the system that crashed have? If the
> system has highmem, try testing with scsi_calculate_bounce_limit()
> unconditionally returning BLK_BOUNCE_HIGH.

The system has 1 GiB of memory, ie just enough to make stuff like
radeonfb fail

Cheers,

-- 
Nicolas Mailhot

[-- Attachment #2: Ceci est une partie de message numériquement signée. --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure)
  2003-11-07  8:50             ` Nicolas Mailhot
@ 2003-11-07  9:09               ` Jens Axboe
  2003-11-07  9:25                 ` Nicolas Mailhot
  0 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2003-11-07  9:09 UTC (permalink / raw)
  To: Nicolas Mailhot; +Cc: Alan Stern, USB development list, linux-kernel

On Fri, Nov 07 2003, Nicolas Mailhot wrote:
> Le ven 07/11/2003 à 09:24, Jens Axboe a écrit :
> > On Wed, Nov 05 2003, Alan Stern wrote:
> 
> > > In any case, it quite likely _does_ point to a driver bug.  But since
> > > sddr09_read_data() was handed this sg entry and didn't change it, if there
> > > is such a bug it must lie in a higher-level driver.  Maybe the scsi layer, 
> > > maybe the block layer, maybe the memory-management system, maybe the file 
> > > system.  That was my original point.
> > 
> > Well, the sg entry looks perfectly valid. And that was my original
> > point :-). And that is why I said it looks like a driver bug, not in
> > upper layers. How much memory did the system that crashed have? If the
> > system has highmem, try testing with scsi_calculate_bounce_limit()
> > unconditionally returning BLK_BOUNCE_HIGH.
> 
> The system has 1 GiB of memory, ie just enough to make stuff like
> radeonfb fail

Try with this debug patch then, does it work now?

===== drivers/scsi/scsi_lib.c 1.77 vs edited =====
--- 1.77/drivers/scsi/scsi_lib.c	Tue Oct 14 09:28:06 2003
+++ edited/drivers/scsi/scsi_lib.c	Fri Nov  7 10:08:52 2003
@@ -1215,6 +1215,7 @@
 
 u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
 {
+#if 0
 	struct device *host_dev;
 
 	if (shost->unchecked_isa_dma)
@@ -1229,6 +1230,9 @@
 	 * hardware have no practical limit.
 	 */
 	return BLK_BOUNCE_ANY;
+#else
+	return BLK_BOUNCE_HIGH;
+#endif
 }
 
 struct request_queue *scsi_alloc_queue(struct scsi_device *sdev)

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure)
  2003-11-07  9:09               ` Jens Axboe
@ 2003-11-07  9:25                 ` Nicolas Mailhot
  2003-11-07 21:02                   ` Nicolas Mailhot
  0 siblings, 1 reply; 15+ messages in thread
From: Nicolas Mailhot @ 2003-11-07  9:25 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Alan Stern, USB development list, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1836 bytes --]

Le ven 07/11/2003 à 10:09, Jens Axboe a écrit :
> On Fri, Nov 07 2003, Nicolas Mailhot wrote:
> > Le ven 07/11/2003 à 09:24, Jens Axboe a écrit :
> > > On Wed, Nov 05 2003, Alan Stern wrote:
> > 
> > > > In any case, it quite likely _does_ point to a driver bug.  But since
> > > > sddr09_read_data() was handed this sg entry and didn't change it, if there
> > > > is such a bug it must lie in a higher-level driver.  Maybe the scsi layer, 
> > > > maybe the block layer, maybe the memory-management system, maybe the file 
> > > > system.  That was my original point.
> > > 
> > > Well, the sg entry looks perfectly valid. And that was my original
> > > point :-). And that is why I said it looks like a driver bug, not in
> > > upper layers. How much memory did the system that crashed have? If the
> > > system has highmem, try testing with scsi_calculate_bounce_limit()
> > > unconditionally returning BLK_BOUNCE_HIGH.
> > 
> > The system has 1 GiB of memory, ie just enough to make stuff like
> > radeonfb fail
> 
> Try with this debug patch then, does it work now?
> 
> ===== drivers/scsi/scsi_lib.c 1.77 vs edited =====
> --- 1.77/drivers/scsi/scsi_lib.c	Tue Oct 14 09:28:06 2003
> +++ edited/drivers/scsi/scsi_lib.c	Fri Nov  7 10:08:52 2003
> @@ -1215,6 +1215,7 @@
>  
>  u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
>  {
> +#if 0
>  	struct device *host_dev;
>  
>  	if (shost->unchecked_isa_dma)
> @@ -1229,6 +1230,9 @@
>  	 * hardware have no practical limit.
>  	 */
>  	return BLK_BOUNCE_ANY;
> +#else
> +	return BLK_BOUNCE_HIGH;
> +#endif
>  }
>  
>  struct request_queue *scsi_alloc_queue(struct scsi_device *sdev)

Will try this evening when I have physical access to the system. (It's
difficult to plug a USB device via ssh;)

Cheers,

-- 
Nicolas Mailhot

[-- Attachment #2: Ceci est une partie de message numériquement signée. --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure)
  2003-11-07  8:24           ` Jens Axboe
  2003-11-07  8:50             ` Nicolas Mailhot
@ 2003-11-07 15:48             ` Alan Stern
  2003-11-07 21:22               ` Jens Axboe
  2003-11-07 18:56             ` [linux-usb-devel] " David Brownell
  2 siblings, 1 reply; 15+ messages in thread
From: Alan Stern @ 2003-11-07 15:48 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Nicolas Mailhot, USB development list, linux-kernel

On Fri, 7 Nov 2003, Jens Axboe wrote:

> I've lost the original mail in the thread, but I'm quite sure it listed
> sg[0].page as being valid. Maybe the driver cleared it somewhere too
> early? page_address() will not have returned NULL for a valid page.
> Unless it's a highmem page, in that case you have to map it and unmap
> after use. For drivers like this that aren't performance critical and
> are by no means highmem ready, it's easier just to set your dma mask low
> enough that you wont be handed such pages.

I think you may have put your finger on the problem.  Our difficulty is
that we have no control over where these sg pages are allocated; that
depends on the capabilities of the USB host controller that our device
happens to be plugged into.  Probably all the spots in the usb-storage
driver that directly access those sg pages will have to be rewritten to
take into account that they may not be located in mapped memory.

So I will need to learn the proper way of doing that.  Is it as simple as 
calling page_address(), and if the result is 0 then calling kmap() 
followed by kunmap()?

Alan Stern


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [linux-usb-devel] Re: [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure)
  2003-11-07  8:24           ` Jens Axboe
  2003-11-07  8:50             ` Nicolas Mailhot
  2003-11-07 15:48             ` Alan Stern
@ 2003-11-07 18:56             ` David Brownell
  2 siblings, 0 replies; 15+ messages in thread
From: David Brownell @ 2003-11-07 18:56 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Alan Stern, Nicolas Mailhot, USB development list, linux-kernel

Jens Axboe wrote:
> No that looks alright, given you are allocating low memory pages. The
> devices can probably do full 32-bit dma I bet, though. 

Typically ... most usb host controllers you'll see are on
PCI (OHCI, UHCI, EHCI) with no restrictions, and only some
EHCI controllers can do 64-bit DMA.  That's all visible in
the the dma_mask for each interface in the device with the
mass storage support, usually still at its "32-bit dma is ok"
pci controller default.

But it seems that most current 2.6 DMA API implementations
have some problems in those areas.  See for example:

   http://marc.theaimsgroup.com/?l=linux-kernel&m=106746453218943&w=2
   http://marc.theaimsgroup.com/?l=linux-usb-devel&m=106789996221347&w=2

That second patch is a partial workaround for the first patch
presumably not getting applied before 2.6.0-final.  Net result,
some systems with gobs of memory and no IOMMU may do needless
buffer copies during USB I/O.

Though a quick glance suggested to me that SCSI infrastructure
is consulting dma_mask directly, instead of using the DMA API
calls which do that.  I'm not sure I'd trust it to be any
more correct, given GIGO ...

- Dave


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure)
  2003-11-07  9:25                 ` Nicolas Mailhot
@ 2003-11-07 21:02                   ` Nicolas Mailhot
  2003-11-07 21:03                     ` Nicolas Mailhot
  2003-11-07 21:22                     ` Jens Axboe
  0 siblings, 2 replies; 15+ messages in thread
From: Nicolas Mailhot @ 2003-11-07 21:02 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Alan Stern, USB development list, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1173 bytes --]

Le ven 07/11/2003 à 10:25, Nicolas Mailhot a écrit :
> Le ven 07/11/2003 à 10:09, Jens Axboe a écrit :

> > Try with this debug patch then, does it work now?
> > 
> > ===== drivers/scsi/scsi_lib.c 1.77 vs edited =====
> > --- 1.77/drivers/scsi/scsi_lib.c	Tue Oct 14 09:28:06 2003
> > +++ edited/drivers/scsi/scsi_lib.c	Fri Nov  7 10:08:52 2003
> > @@ -1215,6 +1215,7 @@
> >  
> >  u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
> >  {
> > +#if 0
> >  	struct device *host_dev;
> >  
> >  	if (shost->unchecked_isa_dma)
> > @@ -1229,6 +1230,9 @@
> >  	 * hardware have no practical limit.
> >  	 */
> >  	return BLK_BOUNCE_ANY;
> > +#else
> > +	return BLK_BOUNCE_HIGH;
> > +#endif
> >  }
> >  
> >  struct request_queue *scsi_alloc_queue(struct scsi_device *sdev)
> 
> Will try this evening when I have physical access to the system. (It's
> difficult to plug a USB device via ssh;)

Well, it does work now (couldn't believe my eyes, tried three times in a
row just to be sure). Is this supposed to be a definitive fix that will
be in the next bk snapshots or should I wait for something else ?

Regards,

-- 
Nicolas Mailhot

[-- Attachment #2: Ceci est une partie de message numériquement signée. --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure)
  2003-11-07 21:02                   ` Nicolas Mailhot
@ 2003-11-07 21:03                     ` Nicolas Mailhot
  2003-11-07 21:22                     ` Jens Axboe
  1 sibling, 0 replies; 15+ messages in thread
From: Nicolas Mailhot @ 2003-11-07 21:03 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Alan Stern, USB development list, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 679 bytes --]

Le ven 07/11/2003 à 22:02, Nicolas Mailhot a écrit :
> Le ven 07/11/2003 à 10:25, Nicolas Mailhot a écrit :
> > Le ven 07/11/2003 à 10:09, Jens Axboe a écrit :
> 
> > > Try with this debug patch then, does it work now?

[...]

> > Will try this evening when I have physical access to the system. (It's
> > difficult to plug a USB device via ssh;)
> 
> Well, it does work now (couldn't believe my eyes, tried three times in a
> row just to be sure). Is this supposed to be a definitive fix that will
> be in the next bk snapshots or should I wait for something else ?

Thanks a lot everyone btw - I'm so shocked I'm forgetting my manners;)

-- 
Nicolas Mailhot

[-- Attachment #2: Ceci est une partie de message numériquement signée. --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure)
  2003-11-07 15:48             ` Alan Stern
@ 2003-11-07 21:22               ` Jens Axboe
  0 siblings, 0 replies; 15+ messages in thread
From: Jens Axboe @ 2003-11-07 21:22 UTC (permalink / raw)
  To: Alan Stern; +Cc: Nicolas Mailhot, USB development list, linux-kernel

On Fri, Nov 07 2003, Alan Stern wrote:
> On Fri, 7 Nov 2003, Jens Axboe wrote:
> 
> > I've lost the original mail in the thread, but I'm quite sure it listed
> > sg[0].page as being valid. Maybe the driver cleared it somewhere too
> > early? page_address() will not have returned NULL for a valid page.
> > Unless it's a highmem page, in that case you have to map it and unmap
> > after use. For drivers like this that aren't performance critical and
> > are by no means highmem ready, it's easier just to set your dma mask low
> > enough that you wont be handed such pages.
> 
> I think you may have put your finger on the problem.  Our difficulty is
> that we have no control over where these sg pages are allocated; that
> depends on the capabilities of the USB host controller that our device
> happens to be plugged into.  Probably all the spots in the usb-storage
> driver that directly access those sg pages will have to be rewritten to
> take into account that they may not be located in mapped memory.
> 
> So I will need to learn the proper way of doing that.  Is it as simple as 
> calling page_address(), and if the result is 0 then calling kmap() 
> followed by kunmap()?

Forget page_address(), you should never use it. _Always_ kmap the page
(or kmap_atomic() in irq context) and it will do the right thing
regardless.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure)
  2003-11-07 21:02                   ` Nicolas Mailhot
  2003-11-07 21:03                     ` Nicolas Mailhot
@ 2003-11-07 21:22                     ` Jens Axboe
  1 sibling, 0 replies; 15+ messages in thread
From: Jens Axboe @ 2003-11-07 21:22 UTC (permalink / raw)
  To: Nicolas Mailhot; +Cc: Alan Stern, USB development list, linux-kernel

On Fri, Nov 07 2003, Nicolas Mailhot wrote:
> Le ven 07/11/2003 à 10:25, Nicolas Mailhot a écrit :
> > Le ven 07/11/2003 à 10:09, Jens Axboe a écrit :
> 
> > > Try with this debug patch then, does it work now?
> > > 
> > > ===== drivers/scsi/scsi_lib.c 1.77 vs edited =====
> > > --- 1.77/drivers/scsi/scsi_lib.c	Tue Oct 14 09:28:06 2003
> > > +++ edited/drivers/scsi/scsi_lib.c	Fri Nov  7 10:08:52 2003
> > > @@ -1215,6 +1215,7 @@
> > >  
> > >  u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
> > >  {
> > > +#if 0
> > >  	struct device *host_dev;
> > >  
> > >  	if (shost->unchecked_isa_dma)
> > > @@ -1229,6 +1230,9 @@
> > >  	 * hardware have no practical limit.
> > >  	 */
> > >  	return BLK_BOUNCE_ANY;
> > > +#else
> > > +	return BLK_BOUNCE_HIGH;
> > > +#endif
> > >  }
> > >  
> > >  struct request_queue *scsi_alloc_queue(struct scsi_device *sdev)
> > 
> > Will try this evening when I have physical access to the system. (It's
> > difficult to plug a USB device via ssh;)
> 
> Well, it does work now (couldn't believe my eyes, tried three times in a
> row just to be sure). Is this supposed to be a definitive fix that will
> be in the next bk snapshots or should I wait for something else ?

No it's not a definitive fix. It was just a test - if it works with the
patch applied, it confirms the theory that usb storage is broken wrt
highmem.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2003-11-08  0:33 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1067633171.3886.1.camel@m70.net81-64-235.noos.fr>
2003-11-01 15:47 ` [Bug 1412] Copy from USB1 CF/SM reader stalls, no actual content is read (only directory structure) Alan Stern
2003-11-04  7:49   ` Jens Axboe
2003-11-04 17:33     ` Alan Stern
2003-11-05  8:40       ` Jens Axboe
2003-11-05 15:47         ` Alan Stern
2003-11-07  8:24           ` Jens Axboe
2003-11-07  8:50             ` Nicolas Mailhot
2003-11-07  9:09               ` Jens Axboe
2003-11-07  9:25                 ` Nicolas Mailhot
2003-11-07 21:02                   ` Nicolas Mailhot
2003-11-07 21:03                     ` Nicolas Mailhot
2003-11-07 21:22                     ` Jens Axboe
2003-11-07 15:48             ` Alan Stern
2003-11-07 21:22               ` Jens Axboe
2003-11-07 18:56             ` [linux-usb-devel] " David Brownell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).