linux-kernel.vger.kernel.org archive mirror
* Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
@ 2001-01-20 18:00 Manfred Spraul
  2001-01-20 18:08 ` Johannes Erdfelt
  0 siblings, 1 reply; 11+ messages in thread
From: Manfred Spraul @ 2001-01-20 18:00 UTC (permalink / raw)
  To: johannes, linux-kernel

> 
> TD's are around 32 bytes big (actually, they may be 48 or even 64 now, I 
> haven't checked recently). That's a waste of space for an entire page. 
> 
> However, having every driver implement its own slab cache seems a
> complete waste of time when we already have the code to do so in 
> mm/slab.c. It would be nice if we could extend the generic slab code to 
> understand the PCI DMA API for us. 
>
I missed the beginning of the thread:

What are the exact requirements for TD's?
I have 3 tiny updates for mm/slab.c that I'll send to Linus as soon as
2.4 has stabilized a bit more, perhaps I can integrate the code for USB.

--
	Manfred
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/


* Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
  2001-01-20 18:00 Inefficient PCI DMA usage (was: [experimental patch] UHCI updates) Manfred Spraul
@ 2001-01-20 18:08 ` Johannes Erdfelt
  2001-01-20 23:15   ` Russell King
  0 siblings, 1 reply; 11+ messages in thread
From: Johannes Erdfelt @ 2001-01-20 18:08 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel

On Sat, Jan 20, 2001, Manfred Spraul <manfred@colorfullife.com> wrote:
> > 
> > TD's are around 32 bytes big (actually, they may be 48 or even 64 now, I 
> > haven't checked recently). That's a waste of space for an entire page. 
> > 
> > However, having every driver implement its own slab cache seems a
> > complete waste of time when we already have the code to do so in 
> > mm/slab.c. It would be nice if we could extend the generic slab code to 
> > understand the PCI DMA API for us. 
> >
> I missed the beginning of the thread:
> 
> What are the exact requirements for TD's?
> I have 3 tiny updates for mm/slab.c that I'll send to Linus as soon as
> 2.4 has stabilized a bit more, perhaps I can integrate the code for USB.

They need to be visible via DMA. They need to be 16 byte aligned. We
also have QH's which have similar requirements, but we don't use as many
of them.

JE


* Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
  2001-01-20 18:08 ` Johannes Erdfelt
@ 2001-01-20 23:15   ` Russell King
  2001-01-21  8:36     ` Manfred Spraul
  2001-01-21 17:37     ` Johannes Erdfelt
  0 siblings, 2 replies; 11+ messages in thread
From: Russell King @ 2001-01-20 23:15 UTC (permalink / raw)
  To: Johannes Erdfelt; +Cc: Manfred Spraul, linux-kernel

Johannes Erdfelt writes:
> They need to be visible via DMA. They need to be 16 byte aligned. We
> also have QH's which have similar requirements, but we don't use as many
> of them.

Can we get away from the "16 byte aligned" and make it "n byte aligned"?
I believe that slab already has support for this?
   _____
  |_____| ------------------------------------------------- ---+---+-
  |   |         Russell King        rmk@arm.linux.org.uk      --- ---
  | | | | http://www.arm.linux.org.uk/personal/aboutme.html   /  /  |
  | +-+-+                                                     --- -+-
  /   |               THE developer of ARM Linux              |+| /|\
 /  | | |                                                     ---  |
    +-+-+ -------------------------------------------------  /\\\  |

* Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
  2001-01-20 23:15   ` Russell King
@ 2001-01-21  8:36     ` Manfred Spraul
  2001-01-21 10:51       ` Russell King
  2001-01-21 17:37     ` Johannes Erdfelt
  1 sibling, 1 reply; 11+ messages in thread
From: Manfred Spraul @ 2001-01-21  8:36 UTC (permalink / raw)
  To: Russell King; +Cc: Johannes Erdfelt, linux-kernel

Russell King wrote:
> 
> Johannes Erdfelt writes:
> > They need to be visible via DMA. They need to be 16 byte aligned. We
> > also have QH's which have similar requirements, but we don't use as many
> > of them.
> 
> Can we get away from the "16 byte aligned" and make it "n byte aligned"?
> I believe that slab already has support for this?
>

Not yet, but that would be a 2 line patch (currently it's hardcoded to
BYTES_PER_WORD align or L1_CACHE_BYTES, depending on the HWCACHE_ALIGN
flag).
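
The generalisation discussed here amounts to rounding each object's offset up to a caller-chosen power-of-two boundary instead of one of two fixed constants. A minimal userspace sketch of that rounding, assuming hypothetical helper names (is_pow2, align_up, is_aligned) that are not taken from mm/slab.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if align is a non-zero power of two. */
static int is_pow2(size_t align)
{
    return align != 0 && (align & (align - 1)) == 0;
}

/* Round value up to the next multiple of align (a power of two). */
static size_t align_up(size_t value, size_t align)
{
    assert(is_pow2(align));
    return (value + align - 1) & ~(align - 1);
}

/* Check whether an address already sits on an align-byte boundary. */
static int is_aligned(uintptr_t addr, size_t align)
{
    assert(is_pow2(align));
    return (addr & (align - 1)) == 0;
}
```

With align set to 16 this covers the UHCI case; with 1024 it covers the "1024 bytes aligned to 1024 bytes" case raised for ARM.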

But there are 2 other problems:
* kmem_cache_alloc returns one pointer, pci_alloc_consistent 2 pointers:
one dma address, one virtual address.
* The code relies on the virt_to_page() macro.

The second problem is the difficult one, I don't see how I could remove
that dependency without a major overhaul.
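
The first problem (one allocation, two addresses) can be illustrated outside the kernel. The sketch below is an assumption-laden userspace model, not slab or PCI DMA code: it carves objects out of one contiguous region whose CPU and bus base addresses are both known, so the bus address of any object is just the bus base plus the object's offset. The names dma_region, region_alloc and dma_addr_sketch_t are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t dma_addr_sketch_t;   /* stand-in for a bus address type */

/* One contiguous region, known by both its CPU and its bus address.
 * Both base addresses are assumed to be aligned at least as strictly
 * as any alignment later requested (e.g. page aligned). */
struct dma_region {
    unsigned char    *cpu_base;   /* what the driver dereferences */
    dma_addr_sketch_t bus_base;   /* what the device is told */
    size_t            size;       /* total bytes in the region */
    size_t            next;       /* offset of the first unused byte */
};

/* Bump-allocate size bytes at an align-byte boundary (power of two).
 * Returns the CPU pointer and stores the matching bus address in
 * *handle, or returns NULL if the region is exhausted. */
static void *region_alloc(struct dma_region *r, size_t size, size_t align,
                          dma_addr_sketch_t *handle)
{
    size_t off;

    assert(align != 0 && (align & (align - 1)) == 0);
    off = (r->next + align - 1) & ~(align - 1);
    if (off > r->size || size > r->size - off)
        return NULL;
    r->next = off + size;
    *handle = r->bus_base + (dma_addr_sketch_t)off;
    return r->cpu_base + off;
}
```

This sketch never frees objects and takes no locks; it only shows how a single allocator can hand back both addresses. It says nothing about the virt_to_page() dependency, which is the harder of the two problems.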

--
	Manfred

* Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
  2001-01-21  8:36     ` Manfred Spraul
@ 2001-01-21 10:51       ` Russell King
  2001-01-21 11:49         ` Manfred Spraul
  0 siblings, 1 reply; 11+ messages in thread
From: Russell King @ 2001-01-21 10:51 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: Johannes Erdfelt, linux-kernel

Manfred Spraul writes:
> Not yet, but that would be a 2 line patch (currently it's hardcoded to
> BYTES_PER_WORD align or L1_CACHE_BYTES, depending on the HWCACHE_ALIGN
> flag).

I don't think there's a problem then.  However, if slab can be told "I want
1024 bytes aligned to 1024 bytes" then I can get rid of
arch/arm/mm/small_page.c (separate problem to the one we're discussing
though) ;)

> But there are 2 other problems:
> * kmem_cache_alloc returns one pointer, pci_alloc_consistent 2 pointers:
> one dma address, one virtual address.
> * The code relies on the virt_to_page() macro.

What I'm wondering about is a wrapper around the slab allocator, in
a similar way to how pci_alloc_consistent() is a wrapper around gfp.
Since the slab allocator returns "pointers" in the same space as gfp
returns page references, there shouldn't be a problem (Linus may
complain here).

ie, we could make pci_alloc_consistent() a little more intelligent and
allocate from the slab for small sizes, but use gfp for larger sizes?
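
The size-based split suggested here can be sketched as a pure decision function. Everything in it is an assumption made for illustration: the 4096-byte page size, the half-page cut-over, the 32-byte minimum bucket, and the names plan_alloc, BACKEND_SLAB and BACKEND_PAGES. It models only the choice of backend, not the allocation itself.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u                   /* assumed page size */
#define SKETCH_SMALL_MAX (SKETCH_PAGE_SIZE / 2)  /* assumed cut-over point */

enum backend { BACKEND_SLAB, BACKEND_PAGES };

struct alloc_plan {
    enum backend backend;   /* which allocator would serve the request */
    size_t       reserved;  /* bytes actually set aside for it */
};

/* Small requests go to a power-of-two bucket (minimum 32 bytes);
 * larger ones are rounded up to whole pages. */
static struct alloc_plan plan_alloc(size_t size)
{
    struct alloc_plan p;

    if (size <= SKETCH_SMALL_MAX) {
        size_t bucket = 32;

        while (bucket < size)
            bucket <<= 1;
        p.backend = BACKEND_SLAB;
        p.reserved = bucket;
    } else {
        size_t pages = (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;

        p.backend = BACKEND_PAGES;
        p.reserved = pages * SKETCH_PAGE_SIZE;
    }
    return p;
}
```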

Comments, anyone (DaveM, Linus, et al) ?
   _____
  |_____| ------------------------------------------------- ---+---+-
  |   |         Russell King        rmk@arm.linux.org.uk      --- ---
  | | | | http://www.arm.linux.org.uk/personal/aboutme.html   /  /  |
  | +-+-+                                                     --- -+-
  /   |               THE developer of ARM Linux              |+| /|\
 /  | | |                                                     ---  |
    +-+-+ -------------------------------------------------  /\\\  |

* Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
  2001-01-21 10:51       ` Russell King
@ 2001-01-21 11:49         ` Manfred Spraul
  0 siblings, 0 replies; 11+ messages in thread
From: Manfred Spraul @ 2001-01-21 11:49 UTC (permalink / raw)
  To: Russell King; +Cc: Johannes Erdfelt, linux-kernel

Russell King wrote:
> 
> Manfred Spraul writes:
> > Not yet, but that would be a 2 line patch (currently it's hardcoded to
> > BYTES_PER_WORD align or L1_CACHE_BYTES, depending on the HWCACHE_ALIGN
> > flag).
> 
> I don't think there's a problem then.  However, if slab can be told "I want
> 1024 bytes aligned to 1024 bytes" then I can get rid of
> arch/arm/mm/small_page.c (separate problem to the one we're discussing
> though) ;)
> 

That's easy, I'll include it in my next slab update.

--
	Manfred

* Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
  2001-01-20 23:15   ` Russell King
  2001-01-21  8:36     ` Manfred Spraul
@ 2001-01-21 17:37     ` Johannes Erdfelt
  2001-01-21 23:11       ` Russell King
  1 sibling, 1 reply; 11+ messages in thread
From: Johannes Erdfelt @ 2001-01-21 17:37 UTC (permalink / raw)
  To: Russell King; +Cc: Manfred Spraul, linux-kernel

On Sat, Jan 20, 2001, Russell King <rmk@arm.linux.org.uk> wrote:
> Johannes Erdfelt writes:
> > They need to be visible via DMA. They need to be 16 byte aligned. We
> > also have QH's which have similar requirements, but we don't use as many
> > of them.
> 
> Can we get away from the "16 byte aligned" and make it "n byte aligned"?
> I believe that slab already has support for this?

If you look at the part of the message that I quoted and you cut off,
the requirements for UHCI are the data structures MUST be 16 byte aligned.

I don't mind if the API is more generalized, but those are the
requirements that were asked about in this specific case.

JE


* Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
  2001-01-21 17:37     ` Johannes Erdfelt
@ 2001-01-21 23:11       ` Russell King
  0 siblings, 0 replies; 11+ messages in thread
From: Russell King @ 2001-01-21 23:11 UTC (permalink / raw)
  To: Johannes Erdfelt; +Cc: Manfred Spraul, linux-kernel

Hi,

You may have already found out that there's a problem with using
pci_alloc_consistent and friends in the USB layer, one that will
only be obvious on CPUs that need to do page table remapping:
pci_alloc_consistent/pci_free_consistent aren't guaranteed to be
interrupt-safe.

I'm not sure what the correct way around this is yet, but I do
know it's a major problem. ;(

Maybe we need to do a get_free_pages-type thing with this and
keep a set amount of consistent area in reserve for atomic
allocations (as per GFP_ATOMIC)?  Yes, I know it's not nice, but
I don't see any other option at the moment with USB.
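
The "keep some in reserve" idea can be modelled in userspace as a preallocated set of slots: the setup path does all the expensive work once, and the atomic path only pops from a free list. This is a sketch under stated assumptions, not kernel code; atomic_reserve, reserve_init, reserve_get, reserve_put, the slot count and the slot size are all hypothetical, and a real implementation would also need locking against concurrent interrupt handlers.

```c
#include <stddef.h>

#define RESERVE_SLOTS 4    /* assumed depth of the reserve */
#define SLOT_SIZE     64   /* assumed object size in bytes */

struct atomic_reserve {
    unsigned char slots[RESERVE_SLOTS][SLOT_SIZE]; /* preallocated storage */
    void         *free_list[RESERVE_SLOTS];        /* stack of free slots */
    int           nfree;                           /* entries on the stack */
};

/* Setup path: may be slow, runs once outside interrupt context. */
static void reserve_init(struct atomic_reserve *r)
{
    int i;

    for (i = 0; i < RESERVE_SLOTS; i++)
        r->free_list[i] = r->slots[i];
    r->nfree = RESERVE_SLOTS;
}

/* Atomic path: pops a slot or fails; never remaps, never sleeps. */
static void *reserve_get(struct atomic_reserve *r)
{
    return r->nfree > 0 ? r->free_list[--r->nfree] : NULL;
}

/* Return a slot to the reserve; extra puts beyond capacity are ignored. */
static void reserve_put(struct atomic_reserve *r, void *p)
{
    if (r->nfree < RESERVE_SLOTS)
        r->free_list[r->nfree++] = p;
}
```

The cost is the one noted in the message: memory sits idle in the reserve, and an atomic caller can still fail once the reserve is empty.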

(yes, I'm hacking the 2.2.18 ohci driver for my own ends to get
something up and running on one of my machines).
   _____
  |_____| ------------------------------------------------- ---+---+-
  |   |         Russell King        rmk@arm.linux.org.uk      --- ---
  | | | | http://www.arm.linux.org.uk/personal/aboutme.html   /  /  |
  | +-+-+                                                     --- -+-
  /   |               THE developer of ARM Linux              |+| /|\
 /  | | |                                                     ---  |
    +-+-+ -------------------------------------------------  /\\\  |

* Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
  2001-01-20  8:28   ` Russell King
@ 2001-01-20 17:34     ` Johannes Erdfelt
  0 siblings, 0 replies; 11+ messages in thread
From: Johannes Erdfelt @ 2001-01-20 17:34 UTC (permalink / raw)
  To: Russell King; +Cc: linux-kernel, linux-usb-devel

On Sat, Jan 20, 2001, Russell King <rmk@arm.linux.org.uk> wrote:
> Johannes Erdfelt writes:
> > On Fri, Jan 19, 2001, Miles Lane <miles@megapathdsl.net> wrote:
> > > Johannes Erdfelt wrote:
> > > 
> > > > TODO
> > > > ----
> > > > - The PCI DMA architecture is horribly inefficient on x86 and ia64. The
> > > >   result is a page is allocated for each TD. This is evil. Perhaps a slab
> > > >   cache internally? Or modify the generic slab cache to handle PCI DMA
> > > >   pages instead?
> > > 
> > > This might be the kind of thing to run past Linus when the 2.5 tree 
> > > opens up.  Are these inefficiencies necessary evils due to workarounds 
> > > for whacky bugs in BIOSen or PCI chipsets or are they due to poor 
> > > design/implementation?
> > 
> > Looks like poor design/implementation. Or perhaps it was designed for
> > another reason than I want to use it for.
> 
> Why?  What are you trying to do?  Allocate one area per small structure?
> Why not allocate one big area and allocate from that (like the tulip
> drivers do for their TX and RX rings)?
> 
> I don't really know what you're trying to do/what the problem is because
> there isn't enough context left in the original mail above, and I have
> no idea whether the original mail appeared here or where I can read it.

I was hoping the context from the original TODO up there was sufficient
and it looked like it was enough.

TD's are around 32 bytes big (actually, they may be 48 or even 64 now, I
haven't checked recently). That's a waste of space for an entire page.

However, having every driver implement its own slab cache seems a
complete waste of time when we already have the code to do so in
mm/slab.c. It would be nice if we could extend the generic slab code to
understand the PCI DMA API for us.
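
To put rough numbers on the waste, the arithmetic can be sketched as below. The 4096-byte page size is an assumption (typical for x86), and the helper names are hypothetical.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size */

/* How many objects of obj_size bytes fit in one page when packed. */
static size_t objs_per_page(size_t obj_size)
{
    return obj_size ? SKETCH_PAGE_SIZE / obj_size : 0;
}

/* Bytes left unused when a whole page holds a single object. */
static size_t wasted_bytes_one_per_page(size_t obj_size)
{
    return obj_size <= SKETCH_PAGE_SIZE ? SKETCH_PAGE_SIZE - obj_size : 0;
}
```

Under that assumption a 32-byte TD leaves 4064 bytes of its page unused, while packing would fit 128 TDs into the same page (85 at 48 bytes, 64 at 64 bytes).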

> > I should also check architectures other than x86 and ia64.
> 
> This is an absolute must.

Not really. The 2 interesting architectures are x86 and ia64 since
that's where you commonly see UHCI controllers. While you can add UHCI
controllers to most any other architecture which has PCI, you usually
see OHCI on those systems.

I was curious to see if any other architectures implemented it
differently and I was just expecting too much out of the API. You pretty
much confirmed my suspicions when you suggested doing what the tulip
driver does.

JE


* Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
  2001-01-20  5:38 ` Johannes Erdfelt
@ 2001-01-20  8:28   ` Russell King
  2001-01-20 17:34     ` Johannes Erdfelt
  0 siblings, 1 reply; 11+ messages in thread
From: Russell King @ 2001-01-20  8:28 UTC (permalink / raw)
  To: Johannes Erdfelt; +Cc: linux-kernel

Johannes Erdfelt writes:
> On Fri, Jan 19, 2001, Miles Lane <miles@megapathdsl.net> wrote:
> > Johannes Erdfelt wrote:
> > 
> > > TODO
> > > ----
> > > - The PCI DMA architecture is horribly inefficient on x86 and ia64. The
> > >   result is a page is allocated for each TD. This is evil. Perhaps a slab
> > >   cache internally? Or modify the generic slab cache to handle PCI DMA
> > >   pages instead?
> > 
> > This might be the kind of thing to run past Linus when the 2.5 tree 
> > opens up.  Are these inefficiencies necessary evils due to workarounds 
> > for whacky bugs in BIOSen or PCI chipsets or are they due to poor 
> > design/implementation?
> 
> Looks like poor design/implementation. Or perhaps it was designed for
> another reason than I want to use it for.

Why?  What are you trying to do?  Allocate one area per small structure?
Why not allocate one big area and allocate from that (like the tulip
drivers do for their TX and RX rings)?

I don't really know what you're trying to do/what the problem is because
there isn't enough context left in the original mail above, and I have
no idea whether the original mail appeared here or where I can read it.

> I should also check architectures other than x86 and ia64.

This is an absolute must.
   _____
  |_____| ------------------------------------------------- ---+---+-
  |   |         Russell King        rmk@arm.linux.org.uk      --- ---
  | | | | http://www.arm.linux.org.uk/personal/aboutme.html   /  /  |
  | +-+-+                                                     --- -+-
  /   |               THE developer of ARM Linux              |+| /|\
 /  | | |                                                     ---  |
    +-+-+ -------------------------------------------------  /\\\  |

* Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
       [not found] <3A691043.F18CA6CA@megapathdsl.net>
@ 2001-01-20  5:38 ` Johannes Erdfelt
  2001-01-20  8:28   ` Russell King
  0 siblings, 1 reply; 11+ messages in thread
From: Johannes Erdfelt @ 2001-01-20  5:38 UTC (permalink / raw)
  To: linux-kernel

On Fri, Jan 19, 2001, Miles Lane <miles@megapathdsl.net> wrote:
> Johannes Erdfelt wrote:
> 
> > TODO
> > ----
> > - The PCI DMA architecture is horribly inefficient on x86 and ia64. The
> >   result is a page is allocated for each TD. This is evil. Perhaps a slab
> >   cache internally? Or modify the generic slab cache to handle PCI DMA
> >   pages instead?
> 
> This might be the kind of thing to run past Linus when the 2.5 tree 
> opens up.  Are these inefficiencies necessary evils due to workarounds 
> for whacky bugs in BIOSen or PCI chipsets or are they due to poor 
> design/implementation?

Looks like poor design/implementation. Or perhaps it was designed for
another reason than I want to use it for.

2.5 is probably where any core changes will happen, if any. But for now
I suspect I'll need to work around it in my driver.

I should also check architectures other than x86 and ia64.

JE


Thread overview: 11+ messages
2001-01-20 18:00 Inefficient PCI DMA usage (was: [experimental patch] UHCI updates) Manfred Spraul
2001-01-20 18:08 ` Johannes Erdfelt
2001-01-20 23:15   ` Russell King
2001-01-21  8:36     ` Manfred Spraul
2001-01-21 10:51       ` Russell King
2001-01-21 11:49         ` Manfred Spraul
2001-01-21 17:37     ` Johannes Erdfelt
2001-01-21 23:11       ` Russell King
     [not found] <3A691043.F18CA6CA@megapathdsl.net>
2001-01-20  5:38 ` Johannes Erdfelt
2001-01-20  8:28   ` Russell King
2001-01-20 17:34     ` Johannes Erdfelt
