linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [parisc-linux] Untested port of parisc_device to generic device interface
@ 2002-11-09  4:51 Adam J. Richter
  2002-11-09  5:21 ` Matthew Wilcox
  2002-11-09 18:04 ` Grant Grundler
  0 siblings, 2 replies; 71+ messages in thread
From: Adam J. Richter @ 2002-11-09  4:51 UTC (permalink / raw)
  To: willy; +Cc: andmike, hch, James.Bottomley, linux-kernel, mochel, parisc-linux

Matthew Wilcox wrote:
>Actually I think the generic device model is crap. [...]

My patch is a net deletion of 57 lines and will allow simplification
of parisc DMA allocation.

Although I agree with most of your criticisms about the generic device
model, most of the problems with it are the way people use it (the
first thing everyone wants to do is a driverfs file system) and some
conventions that I disagree with, such as the idea that drivers that
embed struct device and struct device_driver should not initialize
those fields directly, but should have xxx_register_device copy them
in.  parisc can use the generic driver API without getting fat.

Problems specific to the generic device API can be incrementally
improved and nobody is treating it as set in stone.  I think the
generic device API is close enough already so that it's worth porting
to, even if future clean-ups will then require some small changes to
the code that is ported to it.

Please do not throw the baby out with the bath water.  The generic
driver interface in its present form really can make parisc smaller
and cleaner.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Miplitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-09  4:51 [parisc-linux] Untested port of parisc_device to generic device interface Adam J. Richter
@ 2002-11-09  5:21 ` Matthew Wilcox
  2002-11-09  6:03   ` Greg KH
  2002-11-09  7:58   ` Marc Zyngier
  2002-11-09 18:04 ` Grant Grundler
  1 sibling, 2 replies; 71+ messages in thread
From: Matthew Wilcox @ 2002-11-09  5:21 UTC (permalink / raw)
  To: Adam J. Richter
  Cc: willy, andmike, hch, James.Bottomley, linux-kernel, mochel, parisc-linux

On Fri, Nov 08, 2002 at 08:51:28PM -0800, Adam J. Richter wrote:
> My patch is a net deletion of 57 lines and will allow simplification
> of parisc DMA allocation.

57 lines of clean elegant code, replacing them with overly generic ugly
code and bloated data structures.  struct device is a round 256 bytes
on x86.  more on 64-bit architectures.

> in.  parisc can use the generic driver API without getting fat.

no.  it can't.

> Problems specific to the generic device API can be incrementally
> improved and nobody is treating it as set in stone.  I think the
> generic device API is close enough already so that it's worth porting
> to, even if future clean-ups will then require some small changes to
> the code that is ported to it.

Everyone's saying "ra!  ra!  generic device model!" without asking
what the cost is.  Don't you think it's reasonable that _as the most
common device type_, struct device should be able to support PCI in a
clean manner?  Don't you think that the fact that it fails to do so is
a problem?  Don't you look at the locks sprinkled all over the struct
device system and wonder what they're all _for_?

Don't get me wrong.  I want a generic device model.  But I think it's
clear the current one has failed to show anything more than eye candy.
Perhaps it's time to start over, with something small and sane -- maybe
kobject (it's not quite what we need, but it's close).  Put one of those
in struct pci_dev.  Remove duplicate fields.  Now maybe grow kobject a
little, or perhaps start a new struct with a kobject as its first member.

And, for gods sake, don't fuck it up by integrating it with USB too early
in the game.  Let's get it right for PCI, maybe some other internal busses
(i'm gagging to write an EISA subsystem ;-).  SCSI is more interesting
than USB.  Above all, don't fall into the trap of "It's a bus and it
has devices on it, therefore it must be a part of devicefs".

*sigh*.  halloween was a week ago.

-- 
Revolutions do not require corporate support.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-09  5:21 ` Matthew Wilcox
@ 2002-11-09  6:03   ` Greg KH
  2002-11-09 15:33     ` J.E.J. Bottomley
  2002-11-09  7:58   ` Marc Zyngier
  1 sibling, 1 reply; 71+ messages in thread
From: Greg KH @ 2002-11-09  6:03 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Adam J. Richter, andmike, hch, James.Bottomley, linux-kernel,
	mochel, parisc-linux

On Sat, Nov 09, 2002 at 05:21:50AM +0000, Matthew Wilcox wrote:
> 
> Everyone's saying "ra!  ra!  generic device model!" without asking
> what the cost is.  Don't you think it's reasonable that _as the most
> common device type_, struct device should be able to support PCI in a
> clean manner?

No I do not.

> Don't you think that the fact that it fails to do so is a problem?

Yes I do.

> Don't you look at the locks sprinkled all over the struct device
> system and wonder what they're all _for_?

Nope :)
(yes, I do wonder, and yes, they will be cleaned up...)

> Don't get me wrong.  I want a generic device model.  But I think it's
> clear the current one has failed to show anything more than eye candy.
> Perhaps it's time to start over, with something small and sane -- maybe
> kobject (it's not quite what we need, but it's close).  Put one of those
> in struct pci_dev.  Remove duplicate fields.  Now maybe grow kobject a
> little, or perhaps start a new struct with a kobject as its first member.

No, lets start pulling stuff out of pci_dev and relying on struct
device.  The reason this hasn't happened yet is no one has been willing
to break all of the PCI drivers, yet.

I know Pat is going to be doing this soon, and if he doesn't get to it,
I will.  But as Adam said, don't throw away the idea because it looks
crufty now.  This has been a _constantly_ evolving model as we work to
get it right.  It will take time, and we are still getting there.

> And, for gods sake, don't fuck it up by integrating it with USB too early
> in the game.

In my defense, USB was the _only_ bus willing to step up and try to do
the integration to work the initial kinks out.  The SCSI people are
being drug kicking and screaming into it, _finally_ now (hell, SCSI is
still not using the updated PCI interface, those people _never_ update
their drivers if they can avoid it.)

> Let's get it right for PCI, maybe some other internal busses
> (i'm gagging to write an EISA subsystem ;-).  SCSI is more interesting
> than USB.  Above all, don't fall into the trap of "It's a bus and it
> has devices on it, therefore it must be a part of devicefs".

Sure SCSI's more interesting, to you :)

By having USB be one of the first adopters (after PCI), we have found a
_lot_ of issues and bugs that happened due to devices showing up and
disappearing at odd times.  Which was _much_ easier to debug than PCI
would have been.  SCSI can't even do hotplug devices _yet_.  How would
we have debugged this stuff then?

And yes, USB belongs in the model, if for no other reason, that "it's a
bus and it has devices on it" :)

> *sigh*.  halloween was a week ago.

Patches for this stuff are going to be happening for quite some time
now, don't despair.

And they are greatly appreciated, and welcomed from everyone :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-09  5:21 ` Matthew Wilcox
  2002-11-09  6:03   ` Greg KH
@ 2002-11-09  7:58   ` Marc Zyngier
  1 sibling, 0 replies; 71+ messages in thread
From: Marc Zyngier @ 2002-11-09  7:58 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Adam J. Richter, andmike, hch, James.Bottomley, linux-kernel,
	mochel, parisc-linux

>>>>> "Matthew" == Matthew Wilcox <willy@debian.org> writes:

Matthew> (i'm gagging to write an EISA subsystem ;-).

Humm, please don't... :-)

maz@midlife-crisis:~$ ls -R /sys/bus/eisa/ 
/sys/bus/eisa/:
devices  drivers

/sys/bus/eisa/devices:
00:00  00:01  00:02

/sys/bus/eisa/drivers:
3c509

/sys/bus/eisa/drivers/3c509:
00:02
maz@midlife-crisis:~$ 

I have it working on x86 and Alpha, will test parisc and mips over the
week-end.

        M.
-- 
Places change, faces change. Life is so very strange.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device  interface
  2002-11-09  6:03   ` Greg KH
@ 2002-11-09 15:33     ` J.E.J. Bottomley
  2002-11-13  6:13       ` Greg KH
  0 siblings, 1 reply; 71+ messages in thread
From: J.E.J. Bottomley @ 2002-11-09 15:33 UTC (permalink / raw)
  To: Greg KH
  Cc: Matthew Wilcox, Adam J. Richter, andmike, hch, James.Bottomley,
	linux-kernel, mochel, parisc-linux

First off, lets remember that this discussion started over two different, but 
related, problem sets.  One was the DMA api and the other was the device model.

As far as the DMA API goes, consistent memory allocation has always annoyed me 
because there are three separate cases

1. machine is consistent, consistent allocation always succeeds (unless we're 
out of memory)

2. machine is fully inconsistent, consistent allocation always fails.

3. Machine is partially consistent.  consistent allocation may fail because 
we're out of consistent memory so we have to fall back to the old.

What I'd like is an improvement to the DMA API where drivers can advertise 
levels of conformance (I only work with consistent memory or I work correctly 
with any dma'able memory and do all the sync points), and where all the sync 
point stuff optimises out for a machine architecture which is recognisably 
fully consistent at compile time.

Ok, I'll get off my soapbox now.  I never quite recovered from the awful 
#ifdef mess that doing the above correctly for the parisc introduced into the 
53c700 driver...

As far as the device model goes:

greg@kroah.com said:
> No, lets start pulling stuff out of pci_dev and relying on struct
> device.  The reason this hasn't happened yet is no one has been
> willing to break all of the PCI drivers, yet. 

I'd like to see that.  It's always annoyed me that my MCA machines have to 
bounce highmem just because they don't have a pci_dev to put the bounce mask 
in.

> The SCSI people are being drug kicking and screaming into it,
> _finally_ now (hell, SCSI is still not using the updated PCI
> interface, those people _never_ update their drivers if they can avoid
> it.)

That't not entirely fair.  Most of the unbroken drivers in the tree (those 
with active 2.5 maintainers) are using the up to date pci/dma interface.  The 
mid layer is `sort of' using the device api.

Where I'd like to see the device model go for SCSI is:

- we have a device node for every struct scsi_device (even unattached ones)

- unattached devices are really minimal entities with as few resources 
allocated as we can get away with, so we can have lazy attachment more easily.

- on attachment, the device node gets customised by the attachment type (and 
remember, we can have more than one attachment).

- whatever the permanent `name' field for the device is going to be needs to 
be writeable from user level, that way it can eventually be determined by the 
user level and simply written there as a string (rather than having all the 
wwn fallback cruft in the mid-layer).

- Ultimately, I'd like us to dump the host/channel/target numbering scheme in 
favour of the unique device node name (we may still number them in the 
mid-layer for convenience) so that we finesse the FC mapping problems---FC 
devices can embed the necessary identification in the target strings.

- Oh, and of course, we move to a hotplug/coldplug model where the root device 
is attached in initramfs and everything else is discovered in user space from 
the boot process.

> Patches for this stuff are going to be happening for quite some time
> now, don't despair.

> And they are greatly appreciated, and welcomed from everyone :) 

As far as extending the generic device model goes, I'll do it for the MCA bus. 
 I have looked at doing it previously, but giving the MCA bus a struct pci_dev 
is a real pain because of the underlying assumptions when one of these exists 
in an x86 machine.

But, while were on the subject of sorting out the device model abstraction, 
does the `power' node entry really belong at the top level?  It serves no 
purpose for something like a legacy MCA bus.

James



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-09  4:51 [parisc-linux] Untested port of parisc_device to generic device interface Adam J. Richter
  2002-11-09  5:21 ` Matthew Wilcox
@ 2002-11-09 18:04 ` Grant Grundler
  1 sibling, 0 replies; 71+ messages in thread
From: Grant Grundler @ 2002-11-09 18:04 UTC (permalink / raw)
  To: Adam J. Richter
  Cc: andmike, hch, James.Bottomley, linux-kernel, mochel, parisc-linux

"Adam J. Richter" wrote:
> Please do not throw the baby out with the bath water.  The generic
> driver interface in its present form really can make parisc smaller
> and cleaner.

I hope that's true. I was just as disappointed as willy.

Documentation/driver-model/overview.txt:
| Note also that it is at the _end_ of struct pci_dev. This is
| to make people think about what they're doing when switching between the bus
| driver and the global driver; and to prevent against mindless casts between
| the two.

Until this changes, I don't see this as a useful replacement for
either PCI or parisc devices. The "mindless casts" can be fixed.
But without the ability to easily go from generic device type to
bus specific type, people will just get lost in the maze of pointers.

Common code needs to take a common parameter.  Operations on the tree
(eg probe) often require calls to bus specific (or arch or platform
specific) code which may in turn need to make other IO tree operations.
Those code paths convert back and forth between types regularly.
That's why I want to make it as simple as possible at the risk
a few people will get it wrong.

HPUX has had a "unified" IO tree since 10.0 in ~1994. Previous
releases had an IO tree for "Server IO" but not the PA-RISC
workstations. I've work on HPUX IO subsystem 6 years (PCI Code owner for
2 years) and it had several key features:
  (a) traverse the complete tree (from "central bus" to SCSI LUN)
      with shared code,
  (b) determine which type of bus any node was "on",
  (c) associate arbitrary local data with any node.
       (this includes bus *operations*! eg probe, dma, irq setup)

Maybe I'm not seeing it, but (b) and (c) are missing from basic
design or not well described in driver-model/overview.txt.

BTW, I couldn't find Documentation/filesystems/driverfs.txt.

Lastly, the example of an "irq" entry in overview.txt is interesting.
iosapic code "owns" the IRQ. And it could make visible other info
regarding IRQs - eg type and which CPU it's directed at.
But I get the feeling only bus specific code can do that since
it "owns" the directory. Do I misunderstand?

thanks,
grant

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-09 15:33     ` J.E.J. Bottomley
@ 2002-11-13  6:13       ` Greg KH
  2002-11-13  7:46         ` Miles Bader
  0 siblings, 1 reply; 71+ messages in thread
From: Greg KH @ 2002-11-13  6:13 UTC (permalink / raw)
  To: J.E.J. Bottomley
  Cc: Matthew Wilcox, Adam J. Richter, andmike, hch, linux-kernel,
	mochel, parisc-linux

On Sat, Nov 09, 2002 at 10:33:56AM -0500, J.E.J. Bottomley wrote:
> 
> > The SCSI people are being drug kicking and screaming into it,
> > _finally_ now (hell, SCSI is still not using the updated PCI
> > interface, those people _never_ update their drivers if they can avoid
> > it.)
> 
> That't not entirely fair.  Most of the unbroken drivers in the tree (those 
> with active 2.5 maintainers) are using the up to date pci/dma interface.  The 
> mid layer is `sort of' using the device api.

I was referring to the pci_module_init() model of PCI drivers, which, as
of 2.5.47, is only implemented in the ips, nsp32 and aic7xxx drivers.
Every other PCI SCSI controller driver will crash and burn a nasty death
if placed in a machine with a PCI hotplug controller, and someone tries
to remove it.  Hopefully someday this will be fixed... :)

> Where I'd like to see the device model go for SCSI is:
> 
> - we have a device node for every struct scsi_device (even unattached ones)
> 
> - unattached devices are really minimal entities with as few resources 
> allocated as we can get away with, so we can have lazy attachment more easily.
> 
> - on attachment, the device node gets customised by the attachment type (and 
> remember, we can have more than one attachment).
> 
> - whatever the permanent `name' field for the device is going to be needs to 
> be writeable from user level, that way it can eventually be determined by the 
> user level and simply written there as a string (rather than having all the 
> wwn fallback cruft in the mid-layer).
> 
> - Ultimately, I'd like us to dump the host/channel/target numbering scheme in 
> favour of the unique device node name (we may still number them in the 
> mid-layer for convenience) so that we finesse the FC mapping problems---FC 
> devices can embed the necessary identification in the target strings.
> 
> - Oh, and of course, we move to a hotplug/coldplug model where the root device 
> is attached in initramfs and everything else is discovered in user space from 
> the boot process.

All of that sounds very reasonable, and would be nice to see
implemented.

> > Patches for this stuff are going to be happening for quite some time
> > now, don't despair.
> 
> > And they are greatly appreciated, and welcomed from everyone :) 
> 
> As far as extending the generic device model goes, I'll do it for the MCA bus. 
>  I have looked at doing it previously, but giving the MCA bus a struct pci_dev 
> is a real pain because of the underlying assumptions when one of these exists 
> in an x86 machine.

What is the real reason for needing this, pci_alloc_consistent()?  We
have talked about renaming that to dev_alloc_consistent() in the past,
which I think will work for you, right?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-13  6:13       ` Greg KH
@ 2002-11-13  7:46         ` Miles Bader
  2002-11-13  7:52           ` Greg KH
  0 siblings, 1 reply; 71+ messages in thread
From: Miles Bader @ 2002-11-13  7:46 UTC (permalink / raw)
  To: Greg KH
  Cc: J.E.J. Bottomley, Matthew Wilcox, Adam J. Richter, andmike, hch,
	linux-kernel, mochel, parisc-linux

Greg KH <greg@kroah.com> writes:
> What is the real reason for needing this, pci_alloc_consistent()?  We
> have talked about renaming that to dev_alloc_consistent() in the past,
> which I think will work for you, right?

This this would end up [or have the ability to] invoking a bus-specific
routine at some point, right?  [so that a truly PCI-specific definition
could be still be had]

Thanks,

-Miles
-- 
[|nurgle|]  ddt- demonic? so quake will have an evil kinda setting? one that 
            will  make every christian in the world foamm at the mouth? 
[iddt]      nurg, that's the goal 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-13  7:46         ` Miles Bader
@ 2002-11-13  7:52           ` Greg KH
  2002-11-13  8:02             ` Miles Bader
                               ` (3 more replies)
  0 siblings, 4 replies; 71+ messages in thread
From: Greg KH @ 2002-11-13  7:52 UTC (permalink / raw)
  To: Miles Bader
  Cc: J.E.J. Bottomley, Matthew Wilcox, Adam J. Richter, andmike, hch,
	linux-kernel, mochel, parisc-linux

On Wed, Nov 13, 2002 at 04:46:58PM +0900, Miles Bader wrote:
> Greg KH <greg@kroah.com> writes:
> > What is the real reason for needing this, pci_alloc_consistent()?  We
> > have talked about renaming that to dev_alloc_consistent() in the past,
> > which I think will work for you, right?
> 
> This this would end up [or have the ability to] invoking a bus-specific
> routine at some point, right?  [so that a truly PCI-specific definition
> could be still be had]

If that was needed, yes, we should not break that functionality.

Are there any existing archs that need more than just dma_mask moved to
struct device out of pci_dev?  Hm, ppc might need a bit more...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-13  7:52           ` Greg KH
@ 2002-11-13  8:02             ` Miles Bader
  2002-11-13  8:10               ` Greg KH
       [not found]               ` <miles@lsi.nec.co.jp>
  2002-11-13 11:59             ` Ivan Kokshaysky
                               ` (2 subsequent siblings)
  3 siblings, 2 replies; 71+ messages in thread
From: Miles Bader @ 2002-11-13  8:02 UTC (permalink / raw)
  To: Greg KH
  Cc: J.E.J. Bottomley, Matthew Wilcox, Adam J. Richter, andmike, hch,
	linux-kernel, mochel, parisc-linux

Greg KH <greg@kroah.com> writes:
> > This this would end up [or have the ability to] invoking a bus-specific
> > routine at some point, right?  [so that a truly PCI-specific definition
> > could be still be had]
> 
> If that was needed, yes, we should not break that functionality.
> 
> Are there any existing archs that need more than just dma_mask moved to
> struct device out of pci_dev?  Hm, ppc might need a bit more...

I can't speak for `real machines,' but on my wierd embedded board,
pci_alloc_consistent allocates from a special area of memory (not
located at 0) that is the only shared memory between PCI devices and the
CPU.  pci_alloc_consistent happens to fit this situation quite well, but
I don't think a bitmask is enough to express the situation.

-Miles
-- 
Ich bin ein Virus. Mach' mit und kopiere mich in Deine .signature.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-13  8:02             ` Miles Bader
@ 2002-11-13  8:10               ` Greg KH
  2002-11-13  8:26                 ` Miles Bader
       [not found]               ` <miles@lsi.nec.co.jp>
  1 sibling, 1 reply; 71+ messages in thread
From: Greg KH @ 2002-11-13  8:10 UTC (permalink / raw)
  To: Miles Bader
  Cc: J.E.J. Bottomley, Matthew Wilcox, Adam J. Richter, andmike, hch,
	linux-kernel, mochel, parisc-linux

On Wed, Nov 13, 2002 at 05:02:39PM +0900, Miles Bader wrote:
> Greg KH <greg@kroah.com> writes:
> > > This this would end up [or have the ability to] invoking a bus-specific
> > > routine at some point, right?  [so that a truly PCI-specific definition
> > > could be still be had]
> > 
> > If that was needed, yes, we should not break that functionality.
> > 
> > Are there any existing archs that need more than just dma_mask moved to
> > struct device out of pci_dev?  Hm, ppc might need a bit more...
> 
> I can't speak for `real machines,' but on my wierd embedded board,
> pci_alloc_consistent allocates from a special area of memory (not
> located at 0) that is the only shared memory between PCI devices and the
> CPU.  pci_alloc_consistent happens to fit this situation quite well, but
> I don't think a bitmask is enough to express the situation.

What does your pci_alloc_consistent() function need from the pci_dev
structure in order to do what you need it to do?  Anything other than
the dma_mask value?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-13  8:26                 ` Miles Bader
@ 2002-11-13  8:25                   ` Greg KH
  2002-11-13  9:05                     ` Miles Bader
  0 siblings, 1 reply; 71+ messages in thread
From: Greg KH @ 2002-11-13  8:25 UTC (permalink / raw)
  To: Miles Bader
  Cc: J.E.J. Bottomley, Matthew Wilcox, Adam J. Richter, andmike, hch,
	linux-kernel, mochel, parisc-linux

On Wed, Nov 13, 2002 at 05:26:34PM +0900, Miles Bader wrote:
> Greg KH <greg@kroah.com> writes:
> > > I can't speak for `real machines,' but on my wierd embedded board,
> > > pci_alloc_consistent allocates from a special area of memory (not
> > > located at 0) that is the only shared memory between PCI devices and the
> > > CPU.  pci_alloc_consistent happens to fit this situation quite well, but
> > > I don't think a bitmask is enough to express the situation.
> > 
> > What does your pci_alloc_consistent() function need from the pci_dev
> > structure in order to do what you need it to do?  Anything other than
> > the dma_mask value?
> 
> Currently, it ignores the pci_dev argument entirely (I've never had a
> device that needed the mask, so I haven't bothered with it).  It just
> allocates a block from the special memory region and returns the result.

So merely renaming that function to dev_alloc_consistent(), changing the
first paramater to be a struct device, and proving a macro for all of
the pci drivers for the old pci_alloc_consistent() name would work just
fine for you?

greg k-h

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-13  8:10               ` Greg KH
@ 2002-11-13  8:26                 ` Miles Bader
  2002-11-13  8:25                   ` Greg KH
  0 siblings, 1 reply; 71+ messages in thread
From: Miles Bader @ 2002-11-13  8:26 UTC (permalink / raw)
  To: Greg KH
  Cc: J.E.J. Bottomley, Matthew Wilcox, Adam J. Richter, andmike, hch,
	linux-kernel, mochel, parisc-linux

Greg KH <greg@kroah.com> writes:
> > I can't speak for `real machines,' but on my wierd embedded board,
> > pci_alloc_consistent allocates from a special area of memory (not
> > located at 0) that is the only shared memory between PCI devices and the
> > CPU.  pci_alloc_consistent happens to fit this situation quite well, but
> > I don't think a bitmask is enough to express the situation.
> 
> What does your pci_alloc_consistent() function need from the pci_dev
> structure in order to do what you need it to do?  Anything other than
> the dma_mask value?

Currently, it ignores the pci_dev argument entirely (I've never had a
device that needed the mask, so I haven't bothered with it).  It just
allocates a block from the special memory region and returns the result.

-Miles
-- 
自らを空にして、心を開く時、道は開かれる

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-13  8:25                   ` Greg KH
@ 2002-11-13  9:05                     ` Miles Bader
  0 siblings, 0 replies; 71+ messages in thread
From: Miles Bader @ 2002-11-13  9:05 UTC (permalink / raw)
  To: Greg KH
  Cc: J.E.J. Bottomley, Matthew Wilcox, Adam J. Richter, andmike, hch,
	linux-kernel, mochel, parisc-linux

Greg KH <greg@kroah.com> writes:
> > Currently, it ignores the pci_dev argument entirely (I've never had a
> > device that needed the mask, so I haven't bothered with it).  It just
> > allocates a block from the special memory region and returns the result.
> 
> So merely renaming that function to dev_alloc_consistent(), changing the
> first paramater to be a struct device, and proving a macro for all of
> the pci drivers for the old pci_alloc_consistent() name would work just
> fine for you?

Except that this function doesn't make any sense except for PCI devices.

I don't know whether there will ever be any devices that (1) call
`dev_alloc_consistent', (2) aren't PCI devices, and (3) would stand a
chance of ever working on this platform -- probably not.

Never-the-less, it provides (a non-artificial) example of a case where
it's wrong to assume that all busses are the same, and I think that
merits some attention.

-Miles
-- 
97% of everything is grunge

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-13  7:52           ` Greg KH
  2002-11-13  8:02             ` Miles Bader
@ 2002-11-13 11:59             ` Ivan Kokshaysky
  2002-11-13 12:36               ` Marc Zyngier
  2002-11-13 16:32             ` Bjorn Helgaas
  2002-11-13 20:12             ` Grant Grundler
  3 siblings, 1 reply; 71+ messages in thread
From: Ivan Kokshaysky @ 2002-11-13 11:59 UTC (permalink / raw)
  To: Greg KH
  Cc: Miles Bader, J.E.J. Bottomley, Matthew Wilcox, Adam J. Richter,
	andmike, hch, linux-kernel, mochel, parisc-linux

On Tue, Nov 12, 2002 at 11:52:23PM -0800, Greg KH wrote:
> On Wed, Nov 13, 2002 at 04:46:58PM +0900, Miles Bader wrote:
> > This this would end up [or have the ability to] invoking a bus-specific
> > routine at some point, right?  [so that a truly PCI-specific definition
> > could be still be had]
> 
> If that was needed, yes, we should not break that functionality.
> 
> Are there any existing archs that need more than just dma_mask moved to
> struct device out of pci_dev?  Hm, ppc might need a bit more...

Add alpha, parisc, sparc and so on. ;-)

pci_dev->sysdata needs to be moved as well, but not only.
It seems that two things are fundamentally missing in generic
device model:
1. clean way to detect the type of container structure from arbitrary
   struct device *;
2. parent/child relationship between devices of different bus types.

Example (not exactly real life, but close enough):
to do DMA mapping properly for, say, some legacy device, I need to know
that it's sitting behind ISA-to-PCI bridge X belonging in PCI domain Y of
the root-level IO controller Z.

Ivan.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-13 11:59             ` Ivan Kokshaysky
@ 2002-11-13 12:36               ` Marc Zyngier
  0 siblings, 0 replies; 71+ messages in thread
From: Marc Zyngier @ 2002-11-13 12:36 UTC (permalink / raw)
  To: Ivan Kokshaysky
  Cc: Greg KH, Miles Bader, J.E.J. Bottomley, Matthew Wilcox,
	Adam J. Richter, andmike, hch, linux-kernel, mochel,
	parisc-linux

>>>>> "Ivan" == Ivan Kokshaysky <ink@jurassic.park.msu.ru> writes:

Ivan> It seems that two things are fundamentally missing in generic
Ivan> device model:
Ivan> 1. clean way to detect the type of container structure from arbitrary
Ivan>    struct device *;

Indeed.

I'm using the following stuff in some EISA drivers :

#ifdef CONFIG_EISA
#define DEVICE_EISA(dev) (((dev)->bus == &eisa_bus_type) ? to_eisa_device((dev)) : NULL)
#else
#define DEVICE_EISA(dev) NULL
#endif

and frankly, it's really awful. On drivers which are both EISA and
PCI (3c59x, aic7xxx), this is a major pain.

        M.
-- 
Places change, faces change. Life is so very strange.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-13  7:52           ` Greg KH
  2002-11-13  8:02             ` Miles Bader
  2002-11-13 11:59             ` Ivan Kokshaysky
@ 2002-11-13 16:32             ` Bjorn Helgaas
  2002-11-13 17:23               ` J.E.J. Bottomley
  2002-11-13 20:12             ` Grant Grundler
  3 siblings, 1 reply; 71+ messages in thread
From: Bjorn Helgaas @ 2002-11-13 16:32 UTC (permalink / raw)
  To: Greg KH, Miles Bader
  Cc: J.E.J. Bottomley, Matthew Wilcox, Adam J. Richter, andmike, hch,
	linux-kernel, mochel, parisc-linux

On Wednesday 13 November 2002 12:52 am, Greg KH wrote:
> On Wed, Nov 13, 2002 at 04:46:58PM +0900, Miles Bader wrote:
> > Greg KH <greg@kroah.com> writes:
> > > What is the real reason for needing this, pci_alloc_consistent()?  We
> > > have talked about renaming that to dev_alloc_consistent() in the past,
> > > which I think will work for you, right?
> > 
> > This this would end up [or have the ability to] invoking a bus-specific
> > routine at some point, right?  [so that a truly PCI-specific definition
> > could be still be had]
> 
> If that was needed, yes, we should not break that functionality.
> 
> Are there any existing archs that need more than just dma_mask moved to
> struct device out of pci_dev?  Hm, ppc might need a bit more...

Absolutely.  Boxes with multiple IOMMUs (at least ia64, sparc64, parisc)
need to look up the correct IOMMU with which to map the allocated buffer.
Typically this is in the pci_dev sysdata.

Bjorn


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device  interface
  2002-11-13 16:32             ` Bjorn Helgaas
@ 2002-11-13 17:23               ` J.E.J. Bottomley
  2002-11-13 20:33                 ` Grant Grundler
  0 siblings, 1 reply; 71+ messages in thread
From: J.E.J. Bottomley @ 2002-11-13 17:23 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Greg KH, Miles Bader, J.E.J. Bottomley, Matthew Wilcox,
	Adam J. Richter, andmike, hch, linux-kernel, mochel,
	parisc-linux

> Absolutely.  Boxes with multiple IOMMUs (at least ia64, sparc64,
> parisc) need to look up the correct IOMMU with which to map the
> allocated buffer. Typically this is in the pci_dev sysdata. 

Actually, I think all of the DMA mapping api needs to become bus independent 
and take a struct device * instead of a pci_dev.  How this lookup/mapping is 
done could be abstracted per architecture inside the DMA api internals.

We should also allow devices to do all the setup through bus generic 
functions, but leave open the possibility that the driver may (once it knows 
the bus type) obtain the pci_dev (or whatever) from the struct device if it 
really, really has to muck with bus specific registers.

As far as the SCSI mid layer goes, all we really need from struct device is 
the dma_mask for setting up the I/O bounce buffers.

The simplest way to do all of this is probably to add a pointer to the 
dma_mask in struct device and make it point to the same thing in pci_dev.  If 
we find we need more than this per device, it could become a pointer to a 
generic dma information structure later on.

Drivers need to advertise DMA conformance (at the moment, requires consistent 
allocation, or fully writeback/invalidate compliant)

We should also adopt Adam's pointer approach to the sync/invalidate points so 
we can treat a dma_alloc_consistent failure as a real failure and not clutter 
the code with writeback/invalidate fallbacks.

The above changes would allow me to yank all of the pci_dev specific code out 
of the scsi mid layer, and also introduce a mca_dev type, convert the 53c700 
driver to using the generic dma API and *finally* get us to the point where I 
don't have to use bounce buffers for highmem access on the MCA bus.

Since the 53c700 is also used by parisc (including some machines with 
IOMMUs---which, unfortunately, I don't have access to), it probably makes an 
ideal conversion test case.

This can probably all be wrappered so the current SCSI pci drivers don't 
notice anything wrong.

James



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-13  7:52           ` Greg KH
                               ` (2 preceding siblings ...)
  2002-11-13 16:32             ` Bjorn Helgaas
@ 2002-11-13 20:12             ` Grant Grundler
  3 siblings, 0 replies; 71+ messages in thread
From: Grant Grundler @ 2002-11-13 20:12 UTC (permalink / raw)
  To: Greg KH
  Cc: Miles Bader, J.E.J. Bottomley, Matthew Wilcox, Adam J. Richter,
	andmike, hch, linux-kernel, mochel, parisc-linux

Greg KH wrote:
> Are there any existing archs that need more than just dma_mask moved to
> struct device out of pci_dev?  Hm, ppc might need a bit more...

"out of pci_dev" and into struct device?
I think that's all parisc port would need now.

At some point I'd like to propose "dma_hint" field.
But I don't have a specific proposal yet.
Something to help drivers communicate DMA characteristics to
the IOMMU support code. ie bandwidth needed, cacheline line aware,
MWLI support, etc.

grant

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
       [not found]               ` <miles@lsi.nec.co.jp>
@ 2002-11-13 20:13                 ` Grant Grundler
  2002-11-13 20:21                   ` J.E.J. Bottomley
  0 siblings, 1 reply; 71+ messages in thread
From: Grant Grundler @ 2002-11-13 20:13 UTC (permalink / raw)
  To: Miles Bader
  Cc: Greg KH, J.E.J. Bottomley, Matthew Wilcox, Adam J. Richter,
	andmike, hch, linux-kernel, mochel, parisc-linux

Miles Bader wrote:
> I can't speak for `real machines,' but on my wierd embedded board,
> pci_alloc_consistent allocates from a special area of memory (not
> located at 0) that is the only shared memory between PCI devices and the
> CPU.  pci_alloc_consistent happens to fit this situation quite well, but
> I don't think a bitmask is enough to express the situation.

HP PARISC V-Class do that as well. The "consistent" memory lives
on the PCI Bus Controller - not in host mem.
Note that parisc-linux does not (yet) support V-class.

grant

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device  interface
  2002-11-13 20:13                 ` Grant Grundler
@ 2002-11-13 20:21                   ` J.E.J. Bottomley
  2002-11-13 20:37                     ` Grant Grundler
  0 siblings, 1 reply; 71+ messages in thread
From: J.E.J. Bottomley @ 2002-11-13 20:21 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Miles Bader, Greg KH, J.E.J. Bottomley, Matthew Wilcox,
	Adam J. Richter, andmike, hch, linux-kernel, mochel,
	parisc-linux

Miles Bader wrote:
> I can't speak for `real machines,' but on my wierd embedded board,
> pci_alloc_consistent allocates from a special area of memory (not
> located at 0) that is the only shared memory between PCI devices and the
> CPU.  pci_alloc_consistent happens to fit this situation quite well, but
> I don't think a bitmask is enough to express the situation.

grundler@dsl2.external.hp.com said:
> HP PARISC V-Class do that as well. The "consistent" memory lives on
> the PCI Bus Controller - not in host mem. Note that parisc-linux does
> not (yet) support V-class. 

Actually, I think dma_mask and consistent memory are orthogonal problems.  
dma_masks are used by the I/O subsystem to determine whether direct DMA to a 
memory region containing an I/O buffer is possible or whether it has to be 
bounced.  Consistent memory is usually allocated for driver specific 
transfers.  The I/O subsystem doesn't usually require the actual I/O buffers 
to be in consistent memory.

James



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-13 17:23               ` J.E.J. Bottomley
@ 2002-11-13 20:33                 ` Grant Grundler
  2002-11-13 20:44                   ` J.E.J. Bottomley
  0 siblings, 1 reply; 71+ messages in thread
From: Grant Grundler @ 2002-11-13 20:33 UTC (permalink / raw)
  To: J.E.J. Bottomley
  Cc: Bjorn Helgaas, Greg KH, Miles Bader, J.E.J. Bottomley,
	Matthew Wilcox, Adam J. Richter, andmike, hch, linux-kernel,
	mochel, parisc-linux

"J.E.J. Bottomley" wrote:
> We should also allow devices to do all the setup through bus generic 
> functions, but leave open the possibility that the driver may (once it knows 
> the bus type) obtain the pci_dev (or whatever) from the struct device if it 
> really, really has to muck with bus specific registers.

For device discovery and initialization, the generic PCI code has to muck
with PCI specific resources (IO Port, MMIO, and IRQ related stuff primarily).

> As far as the SCSI mid layer goes, all we really need from struct device is 
> the dma_mask for setting up the I/O bounce buffers.
> 
> The simplest way to do all of this is probably to add a pointer to the 
> dma_mask in struct device and make it point to the same thing in pci_dev.
> If we find we need more than this per device, it could become a pointer
> to a generic dma information structure later on.

uhmm...If we are going to touch dma_mask in pci_dev, then just move it
to struct device and be done with it. Then fixup pci_set_dma_mask()
to do the right thing.

...
> Since the 53c700 is also used by parisc (including some machines with 
> IOMMUs---which, unfortunately, I don't have access to), it probably makes an 
> ideal conversion test case.

Duck! (that's going to get fixed it seems) ;^)

thanks,
grant

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-13 20:21                   ` J.E.J. Bottomley
@ 2002-11-13 20:37                     ` Grant Grundler
  0 siblings, 0 replies; 71+ messages in thread
From: Grant Grundler @ 2002-11-13 20:37 UTC (permalink / raw)
  To: J.E.J. Bottomley
  Cc: Miles Bader, Greg KH, Matthew Wilcox, Adam J. Richter, andmike,
	hch, linux-kernel, mochel, parisc-linux

"J.E.J. Bottomley" wrote:
> Actually, I think dma_mask and consistent memory are orthogonal problems.  

No. consistent memory needs to be reachable by the device as well.
dma_mask constrains which memory pci_alloc_consistent() can use.

> dma_masks are used by the I/O subsystem to determine whether direct DMA to a 
> memory region containing an I/O buffer is possible or whether it has to be 
> bounced.  Consistent memory is usually allocated for driver specific 
> transfers.  The I/O subsystem doesn't usually require the actual I/O buffers 
> to be in consistent memory.

right.

grant

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device  interface
  2002-11-13 20:33                 ` Grant Grundler
@ 2002-11-13 20:44                   ` J.E.J. Bottomley
  2002-11-13 21:42                     ` Grant Grundler
  0 siblings, 1 reply; 71+ messages in thread
From: J.E.J. Bottomley @ 2002-11-13 20:44 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Bjorn Helgaas, Greg KH, Miles Bader, J.E.J. Bottomley,
	Matthew Wilcox, Adam J. Richter, andmike, hch, linux-kernel,
	mochel, parisc-linux

grundler@dsl2.external.hp.com said:
> For device discovery and initialization, the generic PCI code has to
> muck with PCI specific resources (IO Port, MMIO, and IRQ related stuff
> primarily). 

Oh, I agree.  If we conduct a phased approach to this, what happens initially 
is that the pci drivers simply pull pci_dev out of the struct device and use 
it as previously.

However, I think the ultimate destination is to see how much of the bus 
specific stuff we can abstract by throwing an API around it.  I think IRQ, 
port and mmio are feasible.  Specific knowledge of bus posting et al may not 
be.

> uhmm...If we are going to touch dma_mask in pci_dev, then just move it
> to struct device and be done with it. Then fixup pci_set_dma_mask() to
> do the right thing. 

Well...OK.  The advantage of a pointer in struct device is that the code can 
be converted as is, and no-one has to muck with the direct accessors of the 
pci_dev->dma_mask.  However, I'll see how many of them there actually are, its 
probably just the drivers that transfer the information to 
blk_queue_bounce_limit.

> Duck! (that's going to get fixed it seems) ;^) 

I thought the 53c700 was working OK?  Richard Hirst did some extensive testing 
on a parisc with an IO-MMU for me (he caught a lot of early mapping leaks 
which I fixed).

James



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-13 20:44                   ` J.E.J. Bottomley
@ 2002-11-13 21:42                     ` Grant Grundler
  0 siblings, 0 replies; 71+ messages in thread
From: Grant Grundler @ 2002-11-13 21:42 UTC (permalink / raw)
  To: J.E.J. Bottomley
  Cc: Bjorn Helgaas, Greg KH, Miles Bader, Matthew Wilcox,
	Adam J. Richter, andmike, linux-kernel, mochel, parisc-linux

"J.E.J. Bottomley" wrote:
> However, I think the ultimate destination is to see how much of the bus 
> specific stuff we can abstract by throwing an API around it.  I think IRQ, 
> port and mmio are feasible.  Specific knowledge of bus posting et al may not 
> be.

I was thinking how many BARs are present/used is PCI specific.

arch code already handles most of the IRQ fixups anyway and it
doesn't really matter where IRQ info is stored as long as the
device driver knows where to find it.

> > Duck! (that's going to get fixed it seems) ;^) 
> 
> I thought the 53c700 was working OK?

sorry - "going to get fixed" meant we are looking for a C180
or similar machine to send you.

thanks!
grant

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [RFC] generic device DMA implementation
@ 2002-12-04 17:47 James Bottomley
  2002-12-04 18:27 ` Jeff Garzik
                   ` (3 more replies)
  0 siblings, 4 replies; 71+ messages in thread
From: James Bottomley @ 2002-12-04 17:47 UTC (permalink / raw)
  To: linux-kernel; +Cc: James.Bottomley

[-- Attachment #1: Type: text/plain, Size: 1426 bytes --]

Currently our only DMA API is highly PCI specific (making any non-pci bus with 
a DMA controller create fake PCI devices to help it function).

Now that we have the generic device model, it should be equally possible to 
rephrase the entire API for generic devices instead of pci_devs.

This patch does just that (for x86---although I also have working code for 
parisc, that's where I actually tested the DMA capability).

The API is substantially the same as the PCI DMA one, with one important 
exception with regard to consistent memory:

The PCI api has pci_alloc_consistent which allocates only consistent memory 
and fails the allocation if none is available thus leading to driver writers 
who might need to function with inconsistent memory to detect this and employ 
a fallback strategy.

The new DMA API allows a driver to advertise its level of consistent memory 
compliance to dma_alloc_consistent.  There are essentially two levels:

- I only work with consistent memory, fail if I cannot get it, or
- I can work with inconsistent memory, try consistent first but return 
inconsistent if it's not available.

The idea is that the memory type can be coded into dma_addr_t which the 
subsequent memory sync operations can use to determine whether 
wback/invalidate should be a nop or not.

Using this scheme allows me to eliminate all the inconsistent memory fallbacks 
from my drivers.

Comments welcome.

James


[-- Attachment #2: tmp.diff --]
[-- Type: text/plain , Size: 17207 bytes --]

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.926   -> 1.929  
#	arch/i386/kernel/pci-dma.c	1.8     -> 1.9    
#	   drivers/pci/pci.c	1.50    -> 1.51   
#	include/asm-i386/pci.h	1.17    -> 1.18   
#	 include/linux/pci.h	1.52    -> 1.54   
#	               (new)	        -> 1.3     include/asm-i386/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-generic/dma-mapping.h
#	               (new)	        -> 1.1     include/linux/dma-mapping.h
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 02/12/03	jejb@raven.il.steeleye.com	1.927
# Implement a generic device based DMA API
# 
# implement a DMA mapping API based on generic devices instead of
# PCI ones.
# 
# The API is almost identical to the PCI one except for
# 
# - the dma_alloc_consistent takes a conformance level, so the driver
# may choose to support only consistent or also non-consistent memory.
# --------------------------------------------
# 02/12/03	jejb@raven.il.steeleye.com	1.928
# Update to include dma_supported in the API
# --------------------------------------------
# 02/12/03	jejb@raven.il.steeleye.com	1.929
# minor fixes to x86 dma implementation
# --------------------------------------------
#
diff -Nru a/arch/i386/kernel/pci-dma.c b/arch/i386/kernel/pci-dma.c
--- a/arch/i386/kernel/pci-dma.c	Wed Dec  4 11:25:59 2002
+++ b/arch/i386/kernel/pci-dma.c	Wed Dec  4 11:25:59 2002
@@ -13,13 +13,17 @@
 #include <linux/pci.h>
 #include <asm/io.h>
 
-void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
-			   dma_addr_t *dma_handle)
+void *dma_alloc_consistent(struct device *dev, size_t size,
+			   dma_addr_t *dma_handle,
+			   enum dma_conformance_level level)
 {
 	void *ret;
 	int gfp = GFP_ATOMIC;
 
-	if (hwdev == NULL || ((u32)hwdev->dma_mask != 0xffffffff))
+	if(level == DMA_CONFORMANCE_NONE)
+		return NULL;
+
+	if (dev == NULL || ((u32)*dev->dma_mask != 0xffffffff))
 		gfp |= GFP_DMA;
 	ret = (void *)__get_free_pages(gfp, get_order(size));
 
@@ -30,7 +34,7 @@
 	return ret;
 }
 
-void pci_free_consistent(struct pci_dev *hwdev, size_t size,
+void dma_free_consistent(struct device *dev, size_t size,
 			 void *vaddr, dma_addr_t dma_handle)
 {
 	free_pages((unsigned long)vaddr, get_order(size));
diff -Nru a/drivers/pci/pci.c b/drivers/pci/pci.c
--- a/drivers/pci/pci.c	Wed Dec  4 11:25:59 2002
+++ b/drivers/pci/pci.c	Wed Dec  4 11:25:59 2002
@@ -680,7 +680,7 @@
 int
 pci_set_dma_mask(struct pci_dev *dev, u64 mask)
 {
-	if (!pci_dma_supported(dev, mask))
+	if (!dma_supported(&dev->dev, mask))
 		return -EIO;
 
 	dev->dma_mask = mask;
diff -Nru a/include/asm-generic/dma-mapping.h b/include/asm-generic/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-generic/dma-mapping.h	Wed Dec  4 11:25:59 2002
@@ -0,0 +1,29 @@
+#ifndef _ASM_GENERIC_DMA_MAPPING_H
+#define _ASM_GENERIC_DMA_MAPPING_H
+
+int dma_set_mask(struct device *dev, u64 dma_mask);
+void *dma_alloc_consistent(struct device *dev, size_t size,
+			   dma_addr_t *dma_handle,
+			   enum dma_conformance_level level);
+enum dma_conformance_level dma_get_conformance(dma_addr_t dma_handle);
+void dma_free_consistent(struct device *dev, size_t size, void *cpu_addr,
+			  dma_addr_t dma_handle);
+dma_addr_t dma_map_single(struct device *dev, void *cpu_addr, size_t size,
+			   enum dma_data_direction direction);
+void dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
+		      enum dma_data_direction direction);
+dma_addr_t dma_map_page(struct device *dev, struct page *page,
+			unsigned long offset, size_t size,
+			enum dma_data_direction direction);
+void dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
+		    enum dma_data_direction direction);
+int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
+	       enum dma_data_direction direction);
+void dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries,
+		  enum dma_data_direction direction);
+void dma_sync_single(struct device *dev, dma_addr_t dma_handle, size_t size,
+		     enum dma_data_direction direction);
+void dma_sync_sg(struct device *dev, struct scatterlist *sg, int nelems,
+		 enum dma_data_direction direction);
+#endif
+
diff -Nru a/include/asm-i386/dma-mapping.h b/include/asm-i386/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-i386/dma-mapping.h	Wed Dec  4 11:25:59 2002
@@ -0,0 +1,113 @@
+#ifndef _ASM_I386_DMA_MAPPING_H
+#define _ASM_I386_DMA_MAPPING_H
+
+void *dma_alloc_consistent(struct device *dev, size_t size,
+			   dma_addr_t *dma_handle,
+			   enum dma_conformance_level level);
+
+void dma_free_consistent(struct device *dev, size_t size,
+			 void *vaddr, dma_addr_t dma_handle);
+
+static inline dma_addr_t
+dma_map_single(struct device *dev, void *ptr, size_t size,
+	       enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+	flush_write_buffers();
+	return virt_to_phys(ptr);
+}
+
+static inline void
+dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
+		 enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+}
+
+static inline int
+dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
+	   enum dma_data_direction direction)
+{
+	int i;
+
+	BUG_ON(direction == DMA_NONE);
+
+	for (i = 0; i < nents; i++ ) {
+		BUG_ON(!sg[i].page);
+
+		sg[i].dma_address = page_to_phys(sg[i].page) + sg[i].offset;
+	}
+
+	flush_write_buffers();
+	return nents;
+}
+
+static inline dma_addr_t
+dma_map_page(struct device *dev, struct page *page, unsigned long offset,
+	     size_t size, enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+	return (dma_addr_t)(page_to_pfn(page)) * PAGE_SIZE + offset;
+}
+
+static inline void
+dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
+	       enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+}
+
+
+static inline void
+dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries,
+	     enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+}
+
+static inline void
+dma_sync_single(struct device *dev, dma_addr_t dma_handle, size_t size,
+		enum dma_data_direction direction)
+{
+	flush_write_buffers();
+}
+
+static inline void
+dma_sync_sg(struct device *dev, struct scatterlist *sg, int nelems,
+		 enum dma_data_direction direction)
+{
+	flush_write_buffers();
+}
+
+static inline enum dma_conformance_level
+dma_get_conformance(dma_addr_t dma_handle)
+{
+	return DMA_CONFORMANCE_CONSISTENT;
+}
+
+static inline int
+dma_supported(struct device *dev, u64 mask)
+{
+        /*
+         * we fall back to GFP_DMA when the mask isn't all 1s,
+         * so we can't guarantee allocations that must be
+         * within a tighter range than GFP_DMA..
+         */
+        if(mask < 0x00ffffff)
+                return 0;
+
+	return 1;
+}
+
+static inline int
+dma_set_mask(struct device *dev, u64 mask)
+{
+	if(!dev->dma_mask || !dma_supported(dev, mask))
+		return -EIO;
+
+	*dev->dma_mask = mask;
+
+	return 0;
+}
+
+#endif
diff -Nru a/include/asm-i386/pci.h b/include/asm-i386/pci.h
--- a/include/asm-i386/pci.h	Wed Dec  4 11:25:59 2002
+++ b/include/asm-i386/pci.h	Wed Dec  4 11:25:59 2002
@@ -6,6 +6,9 @@
 #ifdef __KERNEL__
 #include <linux/mm.h>		/* for struct page */
 
+/* we support the new DMA API, but still provide the old one */
+#define PCI_NEW_DMA_COMPAT_API	1
+
 /* Can be used to override the logic in pci_scan_bus for skipping
    already-configured bus numbers - to be used for buggy BIOSes
    or architectures with incomplete PCI setup by the loader */
@@ -46,78 +49,6 @@
  */
 #define PCI_DMA_BUS_IS_PHYS	(1)
 
-/* Allocate and map kernel buffer using consistent mode DMA for a device.
- * hwdev should be valid struct pci_dev pointer for PCI devices,
- * NULL for PCI-like buses (ISA, EISA).
- * Returns non-NULL cpu-view pointer to the buffer if successful and
- * sets *dma_addrp to the pci side dma address as well, else *dma_addrp
- * is undefined.
- */
-extern void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
-				  dma_addr_t *dma_handle);
-
-/* Free and unmap a consistent DMA buffer.
- * cpu_addr is what was returned from pci_alloc_consistent,
- * size must be the same as what as passed into pci_alloc_consistent,
- * and likewise dma_addr must be the same as what *dma_addrp was set to.
- *
- * References to the memory and mappings associated with cpu_addr/dma_addr
- * past this call are illegal.
- */
-extern void pci_free_consistent(struct pci_dev *hwdev, size_t size,
-				void *vaddr, dma_addr_t dma_handle);
-
-/* Map a single buffer of the indicated size for DMA in streaming mode.
- * The 32-bit bus address to use is returned.
- *
- * Once the device is given the dma address, the device owns this memory
- * until either pci_unmap_single or pci_dma_sync_single is performed.
- */
-static inline dma_addr_t pci_map_single(struct pci_dev *hwdev, void *ptr,
-					size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	flush_write_buffers();
-	return virt_to_phys(ptr);
-}
-
-/* Unmap a single streaming mode DMA translation.  The dma_addr and size
- * must match what was provided for in a previous pci_map_single call.  All
- * other usages are undefined.
- *
- * After this call, reads by the cpu to the buffer are guarenteed to see
- * whatever the device wrote there.
- */
-static inline void pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr,
-				    size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	/* Nothing to do */
-}
-
-/*
- * pci_{map,unmap}_single_page maps a kernel page to a dma_addr_t. identical
- * to pci_map_single, but takes a struct page instead of a virtual address
- */
-static inline dma_addr_t pci_map_page(struct pci_dev *hwdev, struct page *page,
-				      unsigned long offset, size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-
-	return (dma_addr_t)(page_to_pfn(page)) * PAGE_SIZE + offset;
-}
-
-static inline void pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
-				  size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	/* Nothing to do */
-}
-
 /* pci_unmap_{page,single} is a nop so... */
 #define DECLARE_PCI_UNMAP_ADDR(ADDR_NAME)
 #define DECLARE_PCI_UNMAP_LEN(LEN_NAME)
@@ -126,84 +57,6 @@
 #define pci_unmap_len(PTR, LEN_NAME)		(0)
 #define pci_unmap_len_set(PTR, LEN_NAME, VAL)	do { } while (0)
 
-/* Map a set of buffers described by scatterlist in streaming
- * mode for DMA.  This is the scather-gather version of the
- * above pci_map_single interface.  Here the scatter gather list
- * elements are each tagged with the appropriate dma address
- * and length.  They are obtained via sg_dma_{address,length}(SG).
- *
- * NOTE: An implementation may be able to use a smaller number of
- *       DMA address/length pairs than there are SG table elements.
- *       (for example via virtual mapping capabilities)
- *       The routine returns the number of addr/length pairs actually
- *       used, at most nents.
- *
- * Device ownership issues as mentioned above for pci_map_single are
- * the same here.
- */
-static inline int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
-			     int nents, int direction)
-{
-	int i;
-
-	if (direction == PCI_DMA_NONE)
-		BUG();
-
-	for (i = 0; i < nents; i++ ) {
-		if (!sg[i].page)
-			BUG();
-
-		sg[i].dma_address = page_to_phys(sg[i].page) + sg[i].offset;
-	}
-
-	flush_write_buffers();
-	return nents;
-}
-
-/* Unmap a set of streaming mode DMA translations.
- * Again, cpu read rules concerning calls here are the same as for
- * pci_unmap_single() above.
- */
-static inline void pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
-				int nents, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	/* Nothing to do */
-}
-
-/* Make physical memory consistent for a single
- * streaming mode DMA translation after a transfer.
- *
- * If you perform a pci_map_single() but wish to interrogate the
- * buffer using the cpu, yet do not wish to teardown the PCI dma
- * mapping, you must call this function before doing so.  At the
- * next point you give the PCI dma address back to the card, the
- * device again owns the buffer.
- */
-static inline void pci_dma_sync_single(struct pci_dev *hwdev,
-				       dma_addr_t dma_handle,
-				       size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	flush_write_buffers();
-}
-
-/* Make physical memory consistent for a set of streaming
- * mode DMA translations after a transfer.
- *
- * The same as pci_dma_sync_single but for a scatter-gather list,
- * same rules and usage.
- */
-static inline void pci_dma_sync_sg(struct pci_dev *hwdev,
-				   struct scatterlist *sg,
-				   int nelems, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	flush_write_buffers();
-}
 
 /* Return whether the given PCI device DMA address mask can
  * be supported properly.  For example, if your device can
diff -Nru a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/linux/dma-mapping.h	Wed Dec  4 11:25:59 2002
@@ -0,0 +1,32 @@
+#ifndef _ASM_LINUX_DMA_MAPPING_H
+#define _ASM_LINUX_DMA_MAPPING_H
+
+/* This is the level of conformance required for consistent allocation
+ *
+ * DMA_CONFORMANCE_NONE - debugging: always fail consistent allocation
+ * DMA_CONFORMANCE_CONSISTENT - only allocate consistent memory.  Fail
+ *	if consistent memory cannot be allocated.
+ * DMA_CONFORMANCE_NON_CONSISTENT - driver has full writeback/invalidate
+ *	compliance.  Return non-consistent memory if consistent cannot be
+ *	allocated.
+ */
+enum dma_conformance_level {
+	DMA_CONFORMANCE_NONE,
+	DMA_CONFORMANCE_CONSISTENT,
+	DMA_CONFORMANCE_NON_CONSISTENT,
+};
+
+/* These definitions mirror those in pci.h, so they can be used
+ * interchangeably with their PCI_ counterparts */
+enum dma_data_direction {
+	DMA_BIDIRECTIONAL = 0,
+	DMA_TO_DEVICE = 1,
+	DMA_FROM_DEVICE = 2,
+	DMA_NONE = 3,
+};
+
+#include <asm/dma-mapping.h>
+
+#endif
+
+
diff -Nru a/include/linux/pci.h b/include/linux/pci.h
--- a/include/linux/pci.h	Wed Dec  4 11:25:59 2002
+++ b/include/linux/pci.h	Wed Dec  4 11:25:59 2002
@@ -788,5 +788,95 @@
 #define PCIPCI_VIAETBF		8
 #define PCIPCI_VSFX		16
 
+#include <linux/dma-mapping.h>
+
+/* If you define this macro it means you support the new DMA API.
+ *
+ * The old pci_dma... API is deprecated, but for now it is simply translated
+ * into the new generic device based API.
+ */
+#ifdef PCI_NEW_DMA_COMPAT_API
+
+/* note pci_set_dma_mask isn't here, since it's a public function
+ * exported from drivers/pci, use dma_supported instead */
+
+static inline int
+pci_dma_supported(struct pci_dev *hwdev, u64 mask)
+{
+	return dma_supported(&hwdev->dev, mask);
+}
+
+static inline void *
+pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
+		     dma_addr_t *dma_handle)
+{
+	return dma_alloc_consistent(&hwdev->dev, size, dma_handle,
+				    DMA_CONFORMANCE_CONSISTENT);
+}
+
+static inline void
+pci_free_consistent(struct pci_dev *hwdev, size_t size,
+		    void *vaddr, dma_addr_t dma_handle)
+{
+	dma_free_consistent(&hwdev->dev, size, vaddr, dma_handle);
+}
+
+static inline dma_addr_t
+pci_map_single(struct pci_dev *hwdev, void *ptr, size_t size, int direction)
+{
+	return dma_map_single(&hwdev->dev, ptr, size, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr,
+		 size_t size, int direction)
+{
+	dma_unmap_single(&hwdev->dev, dma_addr, size, (enum dma_data_direction)direction);
+}
+
+static inline dma_addr_t
+pci_map_page(struct pci_dev *hwdev, struct page *page,
+	     unsigned long offset, size_t size, int direction)
+{
+	return dma_map_page(&hwdev->dev, page, offset, size, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
+	       size_t size, int direction)
+{
+	dma_unmap_page(&hwdev->dev, dma_address, size, (enum dma_data_direction)direction);
+}
+
+static inline int
+pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+	   int nents, int direction)
+{
+	return dma_map_sg(&hwdev->dev, sg, nents, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+	     int nents, int direction)
+{
+	dma_unmap_sg(&hwdev->dev, sg, nents, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_dma_sync_single(struct pci_dev *hwdev, dma_addr_t dma_handle,
+		    size_t size, int direction)
+{
+	dma_sync_single(&hwdev->dev, dma_handle, size, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_dma_sync_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+		int nelems, int direction)
+{
+	dma_sync_sg(&hwdev->dev, sg, nelems, (enum dma_data_direction)direction);
+}
+
+#endif
+
 #endif /* __KERNEL__ */
 #endif /* LINUX_PCI_H */

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-04 17:47 [RFC] " James Bottomley
@ 2002-12-04 18:27 ` Jeff Garzik
  2002-12-04 19:36   ` James Bottomley
  2002-12-04 21:19 ` Miles Bader
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 71+ messages in thread
From: Jeff Garzik @ 2002-12-04 18:27 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-kernel

You should update Documentation/DMA-mapping.txt too :)


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-04 18:27 ` Jeff Garzik
@ 2002-12-04 19:36   ` James Bottomley
  0 siblings, 0 replies; 71+ messages in thread
From: James Bottomley @ 2002-12-04 19:36 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: James Bottomley, linux-kernel

jgarzik@pobox.com said:
> You should update Documentation/DMA-mapping.txt too :)

Oh, yes, and convert all the other arch's too.  That's on my list of things 
todo (and will be there when I post a patch).  I was just throwing out a 
request for comments to see what turned up.

James



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-04 17:47 [RFC] " James Bottomley
  2002-12-04 18:27 ` Jeff Garzik
@ 2002-12-04 21:19 ` Miles Bader
  2002-12-04 21:21 ` Miles Bader
  2002-12-05  0:47 ` David Gibson
  3 siblings, 0 replies; 71+ messages in thread
From: Miles Bader @ 2002-12-04 21:19 UTC (permalink / raw)
  To: James, James.Bottomley@SteelEye.com; +Cc: linux-kernel

James Bottomley writes:
> Currently our only DMA API is highly PCI specific (making any non-pci
> bus with a DMA controller create fake PCI devices to help it
> function).
>
> Now that we have the generic device model, it should be equally
> possible to rephrase the entire API for generic devices instead of
> pci_devs.

Keep in mind that sometimes the actual _implementation_ is also highly
PCI-specific -- that is, what works for PCI devices may not work for
other devices and vice-versa.

So perhaps instead of just replacing `pci_...' with `dma_...', it would
be better to add new function pointers to `struct bus_type' for all this
stuff (or something like that).

> The PCI api has pci_alloc_consistent which allocates only consistent memory
> and fails the allocation if none is available thus leading to driver writers
> who might need to function with inconsistent memory to detect this and employ
> a fallback strategy.
> ...
> The idea is that the memory type can be coded into dma_addr_t which the
> subsequent memory sync operations can use to determine whether
> wback/invalidate should be a nop or not.

How is the driver supposed to tell whether a given dma_addr_t value
represents consistent memory or not?  It seems like an (arch-specific)
`dma_addr_is_consistent' function is necessary, but I couldn't see one
in your patch.

Thanks,

-Miles
-- 
We are all lying in the gutter, but some of us are looking at the stars.
-Oscar Wilde

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-04 17:47 [RFC] " James Bottomley
  2002-12-04 18:27 ` Jeff Garzik
  2002-12-04 21:19 ` Miles Bader
@ 2002-12-04 21:21 ` Miles Bader
  2002-12-04 21:42   ` James Bottomley
  2002-12-04 21:46   ` James Bottomley
  2002-12-05  0:47 ` David Gibson
  3 siblings, 2 replies; 71+ messages in thread
From: Miles Bader @ 2002-12-04 21:21 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-kernel

James Bottomley writes:
> Currently our only DMA API is highly PCI specific (making any non-pci
> bus with a DMA controller create fake PCI devices to help it
> function).
>
> Now that we have the generic device model, it should be equally
> possible to rephrase the entire API for generic devices instead of
> pci_devs.

Keep in mind that sometimes the actual _implementation_ is also highly
PCI-specific -- that is, what works for PCI devices may not work for
other devices and vice-versa.

So perhaps instead of just replacing `pci_...' with `dma_...', it would
be better to add new function pointers to `struct bus_type' for all this
stuff (or something like that).

> The PCI api has pci_alloc_consistent which allocates only consistent memory
> and fails the allocation if none is available thus leading to driver writers
> who might need to function with inconsistent memory to detect this and employ
> a fallback strategy.
> ...
> The idea is that the memory type can be coded into dma_addr_t which the
> subsequent memory sync operations can use to determine whether
> wback/invalidate should be a nop or not.

How is the driver supposed to tell whether a given dma_addr_t value
represents consistent memory or not?  It seems like an (arch-specific)
`dma_addr_is_consistent' function is necessary, but I couldn't see one
in your patch.

Thanks,

-Miles
-- 
We are all lying in the gutter, but some of us are looking at the stars.
-Oscar Wilde

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-04 21:21 ` Miles Bader
@ 2002-12-04 21:42   ` James Bottomley
  2002-12-05  5:44     ` Miles Bader
  2002-12-04 21:46   ` James Bottomley
  1 sibling, 1 reply; 71+ messages in thread
From: James Bottomley @ 2002-12-04 21:42 UTC (permalink / raw)
  To: Miles Bader; +Cc: James Bottomley, linux-kernel

miles@gnu.org said:
> Keep in mind that sometimes the actual _implementation_ is also highly
> PCI-specific -- that is, what works for PCI devices may not work for
> other devices and vice-versa.

> So perhaps instead of just replacing `pci_...' with `dma_...', it
> would be better to add new function pointers to `struct bus_type' for
> all this stuff (or something like that). 

Not really, that can all be taken care of in the platform implementation.

The parisc implementation has exactly that problem.  The platform 
implementation uses the generic device platform_data to cache the iommu 
accessor methods (it actually finds the iommu by walking up the device parents 
until it gets to the iommu driver--which means it needs to walk off the PCI 
bus).

In general, the generic device already has enough information that the 
platform implementation can be highly bus specific---and, of course, once you 
know exactly what bus it's on, you can cast it to the bus device if you want.

James



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-04 21:21 ` Miles Bader
  2002-12-04 21:42   ` James Bottomley
@ 2002-12-04 21:46   ` James Bottomley
  2002-12-05  2:31     ` Miles Bader
  2002-12-05 11:15     ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 71+ messages in thread
From: James Bottomley @ 2002-12-04 21:46 UTC (permalink / raw)
  To: Miles Bader; +Cc: James Bottomley, linux-kernel

miles@gnu.org said:
> How is the driver supposed to tell whether a given dma_addr_t value
> represents consistent memory or not?  It seems like an (arch-specific)
> `dma_addr_is_consistent' function is necessary, but I couldn't see one
> in your patch. 

well, the patch was only for x86, which is fully consistent.  For parisc, that 
becomes a field for the dma accessor functions.

However, even on parisc, the (supported) machines are either entirely 
consistent or entirely inconsistent.

If you have a machine that has both consistent and inconsistent blocks, you 
need to encode that in dma_addr_t (which is a platform definable type).

The sync functions would just decode the type and either nop or perform the 
sync.

James



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-04 17:47 [RFC] " James Bottomley
                   ` (2 preceding siblings ...)
  2002-12-04 21:21 ` Miles Bader
@ 2002-12-05  0:47 ` David Gibson
  2002-12-05  0:54   ` Jeff Garzik
                     ` (2 more replies)
  3 siblings, 3 replies; 71+ messages in thread
From: David Gibson @ 2002-12-05  0:47 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-kernel

On Wed, Dec 04, 2002 at 11:47:14AM -0600, James Bottomley wrote:
> Currently our only DMA API is highly PCI specific (making any non-pci bus with 
> a DMA controller create fake PCI devices to help it function).
> 
> Now that we have the generic device model, it should be equally possible to 
> rephrase the entire API for generic devices instead of pci_devs.
> 
> This patch does just that (for x86---although I also have working code for 
> parisc, that's where I actually tested the DMA capability).
> 
> The API is substantially the same as the PCI DMA one, with one important 
> exception with regard to consistent memory:
> 
> The PCI api has pci_alloc_consistent which allocates only consistent memory 
> and fails the allocation if none is available thus leading to driver writers 
> who might need to function with inconsistent memory to detect this and employ 
> a fallback strategy.
> 
> The new DMA API allows a driver to advertise its level of consistent memory 
> compliance to dma_alloc_consistent.  There are essentially two levels:
> 
> - I only work with consistent memory, fail if I cannot get it, or
> - I can work with inconsistent memory, try consistent first but return 
> inconsistent if it's not available.

Do you have an example of where the second option is useful?  Off hand
the only places I can think of where you'd use a consistent_alloc()
rather than map_single() and friends is in cases where the hardware's
behaviour means you absolutely positively have to have consistent
memory.

-- 
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.
http://www.ozlabs.org/people/dgibson

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05  0:47 ` David Gibson
@ 2002-12-05  0:54   ` Jeff Garzik
  2002-12-05  1:44   ` James Bottomley
  2002-12-05 11:08   ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 71+ messages in thread
From: Jeff Garzik @ 2002-12-05  0:54 UTC (permalink / raw)
  To: David Gibson; +Cc: James Bottomley, linux-kernel

David Gibson wrote:
> On Wed, Dec 04, 2002 at 11:47:14AM -0600, James Bottomley wrote:
>>The new DMA API allows a driver to advertise its level of consistent memory 
>>compliance to dma_alloc_consistent.  There are essentially two levels:
>>
>>- I only work with consistent memory, fail if I cannot get it, or
>>- I can work with inconsistent memory, try consistent first but return 
>>inconsistent if it's not available.
> 
> 
> Do you have an example of where the second option is useful?  Off hand
> the only places I can think of where you'd use a consistent_alloc()
> rather than map_single() and friends is in cases where the hardware's
> behaviour means you absolutely positively have to have consistent
> memory.


agreed, good catch.  Returning inconsistent memory when you asked for 
consistent makes not much sense:  the programmer either knows what the 
hardware wants, or the programmer is silly and should not be using 
alloc_consistent anyway.



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05  0:47 ` David Gibson
  2002-12-05  0:54   ` Jeff Garzik
@ 2002-12-05  1:44   ` James Bottomley
  2002-12-05  2:38     ` David Gibson
  2002-12-05 11:08   ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 71+ messages in thread
From: James Bottomley @ 2002-12-05  1:44 UTC (permalink / raw)
  To: David Gibson, Adam J. Richter, James Bottomley, linux-kernel

david@gibson.dropbear.id.au said:
> Do you have an example of where the second option is useful?  Off hand
> the only places I can think of where you'd use a consistent_alloc()
> rather than map_single() and friends is in cases where the hardware's
> behaviour means you absolutely positively have to have consistent
> memory. 

Well, it comes from parisc drivers.  Here you'd really rather have consistent 
memory because it's more efficient, but on certain platforms it's just not 
possible.

In the drivers that do this, it leads to this type of awfulness:

consistent = 1;
if(!mem = pci_alloc_consistent() {
	mem = __get_free_pages
	mem = pci_map_single()
	consistent = 1;
}
....
if(!consistent)
	dma_cache_wback()

etc.

The idea is that this translates to

mem = dma_alloc_consistent(... DMA_CONFORMANCE_NON_CONSISTENT)

...

dma_sync_single(mem..)

Where if you have consistent memory then the sync is a nop.

adam@yggdrasil.com said:
> 	If these routines can allocate non-consistent memory, then how about
> renaming them to something less misleading, like dma_{malloc,free}? 

Yes, I think the above makes this point.  I'll change the names.

James



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-04 21:46   ` James Bottomley
@ 2002-12-05  2:31     ` Miles Bader
  2002-12-05  3:06       ` James Bottomley
  2002-12-05  5:02       ` David Gibson
  2002-12-05 11:15     ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 71+ messages in thread
From: Miles Bader @ 2002-12-05  2:31 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-kernel

James Bottomley <James.Bottomley@steeleye.com> writes:
> > How is the driver supposed to tell whether a given dma_addr_t value
> > represents consistent memory or not?  It seems like an
> > (arch-specific) `dma_addr_is_consistent' function is necessary, but
> > I couldn't see one in your patch.
> 
> If you have a machine that has both consistent and inconsistent blocks, you 
> need to encode that in dma_addr_t (which is a platform definable type).
> 
> The sync functions would just decode the type and either nop or perform the 
> sync.

My thinking was that a driver might want to do things like --

  if (dma_addr_is_consistent (some_funky_addr)) {
    do it quickly;
  } else
    do_it_the_slow_way (some_funky_addr);

in other words, something besides just calling the sync functions, in
the case where the memory was consistent.

-Miles
-- 
Suburbia: where they tear out the trees and then name streets after them.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05  1:44   ` James Bottomley
@ 2002-12-05  2:38     ` David Gibson
  2002-12-05  3:13       ` James Bottomley
                         ` (2 more replies)
  0 siblings, 3 replies; 71+ messages in thread
From: David Gibson @ 2002-12-05  2:38 UTC (permalink / raw)
  To: James Bottomley; +Cc: Adam J. Richter, linux-kernel

On Wed, Dec 04, 2002 at 07:44:17PM -0600, James Bottomley wrote:
> david@gibson.dropbear.id.au said:
> > Do you have an example of where the second option is useful?  Off hand
> > the only places I can think of where you'd use a consistent_alloc()
> > rather than map_single() and friends is in cases where the hardware's
> > behaviour means you absolutely positively have to have consistent
> > memory. 
> 
> Well, it comes from parisc drivers.  Here you'd really rather have
> consistent memory because it's more efficient, but on certain
> platforms it's just not possible.

Hmm... that doesn't seem sufficient to explain it.

Some background: I work with PPC embedded chips (the 4xx family) whose
only way to get consistent memory is by entirely disabling the cache.
However in some cases you *have* to have consistent memory despite
this very high cost.  In all other cases you want to use inconsistent
memory (just allocated with kmalloc() or get_free_pages()) and
explicit cache flushes.

It seems the "try to get consistent memory, but otherwise give me
inconsistent" is only useful on machines which:
	(1) Are not fully consisent, BUT
	(2) Can get consistent memory without disabling the cache, BUT
	(3) Not very much of it, so you might run out.

The point is, there has to be an advantage to using consistent memory
if it is available AND the possibility of it not being available.

Otherwise, drivers which absolutely need consistent memory, no matter
the cost, should use consistent_alloc(), all other drivers just use
kmalloc() (or whatever) then use the DMA flushing functions which
compile to NOPs on platforms with consistent memory.

Are there actually any machines with the properties described above?
The machines I know about don't:
	- x86 and normal PPC are fully consistent, so the question
doesn't arise
	- PPC 4xx and 8xx are incconsistent if cached, so you never
want consistent if you don't absolutely need it
	- PA Risc is fully non-consistent (I'm told), so the question
doesn't arise.

-- 
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.
http://www.ozlabs.org/people/dgibson

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05  2:31     ` Miles Bader
@ 2002-12-05  3:06       ` James Bottomley
  2002-12-05  5:02       ` David Gibson
  1 sibling, 0 replies; 71+ messages in thread
From: James Bottomley @ 2002-12-05  3:06 UTC (permalink / raw)
  To: Miles Bader; +Cc: James Bottomley, linux-kernel

miles@lsi.nec.co.jp said:
> My thinking was that a driver might want to do things like --
>   if (dma_addr_is_consistent (some_funky_addr)) {
>     do it quickly;
>   } else
>     do_it_the_slow_way (some_funky_addr);
> in other words, something besides just calling the sync functions, in
> the case where the memory was consistent. 

Actually, I did code an api for that case, it's the dma_get_conformance() one 
which tells you the consistency type of memory that you actually got, so if 
you really need to tell the difference, you can.

James




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05  2:38     ` David Gibson
@ 2002-12-05  3:13       ` James Bottomley
  2002-12-05  5:05         ` David Gibson
  2002-12-05  3:17       ` Miles Bader
  2002-12-05  3:41       ` Jeff Garzik
  2 siblings, 1 reply; 71+ messages in thread
From: James Bottomley @ 2002-12-05  3:13 UTC (permalink / raw)
  To: David Gibson, James Bottomley, Adam J. Richter, linux-kernel

david@gibson.dropbear.id.au said:
> The point is, there has to be an advantage to using consistent memory
> if it is available AND the possibility of it not being available. 

I'm really thinking of this from the driver writer's point of view.  The 
advantage of consistent memory is that you don't have to think about where to 
place all the sync points (sync points can be really subtle and nasty and an 
absolute pain---I shudder to recall all of the problems I ran into writing a 
driver on a fully inconsistent platform).

The advantage here is that you can code the driver only to use consistent 
memory and not bother with the sync points (whatever the cost of this is).  
Most platforms support reasonably cheap consistent memory, so most people 
simply don't want to bother with inconsistent memory if they can avoid it.

If you do the sync points, you can specify the DMA_CONFORMANCE_NON_CONSISTENT 
level and have the platform choose what type of memory you get.  For a 
platform which makes memory consistent by turning off CPU caching at the page 
level, it's probably better to return non-consistent memory if the driver can 
cope with it.

James



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05  2:38     ` David Gibson
  2002-12-05  3:13       ` James Bottomley
@ 2002-12-05  3:17       ` Miles Bader
  2002-12-05  6:06         ` David Gibson
  2002-12-05  3:41       ` Jeff Garzik
  2 siblings, 1 reply; 71+ messages in thread
From: Miles Bader @ 2002-12-05  3:17 UTC (permalink / raw)
  To: David Gibson; +Cc: James Bottomley, Adam J. Richter, linux-kernel

David Gibson <david@gibson.dropbear.id.au> writes:
> It seems the "try to get consistent memory, but otherwise give me
> inconsistent" is only useful on machines which:
> 	(1) Are not fully consisent, BUT
> 	(2) Can get consistent memory without disabling the cache, BUT
> 	(3) Not very much of it, so you might run out.
> 
> The point is, there has to be an advantage to using consistent memory
> if it is available AND the possibility of it not being available.
...
> Are there actually any machines with the properties described above?

As I mentioned in my previous message, one of my platforms is like that
-- PCI consistent memory must be allocated from a special pool of
memory, which is only 2 megabytes in size.

-Miles
-- 
`Life is a boundless sea of bitterness'

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05  2:38     ` David Gibson
  2002-12-05  3:13       ` James Bottomley
  2002-12-05  3:17       ` Miles Bader
@ 2002-12-05  3:41       ` Jeff Garzik
  2002-12-05  6:04         ` David Gibson
  2 siblings, 1 reply; 71+ messages in thread
From: Jeff Garzik @ 2002-12-05  3:41 UTC (permalink / raw)
  To: David Gibson, James Bottomley, Adam J. Richter, linux-kernel

On Thu, Dec 05, 2002 at 01:38:47PM +1100, David Gibson wrote:
> It seems the "try to get consistent memory, but otherwise give me
> inconsistent" is only useful on machines which:
> 	(1) Are not fully consisent, BUT
> 	(2) Can get consistent memory without disabling the cache, BUT
> 	(3) Not very much of it, so you might run out.
> 
> The point is, there has to be an advantage to using consistent memory
> if it is available AND the possibility of it not being available.

Agreed here.  Add to this

(4) quite silly from an API taste perspective.


> Otherwise, drivers which absolutely need consistent memory, no matter
> the cost, should use consistent_alloc(), all other drivers just use
> kmalloc() (or whatever) then use the DMA flushing functions which
> compile to NOPs on platforms with consistent memory.

Ug.  This is travelling backwards in time.

kmalloc is not intended to allocate memory for DMA'ing.  I (and others)
didn't spend all that time converting drivers to the PCI DMA API just to
see all that work undone.

Note that I am speaking from a driver API perspective, however.  If you
are talking about using kmalloc "under the hood" while still presenting
the same driver interface, that's fine.

	Jeff




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05  2:31     ` Miles Bader
  2002-12-05  3:06       ` James Bottomley
@ 2002-12-05  5:02       ` David Gibson
  1 sibling, 0 replies; 71+ messages in thread
From: David Gibson @ 2002-12-05  5:02 UTC (permalink / raw)
  To: Miles Bader; +Cc: James Bottomley, linux-kernel

On Thu, Dec 05, 2002 at 11:31:10AM +0900, Miles Bader wrote:
> James Bottomley <James.Bottomley@steeleye.com> writes:
> > > How is the driver supposed to tell whether a given dma_addr_t value
> > > represents consistent memory or not?  It seems like an
> > > (arch-specific) `dma_addr_is_consistent' function is necessary, but
> > > I couldn't see one in your patch.
> > 
> > If you have a machine that has both consistent and inconsistent blocks, you 
> > need to encode that in dma_addr_t (which is a platform definable type).
> > 
> > The sync functions would just decode the type and either nop or perform the 
> > sync.
> 
> My thinking was that a driver might want to do things like --
> 
>   if (dma_addr_is_consistent (some_funky_addr)) {
>     do it quickly;
>   } else
>     do_it_the_slow_way (some_funky_addr);
> 
> in other words, something besides just calling the sync functions, in
> the case where the memory was consistent.

Yes, but using consistent memory is not necessarily the fast way - in
fact it probably won't be.  Machines which don't do DMA cache snooping
will need to disable caching to get consistent memory, so using
consistent memory is very slow - on such a machine explicit syncs are
preferable wherever possible.

On a machine which is nicely consistent, the cache "flushes" should
become NOPs, so we'd expect the two sides of that if to do the same
thing.

-- 
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.
http://www.ozlabs.org/people/dgibson

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05  3:13       ` James Bottomley
@ 2002-12-05  5:05         ` David Gibson
  2002-12-05 15:03           ` James Bottomley
  0 siblings, 1 reply; 71+ messages in thread
From: David Gibson @ 2002-12-05  5:05 UTC (permalink / raw)
  To: James Bottomley; +Cc: Adam J. Richter, linux-kernel

On Wed, Dec 04, 2002 at 09:13:33PM -0600, James Bottomley wrote:
> david@gibson.dropbear.id.au said:
> > The point is, there has to be an advantage to using consistent memory
> > if it is available AND the possibility of it not being available. 
> 
> I'm really thinking of this from the driver writer's point of view.  The 
> advantage of consistent memory is that you don't have to think about where to 
> place all the sync points (sync points can be really subtle and nasty and an 
> absolute pain---I shudder to recall all of the problems I ran into writing a 
> driver on a fully inconsistent platform).
> 
> The advantage here is that you can code the driver only to use consistent 
> memory and not bother with the sync points (whatever the cost of this is).  
> Most platforms support reasonably cheap consistent memory, so most people 
> simply don't want to bother with inconsistent memory if they can avoid it.
> 
> If you do the sync points, you can specify the
> DMA_CONFORMANCE_NON_CONSISTENT level and have the platform choose
> what type of memory you get.  For a platform which makes memory
> consistent by turning off CPU caching at the page level, it's
> probably better to return non-consistent memory if the driver can
> cope with it.

But if you have the sync points, you don't need a special allocater
for the memory at all - any old RAM will do.  So why not just use
kmalloc() to get it.

-- 
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.
http://www.ozlabs.org/people/dgibson

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-04 21:42   ` James Bottomley
@ 2002-12-05  5:44     ` Miles Bader
  0 siblings, 0 replies; 71+ messages in thread
From: Miles Bader @ 2002-12-05  5:44 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-kernel

James Bottomley <James.Bottomley@steeleye.com> writes:
> > Keep in mind that sometimes the actual _implementation_ is also highly
> > PCI-specific -- that is, what works for PCI devices may not work for
> > other devices and vice-versa.
> 
> that can all be taken care of in the platform implementation.
> 
> In general, the generic device already has enough information that the
> platform implementation can be highly bus specific---and, of course,
> once you know exactly what bus it's on, you can cast it to the bus
> device if you want.

I presume you mean something like (in an arch-specific file somewhere):

void *dma_alloc_consistent (struct device *dev, size_t size,
                            dma_addr_t *dma_handle,
        		    enum dma_conformance_level level)
{
    if (dev->SOME_FIELD == SOME_CONSTANT)
        return my_wierd_ass_pci_alloc_consistent ((struct pci_dev *)dev, ...);
    else
        return 0; /* or kmalloc(...); */
}

?

I did a bit of grovelling, but I'm still not quite sure what test I
can do (i.e., what SOME_FIELD and SOME_CONSTANT should be, if it's
really that simple).

Ah well, as long as it's possible I guess I'll figure it out when the
source hits the fan...

-Miles
-- 
I have seen the enemy, and he is us.  -- Pogo

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05  3:41       ` Jeff Garzik
@ 2002-12-05  6:04         ` David Gibson
  2002-12-05 16:29           ` Jeff Garzik
  0 siblings, 1 reply; 71+ messages in thread
From: David Gibson @ 2002-12-05  6:04 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: James Bottomley, Adam J. Richter, linux-kernel

On Wed, Dec 04, 2002 at 10:41:31PM -0500, Jeff Garzik wrote:
> On Thu, Dec 05, 2002 at 01:38:47PM +1100, David Gibson wrote:
> > It seems the "try to get consistent memory, but otherwise give me
> > inconsistent" is only useful on machines which:
> > 	(1) Are not fully consisent, BUT
> > 	(2) Can get consistent memory without disabling the cache, BUT
> > 	(3) Not very much of it, so you might run out.
> > 
> > The point is, there has to be an advantage to using consistent memory
> > if it is available AND the possibility of it not being available.
> 
> Agreed here.  Add to this
> 
> (4) quite silly from an API taste perspective.
> 
> 
> > Otherwise, drivers which absolutely need consistent memory, no matter
> > the cost, should use consistent_alloc(), all other drivers just use
> > kmalloc() (or whatever) then use the DMA flushing functions which
> > compile to NOPs on platforms with consistent memory.
> 
> Ug.  This is travelling backwards in time.
> 
> kmalloc is not intended to allocate memory for DMA'ing.  I (and others)
> didn't spend all that time converting drivers to the PCI DMA API just to
> see all that work undone.

But if there aren't any consistency constraints on the memory, why not
get it with kmalloc().  There are two approaches to handling DMA on a
not-fully-consistent machine:
	1) Allocate the memory specially so that it is consistent
	2) Use any old memory, and make sure we have explicit cache
frobbing.

We have to have both: some hardware requires approach (1), and the
structure of the kernel often requires (2) to avoid lots of copying
(e.g. a network device doesn't allocate its own skbs to transmit, so
it can't assume the memory has any special consistency properties).

Since in case (2), we can't make assumptions about where the memory
came from, it might as well come from kmalloc() (or a slab, or
get_free_pages() or whatever).

-- 
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.
http://www.ozlabs.org/people/dgibson

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05  3:17       ` Miles Bader
@ 2002-12-05  6:06         ` David Gibson
  2002-12-05  6:43           ` Miles Bader
  0 siblings, 1 reply; 71+ messages in thread
From: David Gibson @ 2002-12-05  6:06 UTC (permalink / raw)
  To: Miles Bader; +Cc: James Bottomley, Adam J. Richter, linux-kernel

On Thu, Dec 05, 2002 at 12:17:55PM +0900, Miles Bader wrote:
> David Gibson <david@gibson.dropbear.id.au> writes:
> > It seems the "try to get consistent memory, but otherwise give me
> > inconsistent" is only useful on machines which:
> > 	(1) Are not fully consisent, BUT
> > 	(2) Can get consistent memory without disabling the cache, BUT
> > 	(3) Not very much of it, so you might run out.
> > 
> > The point is, there has to be an advantage to using consistent memory
> > if it is available AND the possibility of it not being available.
> ...
> > Are there actually any machines with the properties described above?
> 
> As I mentioned in my previous message, one of my platforms is like that
> memory, which is only 2 megabytes in size.

Ok, that starts to make sense then (what platform is it,
incidentally).  Is using consistent memory actually faster than doing
the cache flushes expliticly?  Much?

-- 
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.
http://www.ozlabs.org/people/dgibson

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05  6:06         ` David Gibson
@ 2002-12-05  6:43           ` Miles Bader
  2002-12-05 23:44             ` David Gibson
  0 siblings, 1 reply; 71+ messages in thread
From: Miles Bader @ 2002-12-05  6:43 UTC (permalink / raw)
  To: David Gibson; +Cc: James Bottomley, Adam J. Richter, linux-kernel

David Gibson <david@gibson.dropbear.id.au> writes:
> > As I mentioned in my previous message, one of my platforms is like that
> > memory, which is only 2 megabytes in size.
> 
> Ok, that starts to make sense then (what platform is it,
> incidentally).  Is using consistent memory actually faster than doing
> the cache flushes expliticly?  Much?

It's an embedded evaluation board (Midas `RTE-MOTHER-A' and
`RTE-V850E-MA1-CB').

The thing is there _is_ no cache on this machine (it's very slow), so
cache-consistency is actually not an issue (and the cache-flushing
macros won't help me at all).

PCI devices are several busses removed from the CPU, and they only have
this one 2MB area in common.  So on this machine, PCI devices can _only_
use consistent memory.

When a driver uses the non-consistent interfaces, then:

  * pci_map_single allocates a `shadow area' of consistent memory and
    pci_unmap_single deallocates it

  * pci_dma_sync_... just does a memcpy to/from the `shadow' consistent
    memory from/to the drivers kalloc'd block (in principle I think this
    is incorrect, because it uses the `dir' parameter to determine the
    direction to copy, but it works in practice)

So you can see that for this platform, it would be better if drivers
could _always_ use alloc_consistent, but many don't.

Yes this is a wierd and frustrating design, but I think it does credit
to the linux PCI layer that I could get it work at all, without
modifying any drivers!  I guess my main goal in this discussion is to
ensure that remains the case...

-Miles
-- 
.Numeric stability is probably not all that important when you're guessing.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05  0:47 ` David Gibson
  2002-12-05  0:54   ` Jeff Garzik
  2002-12-05  1:44   ` James Bottomley
@ 2002-12-05 11:08   ` Benjamin Herrenschmidt
  2002-12-05 11:35     ` Russell King
  2002-12-06  0:01     ` David Gibson
  2 siblings, 2 replies; 71+ messages in thread
From: Benjamin Herrenschmidt @ 2002-12-05 11:08 UTC (permalink / raw)
  To: David Gibson; +Cc: James Bottomley, linux-kernel

On Thu, 2002-12-05 at 01:47, David Gibson wrote:
> Do you have an example of where the second option is useful?  Off hand
> the only places I can think of where you'd use a consistent_alloc()
> rather than map_single() and friends is in cases where the hardware's
> behaviour means you absolutely positively have to have consistent
> memory.

Looking at our implementation (ppc32 on non-coherent CPUs like 405) of
pci_map_single, which just flushes the cache, I still feel we need a
consistent_alloc, that is an implementation that _disables_ caching for
the area.

A typical example is an USB OHCI driver. You really don't want to play
cache tricks with the shared area here. That will happen each time you
have a shared area in memory in which both the CPU and the device may
read/write in the same cache line.

For things like ring descriptors of a net driver, I feel it's very much
simpler (and possibly more efficient too) to also allocate non-cacheable
space for consistent instead of continuously flushing/invalidating.
Actually, flush/invalidate here can also have nasty side effects if
several descriptors fit in the same cache line.

The data buffers, of course (skbuffs typically) would preferably use
pci_map_* like APIs (hrm... did we ever make sure skbuffs would _not_
mix the data buffer with control datas in the same cache line ? This
have been a problem with non-coherent CPUs in the past).

Ben.


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-04 21:46   ` James Bottomley
  2002-12-05  2:31     ` Miles Bader
@ 2002-12-05 11:15     ` Benjamin Herrenschmidt
  2002-12-05 11:16       ` William Lee Irwin III
  2002-12-05 15:12       ` James Bottomley
  1 sibling, 2 replies; 71+ messages in thread
From: Benjamin Herrenschmidt @ 2002-12-05 11:15 UTC (permalink / raw)
  To: James Bottomley; +Cc: Miles Bader, linux-kernel

On Wed, 2002-12-04 at 22:46, James Bottomley wrote:
> miles@gnu.org said:
> > How is the driver supposed to tell whether a given dma_addr_t value
> > represents consistent memory or not?  It seems like an (arch-specific)
> > `dma_addr_is_consistent' function is necessary, but I couldn't see one
> > in your patch. 
> 
> well, the patch was only for x86, which is fully consistent.  For parisc, that 
> becomes a field for the dma accessor functions.
> 
> However, even on parisc, the (supported) machines are either entirely 
> consistent or entirely inconsistent.
> 
> If you have a machine that has both consistent and inconsistent blocks, you 
> need to encode that in dma_addr_t (which is a platform definable type).

I don't agree here. Encoding things in dma_addr_t, then special casing
in consistent_{map,unmap,sync,....) looks really ugly to me ! You want
dma_addr_t to contain a bus address for the given bus you are working
with and pass that to your device, period.

Consistency of memory (or simply, in some cases, accessibility of system
memory by a given device) is really a property of the bus. Tweaking
magic bits in dma_addr_t and testing them later is a hack. The proper
implementation is to have the consistent_{alloc,free,map,unmap,sync,...)
functions be function pointers in the generic bus structure.

Actually, the device model defines a bus "type" structure rather than a
"bus instance" structure (well, at least it did last I looked a couple
of weeks ago). That's a problem I beleive here, as those functions are
really a property of a given bus instance. One solution would eventually
be to have the set of functions pointers in the generic struct device
and by default be copied from parent to child.

Actually, to avoid bloat, I think a single pointer to a struct
containing the whole set of consistent functions is enough though, as
those will typically be statically defined.

Ben.


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05 11:15     ` Benjamin Herrenschmidt
@ 2002-12-05 11:16       ` William Lee Irwin III
  2002-12-05 15:12       ` James Bottomley
  1 sibling, 0 replies; 71+ messages in thread
From: William Lee Irwin III @ 2002-12-05 11:16 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: James Bottomley, Miles Bader, linux-kernel

On Thu, Dec 05, 2002 at 12:15:30PM +0100, Benjamin Herrenschmidt wrote:
> Actually, the device model defines a bus "type" structure rather than a
> "bus instance" structure (well, at least it did last I looked a couple
> of weeks ago). That's a problem I beleive here, as those functions are
> really a property of a given bus instance. One solution would eventually
> be to have the set of functions pointers in the generic struct device
> and by default be copied from parent to child.

On an unrelated note, a "bus instance" structure would seem to be
required for proper handling of bridges in combination with PCI segments.


Bill

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05 11:08   ` Benjamin Herrenschmidt
@ 2002-12-05 11:35     ` Russell King
  2002-12-05 15:24       ` James Bottomley
  2002-12-06  0:01     ` David Gibson
  1 sibling, 1 reply; 71+ messages in thread
From: Russell King @ 2002-12-05 11:35 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: David Gibson, James Bottomley, linux-kernel

On Thu, Dec 05, 2002 at 12:08:16PM +0100, Benjamin Herrenschmidt wrote:
> For things like ring descriptors of a net driver, I feel it's very much
> simpler (and possibly more efficient too) to also allocate non-cacheable
> space for consistent instead of continuously flushing/invalidating.
> Actually, flush/invalidate here can also have nasty side effects if
> several descriptors fit in the same cache line.

Indeed.  Think about a 16-byte descriptor in a 32-byte cache line.
The net chip has written status information to the first word, you've
just written to the 4th word of that cache line.

To access the status word written by the chip, you need to invalidate
(without writeback) that cache line.  For the chip to access the word
you've just written, you need to writeback that cache line.

In other words, you _will_ loose information in this case, guaranteed.
I'd rather keep our existing pci_* API than be forced into this crap
again.

-- 
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05  5:05         ` David Gibson
@ 2002-12-05 15:03           ` James Bottomley
  2002-12-05 23:54             ` David Gibson
  0 siblings, 1 reply; 71+ messages in thread
From: James Bottomley @ 2002-12-05 15:03 UTC (permalink / raw)
  To: David Gibson, James Bottomley, Adam J. Richter, linux-kernel

david@gibson.dropbear.id.au said:
> But if you have the sync points, you don't need a special allocater
> for the memory at all - any old RAM will do.  So why not just use
> kmalloc() to get it. 

Because with kmalloc, you have to be aware of platform implementations.  Most 
notably that cache flush/invalidate instructions only operate at the level of 
certain block of memory (called the cache line width).  If kmalloc returns 
less than a cache line width you have the potential for severe cockups because 
of the possibility of interfering cache operations on adjacent kmalloc regions 
that share the same cache line.

the dma_alloc... function guarantees to avoid this for you by passing the 
allocation to the platform layer which knows the cache characteristics and 
requirements for the machine (and dma controller) you're using.

James



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05 11:15     ` Benjamin Herrenschmidt
  2002-12-05 11:16       ` William Lee Irwin III
@ 2002-12-05 15:12       ` James Bottomley
  1 sibling, 0 replies; 71+ messages in thread
From: James Bottomley @ 2002-12-05 15:12 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: James Bottomley, Miles Bader, linux-kernel

benh@kernel.crashing.org said:
> actually, the device model defines a bus "type" structure rather than a
> "bus instance" structure (well, at least it did last I looked a couple
> of weeks ago). That's a problem I beleive here, as those functions are
> really a property of a given bus instance. One solution would
> eventually be to have the set of functions pointers in the generic
> struct device and by default be copied from parent to child. 

Well, the bus in the generic device model is just a struct device as well.

The parisc implementation I'm working on stores this type of conversion 
information in the platform_data field of the generic device (although the 
function pointers that make use of it are global).

I did do an implementation which added a dma_accessors set of functions to the 
struct device (and also struct bus_type), but I eventually concluded they had 
to be so platform specific that there was no utility to exposing them.

> Consistency of memory (or simply, in some cases, accessibility of
> system memory by a given device) is really a property of the bus.
> Tweaking magic bits in dma_addr_t and testing them later is a hack.
> The proper implementation is to have the consistent_{alloc,free,map,unm
> ap,sync,...) functions be function pointers in the generic bus
> structure. 

actually, in parisc, the implementation is simple.  The type of memory is 
determined globally per architecture (so it's not encoded in the dma_addr_t).  
As Adam said.  The preferred platform implementation for a machine that did 
both (I believe this is the fabled parisc V class, which I've never seen) 
would be to implement a consistent region so you could tell if dma_addr_t fell 
in that region for whether it was consistent or not.

I fully recognise that dma_addr_t actually has to be freely convertable to the 
physical address by simple casting, because that's the way the pci_ functions 
use it.

James



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05 11:35     ` Russell King
@ 2002-12-05 15:24       ` James Bottomley
  0 siblings, 0 replies; 71+ messages in thread
From: James Bottomley @ 2002-12-05 15:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, David Gibson, James Bottomley, linux-kernel

rmk@arm.linux.org.uk said:
> I'd rather keep our existing pci_* API than be forced into this crap
> again. 

Let me just clarify: I'm not planning to revoke the pci_* API, or to deviate 
substantially from it.  I'm not even planning to force any arch's to use it if 
they don't want to.  I'm actually thinking of putting something like this in 
the asm-generic implementations:

dma_*(struct device *dev, ...) {
	BUG_ON(dev->bus != &pci_bus_type)
	pci_*(to_pci_device(dev), ..)
}

The whole point is not to force another massive shift in the way drivers are 
written, but to provide a generic device based API for those who need it.  
There are very few drivers that actually have to allocate fake PCI devices 
today, but this API is aimed squarely at helping them.  Drivers that only ever 
see real PCI devices won't need touching.

James



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05  6:04         ` David Gibson
@ 2002-12-05 16:29           ` Jeff Garzik
  2002-12-05 23:59             ` David Gibson
  0 siblings, 1 reply; 71+ messages in thread
From: Jeff Garzik @ 2002-12-05 16:29 UTC (permalink / raw)
  To: David Gibson; +Cc: James Bottomley, Adam J. Richter, linux-kernel

David Gibson wrote:
> On Wed, Dec 04, 2002 at 10:41:31PM -0500, Jeff Garzik wrote:
> 
>>On Thu, Dec 05, 2002 at 01:38:47PM +1100, David Gibson wrote:
>>
>>>It seems the "try to get consistent memory, but otherwise give me
>>>inconsistent" is only useful on machines which:
>>>	(1) Are not fully consisent, BUT
>>>	(2) Can get consistent memory without disabling the cache, BUT
>>>	(3) Not very much of it, so you might run out.
>>>
>>>The point is, there has to be an advantage to using consistent memory
>>>if it is available AND the possibility of it not being available.
>>
>>Agreed here.  Add to this
>>
>>(4) quite silly from an API taste perspective.
>>
>>
>>
>>>Otherwise, drivers which absolutely need consistent memory, no matter
>>>the cost, should use consistent_alloc(), all other drivers just use
>>>kmalloc() (or whatever) then use the DMA flushing functions which
>>>compile to NOPs on platforms with consistent memory.
>>
>>Ug.  This is travelling backwards in time.
>>
>>kmalloc is not intended to allocate memory for DMA'ing.  I (and others)
>>didn't spend all that time converting drivers to the PCI DMA API just to
>>see all that work undone.
> 
> 
> But if there aren't any consistency constraints on the memory, why not
> get it with kmalloc().  There are two approaches to handling DMA on a
> not-fully-consistent machine:
> 	1) Allocate the memory specially so that it is consistent
> 	2) Use any old memory, and make sure we have explicit cache
> frobbing.

For me it's an API issue.  kmalloc does not return DMA'able memory.

If "your way" is acceptable to most, then at the very least I would want

	#define get_any_old_dmaable_memory kmalloc



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05  6:43           ` Miles Bader
@ 2002-12-05 23:44             ` David Gibson
  2002-12-06  2:23               ` Miles Bader
  0 siblings, 1 reply; 71+ messages in thread
From: David Gibson @ 2002-12-05 23:44 UTC (permalink / raw)
  To: Miles Bader; +Cc: James Bottomley, Adam J. Richter, linux-kernel

On Thu, Dec 05, 2002 at 03:43:58PM +0900, Miles Bader wrote:
> David Gibson <david@gibson.dropbear.id.au> writes:
> > > As I mentioned in my previous message, one of my platforms is like that
> > > memory, which is only 2 megabytes in size.
> > 
> > Ok, that starts to make sense then (what platform is it,
> > incidentally).  Is using consistent memory actually faster than doing
> > the cache flushes expliticly?  Much?
> 
> It's an embedded evaluation board (Midas `RTE-MOTHER-A' and
> `RTE-V850E-MA1-CB').
> 
> The thing is there _is_ no cache on this machine (it's very slow), so
> cache-consistency is actually not an issue (and the cache-flushing
> macros won't help me at all).
> 
> PCI devices are several busses removed from the CPU, and they only have
> this one 2MB area in common.  So on this machine, PCI devices can _only_
> use consistent memory.
> 
> When a driver uses the non-consistent interfaces, then:
> 
>   * pci_map_single allocates a `shadow area' of consistent memory and
>     pci_unmap_single deallocates it

That's a little misleading: all your memory is consistent, the point
is that it is a shadow area of PCI-mappable memory.

>   * pci_dma_sync_... just does a memcpy to/from the `shadow' consistent
>     memory from/to the drivers kalloc'd block (in principle I think this
>     is incorrect, because it uses the `dir' parameter to determine the
>     direction to copy, but it works in practice)
> 
> So you can see that for this platform, it would be better if drivers
> could _always_ use alloc_consistent, but many don't.

Ah, ok, now I understand.  The issue here is actually nothing to do
with consistency, since your platform is fully consistent.  The issue
is that there are other constraints for DMAable memory and you want
drivers to be able to easily mallocate with those constraints.

Actually, it occurs to me that PC ISA DMA is in a similar situation -
there is a constraint on DMAable memory (sufficiently low physical
address) which has nothing to do with consistency.

> Yes this is a wierd and frustrating design, but I think it does credit
> to the linux PCI layer that I could get it work at all, without
> modifying any drivers!  I guess my main goal in this discussion is to
> ensure that remains the case...

Ok... see also my reply to one of James's posts.

-- 
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.
http://www.ozlabs.org/people/dgibson

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05 15:03           ` James Bottomley
@ 2002-12-05 23:54             ` David Gibson
  0 siblings, 0 replies; 71+ messages in thread
From: David Gibson @ 2002-12-05 23:54 UTC (permalink / raw)
  To: James Bottomley; +Cc: Adam J. Richter, linux-kernel

On Thu, Dec 05, 2002 at 09:03:24AM -0600, James Bottomley wrote:
> david@gibson.dropbear.id.au said:
> > But if you have the sync points, you don't need a special allocater
> > for the memory at all - any old RAM will do.  So why not just use
> > kmalloc() to get it. 
> 
> Because with kmalloc, you have to be aware of platform
> implementations.  Most notably that cache flush/invalidate
> instructions only operate at the level of certain block of memory
> (called the cache line width).  If kmalloc returns less than a cache
> line width you have the potential for severe cockups because of the
> possibility of interfering cache operations on adjacent kmalloc
> regions that share the same cache line.

Having debugged a stack corruption problem when attempting to use USB
on a PPC 4xx machine, which was due to improperly aligned DMA buffers,
I am well aware of this issue.

> the dma_alloc... function guarantees to avoid this for you by passing the 
> allocation to the platform layer which knows the cache characteristics and 
> requirements for the machine (and dma controller) you're using.

Ok - now I begin to see the point of this: I was being misled by the
emphasis on a preference for consistent allocation and the original
"alloc_consistent" name you suggested.  When consistent memory isn't
strictly required it's as likely as not that it won't be preferred
either.

Given this, and Miles example, I can see the point of a DMA mallocater
that applies DMA constraints that are not to do with consistency.
Then consistency could also be specified, but that's a separate issue.

So, to remove the misleading emphasis on the point of the allocated
being consistent memory (your name change was a start, this goes
further), I'd prefer to see something like:

void *dma_malloc(struct device *bus, unsigned long size, int flags,
		 dma_addr_t *dma_addr);

Which returns virtual and DMA pointers for a chunk of memory
satisfying any DMA conditions for the specified bus.  Then if flags
includes DMA_CONSISTENT (or some such) the memory will be allocated
consistent in addition to those constraints.

If DMA_CONSISTENT is not specified, the memory might be consistent,
and there would be a preference for consistent only on platforms where
consistent memory is actually preferable (I haven't yet heard of one).

-- 
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.
http://www.ozlabs.org/people/dgibson

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05 16:29           ` Jeff Garzik
@ 2002-12-05 23:59             ` David Gibson
  0 siblings, 0 replies; 71+ messages in thread
From: David Gibson @ 2002-12-05 23:59 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: James Bottomley, Adam J. Richter, linux-kernel

On Thu, Dec 05, 2002 at 11:29:45AM -0500, Jeff Garzik wrote:
> David Gibson wrote:
> >On Wed, Dec 04, 2002 at 10:41:31PM -0500, Jeff Garzik wrote:
> >
> >>On Thu, Dec 05, 2002 at 01:38:47PM +1100, David Gibson wrote:
> >>
> >>>It seems the "try to get consistent memory, but otherwise give me
> >>>inconsistent" is only useful on machines which:
> >>>	(1) Are not fully consisent, BUT
> >>>	(2) Can get consistent memory without disabling the cache, BUT
> >>>	(3) Not very much of it, so you might run out.
> >>>
> >>>The point is, there has to be an advantage to using consistent memory
> >>>if it is available AND the possibility of it not being available.
> >>
> >>Agreed here.  Add to this
> >>
> >>(4) quite silly from an API taste perspective.
> >>
> >>
> >>
> >>>Otherwise, drivers which absolutely need consistent memory, no matter
> >>>the cost, should use consistent_alloc(), all other drivers just use
> >>>kmalloc() (or whatever) then use the DMA flushing functions which
> >>>compile to NOPs on platforms with consistent memory.
> >>
> >>Ug.  This is travelling backwards in time.
> >>
> >>kmalloc is not intended to allocate memory for DMA'ing.  I (and others)
> >>didn't spend all that time converting drivers to the PCI DMA API just to
> >>see all that work undone.
> >
> >
> >But if there aren't any consistency constraints on the memory, why not
> >get it with kmalloc().  There are two approaches to handling DMA on a
> >not-fully-consistent machine:
> >	1) Allocate the memory specially so that it is consistent
> >	2) Use any old memory, and make sure we have explicit cache
> >frobbing.
> 
> For me it's an API issue.  kmalloc does not return DMA'able memory.

Ok - see my reply to James's post.  I see the point of this given that
there are constraints on DMAable memory which are not related to
consistency (e.g. particular address ranges and cacheline alignment).
A mallocater which can satisfy these constraints makes sense to me.

I just think it's a mistake to associate these constraints with cache
consistency - they are not related.  James's original patch does make
this separation in practice, but it misleading suggests a link - which
is what confused me.

> 
> If "your way" is acceptable to most, then at the very least I would want
> 
> 	#define get_any_old_dmaable_memory kmalloc

I imagine platforms where any address is DMAable and which are fully
consistent would do this.

-- 
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.
http://www.ozlabs.org/people/dgibson

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05 11:08   ` Benjamin Herrenschmidt
  2002-12-05 11:35     ` Russell King
@ 2002-12-06  0:01     ` David Gibson
  1 sibling, 0 replies; 71+ messages in thread
From: David Gibson @ 2002-12-06  0:01 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: James Bottomley, linux-kernel

On Thu, Dec 05, 2002 at 12:08:16PM +0100, Benjamin Herrenschmidt wrote:
> On Thu, 2002-12-05 at 01:47, David Gibson wrote:
> > Do you have an example of where the second option is useful?  Off hand
> > the only places I can think of where you'd use a consistent_alloc()
> > rather than map_single() and friends is in cases where the hardware's
> > behaviour means you absolutely positively have to have consistent
> > memory.
> 
> Looking at our implementation (ppc32 on non-coherent CPUs like 405) of
> pci_map_single, which just flushes the cache, I still feel we need a
> consistent_alloc, that is an implementation that _disables_ caching for
> the area.

No question there: that's James's first option.  

> A typical example is an USB OHCI driver. You really don't want to play
> cache tricks with the shared area here. That will happen each time you
> have a shared area in memory in which both the CPU and the device may
> read/write in the same cache line.
> 
> For things like ring descriptors of a net driver, I feel it's very much
> simpler (and possibly more efficient too) to also allocate non-cacheable
> space for consistent instead of continuously flushing/invalidating.
> Actually, flush/invalidate here can also have nasty side effects if
> several descriptors fit in the same cache line.
> 
> The data buffers, of course (skbuffs typically) would preferably use
> pci_map_* like APIs (hrm... did we ever make sure skbuffs would _not_
> mix the data buffer with control datas in the same cache line ? This
> have been a problem with non-coherent CPUs in the past).

Indeed - the 405GP ethernet driver, which I've worked on, uses exactly
this approach.  consistent_alloc() is used for the descriptor ring
buffer, and DMA syncs are used for the data buffers.

-- 
David Gibson			| For every complex problem there is a
david@gibson.dropbear.id.au	| solution which is simple, neat and
				| wrong.
http://www.ozlabs.org/people/dgibson

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFC] generic device DMA implementation
  2002-12-05 23:44             ` David Gibson
@ 2002-12-06  2:23               ` Miles Bader
  0 siblings, 0 replies; 71+ messages in thread
From: Miles Bader @ 2002-12-06  2:23 UTC (permalink / raw)
  To: David Gibson; +Cc: James Bottomley, Adam J. Richter, linux-kernel

David Gibson <david@gibson.dropbear.id.au> writes:
> >   * pci_map_single allocates a `shadow area' of consistent memory and
> >     pci_unmap_single deallocates it
> 
> That's a little misleading: all your memory is consistent, the point
> is that it is a shadow area of PCI-mappable memory.

Well I suppose that's true if you're using the term in a cache-related
sense (which I gather is the convention).

OTOH, if you think about it from the view point of the PCI framework,
the terminology actually does make sense even in this odd case --
`consistent' memory is indeed consistent (both CPU and device see a
single image), but other memory used to communicate with the driver is
`inconsistent' (CPU and device see different things until a sync
operation is done).

> The issue is that there are other constraints for DMAable memory and
> you want drivers to be able to easily mallocate with those
> constraints.
> 
> Actually, it occurs to me that PC ISA DMA is in a similar situation -
> there is a constraint on DMAable memory (sufficiently low physical
> address) which has nothing to do with consistency.

Indeed.  What I'm doing is basically bounce-buffers.

-Miles
-- 
`The suburb is an obsolete and contradictory form of human settlement'

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [RFT][PATCH] generic device DMA implementation
@ 2002-12-18  3:01 James Bottomley
  2002-12-18  3:13 ` David Mosberger
  2002-12-28 18:14 ` Russell King
  0 siblings, 2 replies; 71+ messages in thread
From: James Bottomley @ 2002-12-18  3:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: James.Bottomley

[-- Attachment #1: Type: text/plain, Size: 463 bytes --]

The attached should represent close to final form for the generic DMA API.  It 
includes documentation (surprise!) and and implementation in terms of the pci_ 
API for every arch (apart from parisc, which will be submitted later).

I've folded in the feedback from the previous thread.  Hopefully, this should 
be ready for inclusion.  If people could test it on x86 and other 
architectures, I'd be grateful.

comments and feedback from testing welcome.

James


[-- Attachment #2: tmp.diff --]
[-- Type: text/plain , Size: 39112 bytes --]

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.859   -> 1.861  
#	arch/i386/kernel/pci-dma.c	1.8     -> 1.10   
#	   drivers/pci/pci.c	1.51    -> 1.52   
#	include/asm-i386/pci.h	1.17    -> 1.18   
#	 include/linux/pci.h	1.55    -> 1.56   
#	Documentation/DMA-mapping.txt	1.13    -> 1.14   
#	arch/i386/kernel/i386_ksyms.c	1.40    -> 1.41   
#	               (new)	        -> 1.1     include/asm-s390x/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-arm/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-sparc/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-cris/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-sh/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-ppc64/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-m68knommu/dma-mapping.h
#	               (new)	        -> 1.1     Documentation/DMA-API.txt
#	               (new)	        -> 1.1     include/asm-um/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-ia64/dma-mapping.h
#	               (new)	        -> 1.3     include/asm-generic/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-alpha/dma-mapping.h
#	               (new)	        -> 1.2     include/linux/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-ppc/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-s390/dma-mapping.h
#	               (new)	        -> 1.5     include/asm-i386/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-sparc64/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-m68k/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-mips/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-x86_64/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-v850/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-mips64/dma-mapping.h
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 02/12/09	jejb@mulgrave.(none)	1.860
# Merge ssh://raven/BK/dma-generic-device-2.5.50
# into mulgrave.(none):/home/jejb/BK/dma-generic-device-2.5
# --------------------------------------------
# 02/12/16	jejb@mulgrave.(none)	1.861
# Documentation complete
# --------------------------------------------
#
diff -Nru a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/Documentation/DMA-API.txt	Tue Dec 17 20:49:32 2002
@@ -0,0 +1,325 @@
+               Dynamic DMA mapping using the generic device
+               ============================================
+
+        James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
+
+This document describes the DMA API.  For a more gentle introduction
+phrased in terms of the pci_ equivalents (and actual examples) see
+DMA-mapping.txt
+
+This API is split into two pieces.  Part I describes the API and the
+corresponding pci_ API.  Part II describes the extensions to the API
+for supporting non-consistent memory machines.  Unless you know that
+your driver absolutely has to support non-consistent platforms (this
+is usually only legacy platforms) you should only use the API
+described in part I.
+
+Part I - pci_ and dma_ Equivalent API 
+-------------------------------------
+
+To get the pci_ API, you must #include <linux/pci.h>
+To get the dma_ API, you must #include <linux/dma-mapping.h>
+
+void *
+dma_alloc_consistent(struct device *dev, size_t size,
+			     dma_addr_t *dma_handle)
+void *
+pci_alloc_consistent(struct pci_dev *dev, size_t size,
+			     dma_addr_t *dma_handle)
+
+Consistent memory is memory for which a write by either the device or
+the processor can immediately be read by the processor or device
+without having to worry about caching effects.
+
+This routine allocates a region of <size> bytes of consistent memory.
+it also returns a <dma_handle> which may be cast to an unsigned
+integer the same width as the bus and used as the physical address
+base of the region.
+
+Returns: a pointer to the allocated region (in the processor's virtual
+address space) or NULL if the allocation failed.
+
+Note: consistent memory can be expensive on some platforms, and the
+minimum allocation length may be as big as a page, so you should
+consolidate your requests for consistent memory as much as possible.
+
+void
+dma_free_consistent(struct device *dev, size_t size, void *cpu_addr
+			   dma_addr_t dma_handle)
+void
+pci_free_consistent(struct pci_dev *dev, size_t size, void *cpu_addr
+			   dma_addr_t dma_handle)
+
+Free the region of consistent memory you previously allocated.  dev,
+size and dma_handle must all be the same as those passed into the
+consistent allocate.  cpu_addr must be the virtual address returned by
+the consistent allocate
+
+int
+dma_supported(struct device *dev, u64 mask)
+int
+pci_dma_supported(struct device *dev, u64 mask)
+
+Checks to see if the device can support DMA to the memory described by
+mask.
+
+Returns: 1 if it can and 0 if it can't.
+
+Notes: This routine merely tests to see if the mask is possible.  It
+won't change the current mask settings.  It is more intended as an
+internal API for use by the platform than an external API for use by
+driver writers.
+
+int
+dma_set_mask(struct device *dev, u64 mask)
+int
+pci_dma_set_mask(struct pci_device *dev, u64 mask)
+
+Checks to see if the mask is possible and updates the device
+parameters if it is.
+
+Returns: 1 if successful and 0 if not
+
+dma_addr_t
+dma_map_single(struct device *dev, void *cpu_addr, size_t size,
+		      enum dma_data_direction direction)
+dma_addr_t
+pci_map_single(struct device *dev, void *cpu_addr, size_t size,
+		      int direction)
+
+Maps a piece of processor virtual memory so it can be accessed by the
+device and returns the physical handle of the memory.
+
+The direction for both api's may be converted freely by casting.
+However the dma_ API uses a strongly typed enumerator for its
+direction:
+
+DMA_NONE		= PCI_DMA_NONE		no direction (used for
+						debugging)
+DMA_TO_DEVICE		= PCI_DMA_TODEVICE	data is going from the
+						memory to the device
+DMA_FROM_DEVICE		= PCI_DMA_FROMDEVICE	data is coming from
+						the device to the
+						memory
+DMA_BIDIRECTIONAL	= PCI_DMA_BIDIRECTIONAL	direction isn't known
+
+Notes:  Not all memory regions in a machine can be mapped by this
+API.  Further, regions that appear to be physically contiguous in
+kernel virtual space may not be contiguous as physical memory.  Since
+this API does not provide any scatter/gather capability, it will fail
+if the user tries to map a non physically contiguous piece of memory.
+For this reason, it is recommended that memory mapped by this API be
+obtained only from sources which guarantee to be physically contiguous
+(like kmalloc).
+
+Further, the physical address of the memory must be within the
+dma_mask of the device (the dma_mask represents a bit mask of the
+addressable region for the device.  i.e. if the physical address of
+the memory anded with the dma_mask is still equal to the physical
+address, then the device can perform DMA to the memory).  In order to
+ensure that the memory allocated by kmalloc is within the dma_mask,
+the driver may specify various platform dependent flags to restrict
+the physical memory range of the allocation (e.g. on x86, GFP_DMA
+guarantees to be within the first 16Mb of available physical memory,
+as required by ISA devices).
+
+Note also that the above constraints on physical contiguity and
+dma_mask may not apply if the platform has an IOMMU (a device which
+supplies a physical to virtual mapping between the I/O memory bus and
+the device).  However, to be portable, device driver writers may *not*
+assume that such an IOMMU exists.
+
+Warnings:  Memory coherency operates at a granularity called the cache
+line width.  In order for memory mapped by this API to operate
+correctly, the mapped region must begin exactly on a cache line
+boundary and end exactly on one (to prevent two separately mapped
+regions from sharing a single cache line).  Since the cache line size
+may not be known at compile time, the API will not enforce this
+requirement.  Therefore, it is recommended that driver writers who
+don't take special care to determine the cache line size at run time
+only map virtual regions that begin and end on page boundaries (which
+are guaranteed also to be cache line boundaries).
+
+DMA_TO_DEVICE synchronisation must be done after the last modification
+of the memory region by the software and before it is handed off to
+the driver.  Once this primitive is used.  Memory covered by this
+primitive should be treated as read only by the device.  If the device
+may write to it at any point, it should be DMA_BIDIRECTIONAL (see
+below).
+
+DMA_FROM_DEVICE synchronisation must be done before the driver
+accesses data that may be changed by the device.  This memory should
+be treated as read only by the driver.  If the driver needs to write
+to it at any point, it should be DMA_BIDIRECTIONAL (see below).
+
+DMA_BIDIRECTIONAL requires special handling: it means that the driver
+isn't sure if the memory was modified before being handed off to the
+device and also isn't sure if the device will also modify it.  Thus,
+you must always sync bidirectional memory twice: once before the
+memory is handed off to the device (to make sure all memory changes
+are flushed from the processor) and once before the data may be
+accessed after being used by the device (to make sure any processor
+cache lines are updated with data that the device may have changed.
+
+void
+dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
+		 enum dma_data_direction direction)
+void
+pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr,
+		 size_t size, int direction)
+
+Unmaps the region previously mapped.  All the parameters passed in
+must be identical to those passed in (and returned) by the mapping
+API.
+
+dma_addr_t
+dma_map_page(struct device *dev, struct page *page,
+		    unsigned long offset, size_t size,
+		    enum dma_data_direction direction)
+dma_addr_t
+pci_map_page(struct pci_dev *hwdev, struct page *page,
+		    unsigned long offset, size_t size, int direction)
+void
+dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
+	       enum dma_data_direction direction)
+void
+pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
+	       size_t size, int direction)
+
+API for mapping and unmapping for pages.  All the notes and warnings
+for the other mapping APIs apply here.  Also, although the <offset>
+and <size> parameters are provided to do partial page mapping, it is
+recommended that you never use these unless you really know what the
+cache width is.
+
+int
+dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
+	   enum dma_data_direction direction)
+int
+pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+	   int nents, int direction)
+
+Maps a scatter gather list from the block layer.
+
+Returns: the number of physical segments mapped (this may be shorted
+than <nents> passed in if the block layer determines that some
+elements of the scatter/gather list are physically adjacent and thus
+may be mapped with a single entry).
+
+void
+dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries,
+	     enum dma_data_direction direction)
+void
+pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+	     int nents, int direction)
+
+unmap the previously mapped scatter/gather list.  All the parameters
+must be the same as those and passed in to the scatter/gather mapping
+API.
+
+Note: <nents> must be the number you passed in, *not* the number of
+physical entries returned.
+
+void
+dma_sync_single(struct device *dev, dma_addr_t dma_handle, size_t size,
+		enum dma_data_direction direction)
+void
+pci_dma_sync_single(struct pci_dev *hwdev, dma_addr_t dma_handle,
+			   size_t size, int direction)
+void
+dma_sync_sg(struct device *dev, struct scatterlist *sg, int nelems,
+			  enum dma_data_direction direction)
+void
+pci_dma_sync_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+		       int nelems, int direction)
+
+synchronise a single contiguous or scatter/gather mapping.  All the
+parameters must be the same as those passed into the single mapping
+API.
+
+Notes:  You must do this:
+
+- Before reading values that have been written by DMA from the device
+  (use the DMA_FROM_DEVICE direction)
+- After writing values that will be written to the device using DMA
+  (use the DMA_TO_DEVICE) direction
+- before *and* after handing memory to the device if the memory is
+  DMA_BIDIRECTIONAL
+
+See also dma_map_single().
+
+Part II - Advanced dma_ usage
+-----------------------------
+
+Warning: These pieces of the DMA API have no PCI equivalent.  They
+should also not be used in the majority of cases, since they cater for
+unlikely corner cases that don't belong in usual drivers.
+
+If you don't understand how cache line coherency works between a
+processor and an I/O device, you should not be using this part of the
+API at all.
+
+void *
+dma_alloc_nonconsistent(struct device *dev, size_t size,
+			       dma_addr_t *dma_handle)
+
+Identical to dma_alloc_consistent() except that the platform will
+choose to return either consistent or non-consistent memory as it sees
+fit.  By using this API, you are guaranteeing to the platform that you
+have all the correct and necessary sync points for this memory in the
+driver should it choose to return non-consistent memory.
+
+Note: where the platform can return consistent memory, it will
+guarantee that the sync points become nops.
+
+Warning:  Handling non-consistent memory is a real pain.  You should
+only ever use this API if you positively know your driver will be
+required to work on one of the rare (usually non-PCI) architectures
+that simply cannot make consistent memory.
+
+void
+dma_free_nonconsistent(struct device *dev, size_t size, void *cpu_addr,
+			      dma_addr_t dma_handle)
+
+free memory allocated by the nonconsistent API.  All parameters must
+be identical to those passed in (and returned by
+dma_alloc_nonconsistent()).
+
+int
+dma_is_consistent(dma_addr_t dma_handle)
+
+returns true if the memory pointed to by the dma_handle is actually
+consistent.
+
+int
+dma_get_cache_alignment(void)
+
+returns the processor cache alignment.  This is the absolute minimum
+alignment *and* width that you must observe when either mapping
+memory or doing partial flushes.
+
+Notes: This API may return a number *larger* than the actual cache
+line, but it will guarantee that one or more cache lines fit exactly
+into the width returned by this call.  It will also always be a power
+of two for easy alignment
+
+void
+dma_sync_single_range(struct device *dev, dma_addr_t dma_handle,
+		      unsigned long offset, size_t size,
+		      enum dma_data_direction direction)
+
+does a partial sync.  starting at offset and continuing for size.  You
+must be careful to observe the cache alignment and width when doing
+anything like this.  You must also be extra careful about accessing
+memory you intend to sync partially.
+
+void
+dma_cache_sync(void *vaddr, size_t size,
+	       enum dma_data_direction direction)
+
+Do a partial sync of memory that was allocated by
+dma_alloc_nonconsistent(), starting at virtual address vaddr and
+continuing on for size.  Again, you *must* observe the cache line
+boundaries when doing this.
+
+
diff -Nru a/Documentation/DMA-mapping.txt b/Documentation/DMA-mapping.txt
--- a/Documentation/DMA-mapping.txt	Tue Dec 17 20:49:32 2002
+++ b/Documentation/DMA-mapping.txt	Tue Dec 17 20:49:32 2002
@@ -5,6 +5,10 @@
 		 Richard Henderson <rth@cygnus.com>
 		  Jakub Jelinek <jakub@redhat.com>
 
+This document describes the DMA mapping system in terms of the pci_
+API.  For a similar API that works for generic devices, see
+DMA-API.txt.
+
 Most of the 64bit platforms have special hardware that translates bus
 addresses (DMA addresses) into physical addresses.  This is similar to
 how page tables and/or a TLB translates virtual addresses to physical
diff -Nru a/arch/i386/kernel/i386_ksyms.c b/arch/i386/kernel/i386_ksyms.c
--- a/arch/i386/kernel/i386_ksyms.c	Tue Dec 17 20:49:32 2002
+++ b/arch/i386/kernel/i386_ksyms.c	Tue Dec 17 20:49:32 2002
@@ -124,8 +124,8 @@
 EXPORT_SYMBOL(__copy_to_user);
 EXPORT_SYMBOL(strnlen_user);
 
-EXPORT_SYMBOL(pci_alloc_consistent);
-EXPORT_SYMBOL(pci_free_consistent);
+EXPORT_SYMBOL(dma_alloc_consistent);
+EXPORT_SYMBOL(dma_free_consistent);
 
 #ifdef CONFIG_PCI
 EXPORT_SYMBOL(pcibios_penalize_isa_irq);
diff -Nru a/arch/i386/kernel/pci-dma.c b/arch/i386/kernel/pci-dma.c
--- a/arch/i386/kernel/pci-dma.c	Tue Dec 17 20:49:32 2002
+++ b/arch/i386/kernel/pci-dma.c	Tue Dec 17 20:49:32 2002
@@ -13,13 +13,13 @@
 #include <linux/pci.h>
 #include <asm/io.h>
 
-void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
+void *dma_alloc_consistent(struct device *dev, size_t size,
 			   dma_addr_t *dma_handle)
 {
 	void *ret;
 	int gfp = GFP_ATOMIC;
 
-	if (hwdev == NULL || ((u32)hwdev->dma_mask != 0xffffffff))
+	if (dev == NULL || ((u32)*dev->dma_mask != 0xffffffff))
 		gfp |= GFP_DMA;
 	ret = (void *)__get_free_pages(gfp, get_order(size));
 
@@ -30,7 +30,7 @@
 	return ret;
 }
 
-void pci_free_consistent(struct pci_dev *hwdev, size_t size,
+void dma_free_consistent(struct device *dev, size_t size,
 			 void *vaddr, dma_addr_t dma_handle)
 {
 	free_pages((unsigned long)vaddr, get_order(size));
diff -Nru a/include/asm-alpha/dma-mapping.h b/include/asm-alpha/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-alpha/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-arm/dma-mapping.h b/include/asm-arm/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-arm/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-cris/dma-mapping.h b/include/asm-cris/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-cris/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-generic/dma-mapping.h b/include/asm-generic/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-generic/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1,154 @@
+/* Copyright (C) 2002 by James.Bottomley@HansenPartnership.com 
+ *
+ * Implements the generic device dma API via the existing pci_ one
+ * for unconverted architectures
+ */
+
+#ifndef _ASM_GENERIC_DMA_MAPPING_H
+#define _ASM_GENERIC_DMA_MAPPING_H
+
+/* we implement the API below in terms of the existing PCI one,
+ * so include it */
+#include <linux/pci.h>
+
+static inline int
+dma_supported(struct device *dev, u64 mask)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	return pci_dma_supported(to_pci_dev(dev), mask);
+}
+
+static inline int
+dma_set_mask(struct device *dev, u64 dma_mask)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	return pci_set_dma_mask(to_pci_dev(dev), dma_mask);
+}
+
+static inline void *
+dma_alloc_consistent(struct device *dev, size_t size, dma_addr_t *dma_handle)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	return pci_alloc_consistent(to_pci_dev(dev), size, dma_handle);
+}
+
+static inline void
+dma_free_consistent(struct device *dev, size_t size, void *cpu_addr,
+		    dma_addr_t dma_handle)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	pci_free_consistent(to_pci_dev(dev), size, cpu_addr, dma_handle);
+}
+
+static inline dma_addr_t
+dma_map_single(struct device *dev, void *cpu_addr, size_t size,
+	       enum dma_data_direction direction)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	return pci_map_single(to_pci_dev(dev), cpu_addr, size, (int)direction);
+}
+
+static inline void
+dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
+		 enum dma_data_direction direction)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	pci_unmap_single(to_pci_dev(dev), dma_addr, size, (int)direction);
+}
+
+static inline dma_addr_t
+dma_map_page(struct device *dev, struct page *page,
+	     unsigned long offset, size_t size,
+	     enum dma_data_direction direction)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	return pci_map_page(to_pci_dev(dev), page, offset, size, (int)direction);
+}
+
+static inline void
+dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
+	       enum dma_data_direction direction)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	pci_unmap_page(to_pci_dev(dev), dma_address, size, (int)direction);
+}
+
+static inline int
+dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
+	   enum dma_data_direction direction)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	return pci_map_sg(to_pci_dev(dev), sg, nents, (int)direction);
+}
+
+static inline void
+dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries,
+	     enum dma_data_direction direction)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	pci_unmap_sg(to_pci_dev(dev), sg, nhwentries, (int)direction);
+}
+
+static inline void
+dma_sync_single(struct device *dev, dma_addr_t dma_handle, size_t size,
+		enum dma_data_direction direction)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	pci_dma_sync_single(to_pci_dev(dev), dma_handle, size, (int)direction);
+}
+
+static inline void
+dma_sync_sg(struct device *dev, struct scatterlist *sg, int nelems,
+	    enum dma_data_direction direction)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	pci_dma_sync_sg(to_pci_dev(dev), sg, nelems, (int)direction);
+}
+
+/* Now for the API extensions over the pci_ one */
+
+#define dma_alloc_nonconsistent(d, s, h) dma_alloc_consistent(d, s, h)
+#define dma_free_nonconsistent(d, s, v, h) dma_free_consistent(d, s, v, h)
+#define dma_is_consistent(d)	(1)
+
+static inline int
+dma_get_cache_alignment(void)
+{
+	/* no easy way to get cache size on all processors, so return
+	 * the maximum possible, to be safe */
+	return (1 << L1_CACHE_SHIFT_MAX);
+}
+
+static inline void
+dma_sync_single_range(struct device *dev, dma_addr_t dma_handle,
+		      unsigned long offset, size_t size,
+		      enum dma_data_direction direction)
+{
+	/* just sync everything, that's all the pci API can do */
+	dma_sync_single(dev, dma_handle, offset+size, direction);
+}
+
+static inline void
+dma_cache_sync(void *vaddr, size_t size,
+	       enum dma_data_direction direction)
+{
+	/* could define this in terms of the dma_cache ... operations,
+	 * but if you get this on a platform, you should convert the platform
+	 * to using the generic device DMA API */
+	BUG();
+}
+
+#endif
+
diff -Nru a/include/asm-i386/dma-mapping.h b/include/asm-i386/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-i386/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1,137 @@
+#ifndef _ASM_I386_DMA_MAPPING_H
+#define _ASM_I386_DMA_MAPPING_H
+
+#include <asm/cache.h>
+
+#define dma_alloc_nonconsistent(d, s, h) dma_alloc_consistent(d, s, h)
+#define dma_free_nonconsistent(d, s, v, h) dma_free_consistent(d, s, v, h)
+
+void *dma_alloc_consistent(struct device *dev, size_t size,
+			   dma_addr_t *dma_handle);
+
+void dma_free_consistent(struct device *dev, size_t size,
+			 void *vaddr, dma_addr_t dma_handle);
+
+static inline dma_addr_t
+dma_map_single(struct device *dev, void *ptr, size_t size,
+	       enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+	flush_write_buffers();
+	return virt_to_phys(ptr);
+}
+
+static inline void
+dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
+		 enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+}
+
+static inline int
+dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
+	   enum dma_data_direction direction)
+{
+	int i;
+
+	BUG_ON(direction == DMA_NONE);
+
+	for (i = 0; i < nents; i++ ) {
+		BUG_ON(!sg[i].page);
+
+		sg[i].dma_address = page_to_phys(sg[i].page) + sg[i].offset;
+	}
+
+	flush_write_buffers();
+	return nents;
+}
+
+static inline dma_addr_t
+dma_map_page(struct device *dev, struct page *page, unsigned long offset,
+	     size_t size, enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+	return (dma_addr_t)(page_to_pfn(page)) * PAGE_SIZE + offset;
+}
+
+static inline void
+dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
+	       enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+}
+
+
+static inline void
+dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries,
+	     enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+}
+
+static inline void
+dma_sync_single(struct device *dev, dma_addr_t dma_handle, size_t size,
+		enum dma_data_direction direction)
+{
+	flush_write_buffers();
+}
+
+static inline void
+dma_sync_single_range(struct device *dev, dma_addr_t dma_handle,
+		      unsigned long offset, size_t size,
+		      enum dma_data_direction direction)
+{
+	flush_write_buffers();
+}
+
+
+static inline void
+dma_sync_sg(struct device *dev, struct scatterlist *sg, int nelems,
+		 enum dma_data_direction direction)
+{
+	flush_write_buffers();
+}
+
+static inline int
+dma_supported(struct device *dev, u64 mask)
+{
+        /*
+         * we fall back to GFP_DMA when the mask isn't all 1s,
+         * so we can't guarantee allocations that must be
+         * within a tighter range than GFP_DMA..
+         */
+        if(mask < 0x00ffffff)
+                return 0;
+
+	return 1;
+}
+
+static inline int
+dma_set_mask(struct device *dev, u64 mask)
+{
+	if(!dev->dma_mask || !dma_supported(dev, mask))
+		return -EIO;
+
+	*dev->dma_mask = mask;
+
+	return 0;
+}
+
+static inline int
+dma_get_cache_alignment(void)
+{
+	/* no easy way to get cache size on all x86, so return the
+	 * maximum possible, to be safe */
+	return (1 << L1_CACHE_SHIFT_MAX);
+}
+
+#define dma_is_consistent(d)	(1)
+
+static inline void
+dma_cache_sync(void *vaddr, size_t size,
+	       enum dma_data_direction direction)
+{
+	flush_write_buffers();
+}
+
+#endif
diff -Nru a/include/asm-i386/pci.h b/include/asm-i386/pci.h
--- a/include/asm-i386/pci.h	Tue Dec 17 20:49:32 2002
+++ b/include/asm-i386/pci.h	Tue Dec 17 20:49:32 2002
@@ -6,6 +6,9 @@
 #ifdef __KERNEL__
 #include <linux/mm.h>		/* for struct page */
 
+/* we support the new DMA API, but still provide the old one */
+#define PCI_NEW_DMA_COMPAT_API	1
+
 /* Can be used to override the logic in pci_scan_bus for skipping
    already-configured bus numbers - to be used for buggy BIOSes
    or architectures with incomplete PCI setup by the loader */
@@ -46,78 +49,6 @@
  */
 #define PCI_DMA_BUS_IS_PHYS	(1)
 
-/* Allocate and map kernel buffer using consistent mode DMA for a device.
- * hwdev should be valid struct pci_dev pointer for PCI devices,
- * NULL for PCI-like buses (ISA, EISA).
- * Returns non-NULL cpu-view pointer to the buffer if successful and
- * sets *dma_addrp to the pci side dma address as well, else *dma_addrp
- * is undefined.
- */
-extern void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
-				  dma_addr_t *dma_handle);
-
-/* Free and unmap a consistent DMA buffer.
- * cpu_addr is what was returned from pci_alloc_consistent,
- * size must be the same as what as passed into pci_alloc_consistent,
- * and likewise dma_addr must be the same as what *dma_addrp was set to.
- *
- * References to the memory and mappings associated with cpu_addr/dma_addr
- * past this call are illegal.
- */
-extern void pci_free_consistent(struct pci_dev *hwdev, size_t size,
-				void *vaddr, dma_addr_t dma_handle);
-
-/* Map a single buffer of the indicated size for DMA in streaming mode.
- * The 32-bit bus address to use is returned.
- *
- * Once the device is given the dma address, the device owns this memory
- * until either pci_unmap_single or pci_dma_sync_single is performed.
- */
-static inline dma_addr_t pci_map_single(struct pci_dev *hwdev, void *ptr,
-					size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	flush_write_buffers();
-	return virt_to_phys(ptr);
-}
-
-/* Unmap a single streaming mode DMA translation.  The dma_addr and size
- * must match what was provided for in a previous pci_map_single call.  All
- * other usages are undefined.
- *
- * After this call, reads by the cpu to the buffer are guarenteed to see
- * whatever the device wrote there.
- */
-static inline void pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr,
-				    size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	/* Nothing to do */
-}
-
-/*
- * pci_{map,unmap}_single_page maps a kernel page to a dma_addr_t. identical
- * to pci_map_single, but takes a struct page instead of a virtual address
- */
-static inline dma_addr_t pci_map_page(struct pci_dev *hwdev, struct page *page,
-				      unsigned long offset, size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-
-	return (dma_addr_t)(page_to_pfn(page)) * PAGE_SIZE + offset;
-}
-
-static inline void pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
-				  size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	/* Nothing to do */
-}
-
 /* pci_unmap_{page,single} is a nop so... */
 #define DECLARE_PCI_UNMAP_ADDR(ADDR_NAME)
 #define DECLARE_PCI_UNMAP_LEN(LEN_NAME)
@@ -126,84 +57,6 @@
 #define pci_unmap_len(PTR, LEN_NAME)		(0)
 #define pci_unmap_len_set(PTR, LEN_NAME, VAL)	do { } while (0)
 
-/* Map a set of buffers described by scatterlist in streaming
- * mode for DMA.  This is the scather-gather version of the
- * above pci_map_single interface.  Here the scatter gather list
- * elements are each tagged with the appropriate dma address
- * and length.  They are obtained via sg_dma_{address,length}(SG).
- *
- * NOTE: An implementation may be able to use a smaller number of
- *       DMA address/length pairs than there are SG table elements.
- *       (for example via virtual mapping capabilities)
- *       The routine returns the number of addr/length pairs actually
- *       used, at most nents.
- *
- * Device ownership issues as mentioned above for pci_map_single are
- * the same here.
- */
-static inline int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
-			     int nents, int direction)
-{
-	int i;
-
-	if (direction == PCI_DMA_NONE)
-		BUG();
-
-	for (i = 0; i < nents; i++ ) {
-		if (!sg[i].page)
-			BUG();
-
-		sg[i].dma_address = page_to_phys(sg[i].page) + sg[i].offset;
-	}
-
-	flush_write_buffers();
-	return nents;
-}
-
-/* Unmap a set of streaming mode DMA translations.
- * Again, cpu read rules concerning calls here are the same as for
- * pci_unmap_single() above.
- */
-static inline void pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
-				int nents, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	/* Nothing to do */
-}
-
-/* Make physical memory consistent for a single
- * streaming mode DMA translation after a transfer.
- *
- * If you perform a pci_map_single() but wish to interrogate the
- * buffer using the cpu, yet do not wish to teardown the PCI dma
- * mapping, you must call this function before doing so.  At the
- * next point you give the PCI dma address back to the card, the
- * device again owns the buffer.
- */
-static inline void pci_dma_sync_single(struct pci_dev *hwdev,
-				       dma_addr_t dma_handle,
-				       size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	flush_write_buffers();
-}
-
-/* Make physical memory consistent for a set of streaming
- * mode DMA translations after a transfer.
- *
- * The same as pci_dma_sync_single but for a scatter-gather list,
- * same rules and usage.
- */
-static inline void pci_dma_sync_sg(struct pci_dev *hwdev,
-				   struct scatterlist *sg,
-				   int nelems, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	flush_write_buffers();
-}
 
 /* Return whether the given PCI device DMA address mask can
  * be supported properly.  For example, if your device can
diff -Nru a/include/asm-ia64/dma-mapping.h b/include/asm-ia64/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-ia64/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-m68k/dma-mapping.h b/include/asm-m68k/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-m68k/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-m68knommu/dma-mapping.h b/include/asm-m68knommu/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-m68knommu/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-mips/dma-mapping.h b/include/asm-mips/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-mips/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-mips64/dma-mapping.h b/include/asm-mips64/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-mips64/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-ppc/dma-mapping.h b/include/asm-ppc/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-ppc/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-ppc64/dma-mapping.h b/include/asm-ppc64/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-ppc64/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-s390/dma-mapping.h b/include/asm-s390/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-s390/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-s390x/dma-mapping.h b/include/asm-s390x/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-s390x/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-sh/dma-mapping.h b/include/asm-sh/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-sh/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-sparc/dma-mapping.h b/include/asm-sparc/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-sparc/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-sparc64/dma-mapping.h b/include/asm-sparc64/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-sparc64/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-um/dma-mapping.h b/include/asm-um/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-um/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-v850/dma-mapping.h b/include/asm-v850/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-v850/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-x86_64/dma-mapping.h b/include/asm-x86_64/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-x86_64/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/linux/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1,17 @@
+#ifndef _ASM_LINUX_DMA_MAPPING_H
+#define _ASM_LINUX_DMA_MAPPING_H
+
+/* These definitions mirror those in pci.h, so they can be used
+ * interchangeably with their PCI_ counterparts */
+enum dma_data_direction {
+	DMA_BIDIRECTIONAL = 0,
+	DMA_TO_DEVICE = 1,
+	DMA_FROM_DEVICE = 2,
+	DMA_NONE = 3,
+};
+
+#include <asm/dma-mapping.h>
+
+#endif
+
+
diff -Nru a/include/linux/pci.h b/include/linux/pci.h
--- a/include/linux/pci.h	Tue Dec 17 20:49:32 2002
+++ b/include/linux/pci.h	Tue Dec 17 20:49:32 2002
@@ -826,5 +826,92 @@
 #define PCIPCI_VIAETBF		8
 #define PCIPCI_VSFX		16
 
+#include <linux/dma-mapping.h>
+
+/* If you define PCI_NEW_DMA_COMPAT_API it means you support the new DMA API
+ * and you want the pci_ DMA API to be implemented using it.
+ */
+#if defined(PCI_NEW_DMA_COMPAT_API) && defined(CONFIG_PCI)
+
+/* note pci_set_dma_mask isn't here, since it's a public function
+ * exported from drivers/pci, use dma_supported instead */
+
+static inline int
+pci_dma_supported(struct pci_dev *hwdev, u64 mask)
+{
+	return dma_supported(&hwdev->dev, mask);
+}
+
+static inline void *
+pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
+		     dma_addr_t *dma_handle)
+{
+	return dma_alloc_consistent(&hwdev->dev, size, dma_handle);
+}
+
+static inline void
+pci_free_consistent(struct pci_dev *hwdev, size_t size,
+		    void *vaddr, dma_addr_t dma_handle)
+{
+	dma_free_consistent(&hwdev->dev, size, vaddr, dma_handle);
+}
+
+static inline dma_addr_t
+pci_map_single(struct pci_dev *hwdev, void *ptr, size_t size, int direction)
+{
+	return dma_map_single(&hwdev->dev, ptr, size, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr,
+		 size_t size, int direction)
+{
+	dma_unmap_single(&hwdev->dev, dma_addr, size, (enum dma_data_direction)direction);
+}
+
+static inline dma_addr_t
+pci_map_page(struct pci_dev *hwdev, struct page *page,
+	     unsigned long offset, size_t size, int direction)
+{
+	return dma_map_page(&hwdev->dev, page, offset, size, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
+	       size_t size, int direction)
+{
+	dma_unmap_page(&hwdev->dev, dma_address, size, (enum dma_data_direction)direction);
+}
+
+static inline int
+pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+	   int nents, int direction)
+{
+	return dma_map_sg(&hwdev->dev, sg, nents, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+	     int nents, int direction)
+{
+	dma_unmap_sg(&hwdev->dev, sg, nents, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_dma_sync_single(struct pci_dev *hwdev, dma_addr_t dma_handle,
+		    size_t size, int direction)
+{
+	dma_sync_single(&hwdev->dev, dma_handle, size, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_dma_sync_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+		int nelems, int direction)
+{
+	dma_sync_sg(&hwdev->dev, sg, nelems, (enum dma_data_direction)direction);
+}
+
+#endif
+
 #endif /* __KERNEL__ */
 #endif /* LINUX_PCI_H */

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFT][PATCH] generic device DMA implementation
  2002-12-18  3:01 [RFT][PATCH] generic device DMA implementation James Bottomley
@ 2002-12-18  3:13 ` David Mosberger
  2002-12-28 18:14 ` Russell King
  1 sibling, 0 replies; 71+ messages in thread
From: David Mosberger @ 2002-12-18  3:13 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-kernel


  James> The attached should represent close to final form for the
  James> generic DMA API.  It includes documentation (surprise!) and
  James> and implementation in terms of the pci_ API for every arch
  James> (apart from parisc, which will be submitted later).

  James> I've folded in the feedback from the previous thread.
  James> Hopefully, this should be ready for inclusion.  If people
  James> could test it on x86 and other architectures, I'd be
  James> grateful.

  James> comments and feedback from testing welcome.

Would you mind doing a s/consistent/coherent/g?  This has been
misnamed in the PCI DMA interface all along, but I didn't think it's
worth breaking drivers because of it.  But since this is a new
interface, there is no such issue.

(Consistency says something about memory access ordering, coherency
only talks about there not being multiple values for a given memory
location.  On DMA-coherent platforms with weakly-ordered memory
systems, the returned memory really is only coherent, not consistent,
i.e., you have to use memory barriers if you want to enforce
ordering.)

Thanks,

	--david

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFT][PATCH] generic device DMA implementation
  2002-12-18  3:01 [RFT][PATCH] generic device DMA implementation James Bottomley
  2002-12-18  3:13 ` David Mosberger
@ 2002-12-28 18:14 ` Russell King
  2002-12-28 18:19   ` James Bottomley
  1 sibling, 1 reply; 71+ messages in thread
From: Russell King @ 2002-12-28 18:14 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-kernel

I've just been working through the ARM dma stuff, converting it to the
new API, and I foudn this:

> +static inline int
> +pci_dma_supported(struct pci_dev *hwdev, u64 mask)
> +{
> +	return dma_supported(&hwdev->dev, mask);
> +}
> (etc)

I'll now pull out a bit from DMA-mapping.txt:

| 		 Using Consistent DMA mappings.
| 
| To allocate and map large (PAGE_SIZE or so) consistent DMA regions,
| you should do:
| 
| 	dma_addr_t dma_handle;
| 
| 	cpu_addr = pci_alloc_consistent(dev, size, &dma_handle);
| 
| where dev is a struct pci_dev *. You should pass NULL for PCI like buses
| where devices don't have struct pci_dev (like ISA, EISA).  This may be
| called in interrupt context. 

What happens to &hwdev->dev when you do as detailed there and pass NULL
into these "compatibility" functions?  Probably an oops.

I think these "compatibility" functions need to do:

static inline xxx
pci_xxx(struct pci_dev *hwdev, ...)
{
	dma_xxxx(hwdev ? &hwdev->dev : NULL, ...)
}

so they remain correct to existing API users expectations.

-- 
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html









^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [RFT][PATCH] generic device DMA implementation
  2002-12-28 18:14 ` Russell King
@ 2002-12-28 18:19   ` James Bottomley
  0 siblings, 0 replies; 71+ messages in thread
From: James Bottomley @ 2002-12-28 18:19 UTC (permalink / raw)
  To: James Bottomley, linux-kernel

rmk@arm.linux.org.uk said:
> What happens to &hwdev->dev when you do as detailed there and pass
> NULL into these "compatibility" functions?  Probably an oops. 

Yes.  Already found by Udo Steinberg and fixed in bk latest...

James



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
@ 2002-11-10  5:20 Adam J. Richter
  0 siblings, 0 replies; 71+ messages in thread
From: Adam J. Richter @ 2002-11-10  5:20 UTC (permalink / raw)
  To: willy; +Cc: andmike, hch, James.Bottomley, linux-kernel, mochel, parisc-linux

On Sat, Nov 09, 2002 at 05:21:50AM +0000, Matthew Wilcox wrote about
generic device models:
> And, for gods sake, don't fuck it up by integrating it with USB too early
> in the game.

	Without rewriting everything, I could see creating a common
device container for devices connected directly to the CPU's memory,
IO or interrupt busses (PCI, ISA, sbus, etc.) so as not to increase
the memory size of device and device_driver for non-mapped device
types (USB, firewire, SCSI, etc.).  More importantly, this change
would provide some type checking for operations that are really only
meaningful on CPU mapped devices, such as the "generic"
dma_{alloc,free}_consistent() and related functions.  I imagine
something like this:


struct cpu_mapped_device {
	dma_addr_t		dma_mask;
	/* Initialized from parent->dma_mask and driver->dma_restrictions. */
	void			*prealloc_vaddr;
	void 			*prealloc_busaddr;
	struct device		device;
};

struct cpu_mapped_driver {
	struct io_restrictions	*dma_restrictions;
	int			prealloc_consistent;
	int			fake_consistent_ok : 1;
	struct device_driver	driver;
};

struct device {
	...
	struct resource_list		*resources;
	/* For automatic release of memory ranges, IO ports, DMA channels,
	   IRQ's, SCSI ID's, SCSI LUN's within an ID, USB device ID's,
	   etc.  (but not USB hub ports, and PCMCIA slots, which we can
	   released before ->remove() is called). */

	dma_pool_t			dma_stub_pool[2];
	struct cpu_mapped_device	*dma_dev;
	/* Points to the cpu_mapped_device that we are embedded in for PCI,
	   ISA, and other cpu mapped devices.  Points to the parent
	   cpu mapped device for others, such as the USB controller for a
	   USB network interface.   Will be used for allocating DMA gather
	   scatter stubs when allocating skbuff's, bio's, scsi_request's,
	   USB request blocks, etc., eliminating another set of error
	   branches. */
};

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Miplitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
  2002-11-10  2:01 ` J.E.J. Bottomley
@ 2002-11-10  2:15   ` Matthew Wilcox
  0 siblings, 0 replies; 71+ messages in thread
From: Matthew Wilcox @ 2002-11-10  2:15 UTC (permalink / raw)
  To: J.E.J. Bottomley
  Cc: Adam J. Richter, andmike, greg, hch, linux-kernel, mochel,
	parisc-linux, willy

On Sat, Nov 09, 2002 at 09:01:29PM -0500, J.E.J. Bottomley wrote:
> Actually, so would I.  I can suspect why there might exist machines like this 
> (say the consistent attribute is settable at the pgd level)

well, there's a limited amount of space available for consistent mappings
on some machines.  it's basically the same as the vmalloc space.  i think
the best way to handle this is simply to fail to initialise if you can't
get the consistent memory you need.

-- 
Revolutions do not require corporate support.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device  interface
  2002-11-10  0:23 Adam J. Richter
@ 2002-11-10  2:01 ` J.E.J. Bottomley
  2002-11-10  2:15   ` Matthew Wilcox
  0 siblings, 1 reply; 71+ messages in thread
From: J.E.J. Bottomley @ 2002-11-10  2:01 UTC (permalink / raw)
  To: Adam J. Richter
  Cc: James.Bottomley, andmike, greg, hch, linux-kernel, mochel,
	parisc-linux, willy

adam@yggdrasil.com said:
> 
> 	I'd like to know more about what these machines look like in the real
> world.  Specifically, I am interested in the trade-off of having a
> parameter to wback_fake_consistent so that it could be enabled or
> disabled on an individual basis.

Actually, so would I.  I can suspect why there might exist machines like this 
(say the consistent attribute is settable at the pgd level)

> 	I suspect that the parameter is not worth the clutter because these
> "partially consistent" machines either have a large amount of
> consistent memory, so the case of the allocation failing in the is not
> worth supporting, or it is easy to check for consistent memory on them
> with something like "if ((unsigned long) vaddr < 0xwhatever)", but I'm
> just guessing. 

Well, if it has to be done, it can be done by making alloc_consistent return a 
handle rather than an address and making wback/invalidate take the handle (but 
it's certainly not ideal).

James



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
@ 2002-11-10  1:50 Adam J. Richter
  0 siblings, 0 replies; 71+ messages in thread
From: Adam J. Richter @ 2002-11-10  1:50 UTC (permalink / raw)
  To: grundler
  Cc: andmike, hch, James.Bottomley, linux-kernel, mochel, parisc-linux

Grant Grundler writes:
>Documentation/driver-model/overview.txt:
>| Note also that it is at the _end_ of struct pci_dev. This is
>| to make people think about what they're doing when switching between the bus
>| driver and the global driver; and to prevent against mindless casts between
>| the two.
>
>Until this changes, I don't see this as a useful replacement for
>either PCI or parisc devices. The "mindless casts" can be fixed.
>But without the ability to easily go from generic device type to
>bus specific type, people will just get lost in the maze of pointers.

	linux-2.5.46/include/linux/kernel.h already defines
container_of(ptr_to_element, parent_struct, element_name).

>From <linux/pci.h>:
#define to_pci_dev(n) container_of(n, struct pci_dev, dev)

>From <linux/usb.h>:
#define to_usb_device(d) container_of(d, struct usb_device, dev)

>From <asm-parisc/hardware.h> with my parisc device patch:
static inline struct parisc_device *to_parisc_dev(struct device *dev)
{
        return container_of(dev, struct parisc_device, device);
}

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Miplitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
@ 2002-11-10  0:23 Adam J. Richter
  2002-11-10  2:01 ` J.E.J. Bottomley
  0 siblings, 1 reply; 71+ messages in thread
From: Adam J. Richter @ 2002-11-10  0:23 UTC (permalink / raw)
  To: James.Bottomley
  Cc: andmike, greg, hch, linux-kernel, mochel, parisc-linux, willy

James Bottomley wrote:
>3. Machine is partially consistent.  consistent allocation may fail because 
>we're out of consistent memory so we have to fall back to the old.

	I'd like to know more about what these machines look
like in the real world.  Specifically, I am interested in the
trade-off of having a parameter to wback_fake_consistent
so that it could be enabled or disabled on an individual basis.

	I suspect that the parameter is not worth the clutter because
these "partially consistent" machines either have a large amount of
consistent memory, so the case of the allocation failing in the is not
worth supporting, or it is easy to check for consistent memory on them
with something like "if ((unsigned long) vaddr < 0xwhatever)", but I'm
just guessing.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Miplitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
@ 2002-11-09 12:22 Adam J. Richter
  0 siblings, 0 replies; 71+ messages in thread
From: Adam J. Richter @ 2002-11-09 12:22 UTC (permalink / raw)
  To: willy; +Cc: andmike, hch, James.Bottomley, linux-kernel, mochel, parisc-linux

Matthew Wilcox writes:
>On Fri, Nov 08, 2002 at 08:51:28PM -0800, Adam J. Richter wrote:
>> My patch is a net deletion of 57 lines and will allow simplification
>> of parisc DMA allocation.
>
>57 lines of clean elegant code, replacing them with overly generic ugly
>code and bloated data structures.

	You'd really have to gerrymander to pick a standard by which
my code is "ugly" and what it deletes is "clean and elegant."  I leave
it to other interested readers to look at my patch and judge.

>struct device is a round 256 bytes
>on x86.  more on 64-bit architectures.

	Here is an updated version of my completely untested patch
that replaces parisc_device.name[] with parisc_device.device.name[].

	sizeof(struct device), at least on x86	244 bytes
	5 pointers removed from parisc_device   -40 bytes
	parisc_device.name moved to .device	-80 bytes
						---------
	Size cost of using struct device	124 bytes per parisc_device

	There are ten drivers in drivers/parisc and arch/parisc that
define a struct parisc_driver.  Perhaps you'll have an average of ~1.5
of each such device in a parisc system or maybe ten struct
parisc_device's and a similar number of parisc_driver's.  So, you're
talking about perhaps 2kB of additional space for parisc_device's,
perhaps 3kB including the parisc_driver's.

	This space cost would be reduced somewhat by the shrinkage to
arch/parisc/kernel/drivers.c, and this code will enable other
simplifications.  Even without those other simplificatins, this code
may be a net space savings on some systems.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Miplitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

diff -u -r linux-2.5.46/include/asm-parisc/hardware.h linux/include/asm-parisc/hardware.h
--- linux-2.5.46/include/asm-parisc/hardware.h	Wed Oct 30 21:32:15 2002
+++ linux/include/asm-parisc/hardware.h	Sat Nov  9 03:07:23 2002
@@ -1,6 +1,7 @@
 #ifndef _PARISC_HARDWARE_H
 #define _PARISC_HARDWARE_H
 
+#include <linux/device.h>
 #include <asm/pdc.h>
 
 struct parisc_device_id {
@@ -26,14 +27,9 @@
 struct parisc_device {
 	unsigned long   hpa;		/* Hard Physical Address */
 	struct parisc_device_id id;
-	struct parisc_device *parent;
-	struct parisc_device *sibling;
-	struct parisc_device *child;
-	struct parisc_driver *driver;	/* Driver for this device */
-	void		*sysdata;	/* Driver instance private data */
-	char		name[80];	/* The hardware description */
 	int		irq;
 
+	struct device	device;
 	char		hw_path;        /* The module number on this bus */
 	unsigned int	num_addrs;	/* some devices have additional address ranges. */
 	unsigned long	*addr;          /* which will be stored here */
@@ -66,10 +62,9 @@
 extern char *cpu_name_version[][2]; /* mapping from enum cpu_type to strings */
 
 struct parisc_driver {
-	struct parisc_driver *next;
-	char *name; 
 	const struct parisc_device_id *id_table;
 	int (*probe) (struct parisc_device *dev); /* New device discovered */
+	struct device_driver driver;
 };
 
 struct io_module {
@@ -128,6 +123,26 @@
 #define HPHW_FAULTY    31
 
 
+static inline struct parisc_device *to_parisc_dev(struct device *dev)
+{
+	return container_of(dev, struct parisc_device, device);
+}
+
+static inline struct parisc_driver *to_parisc_drv(struct driver *dev)
+{
+	return container_of(drv, struct parisc_driver, driver);
+}
+
+static inline void *parisc_get_drvdata(struct parisc_device *pa_dev)
+{
+	return dev_get_drvdata(pa_dev->device);
+}
+
+static inline void parisc_set_drvdata(struct parisc_device *pa_dev, void *ptr)
+{
+	dev_set_drvdata(pa_dev->device, ptr);
+}
+
 /* hardware.c: */
 extern const char *parisc_hardware_description(struct parisc_device_id *id);
 extern enum cpu_type parisc_get_cpu_type(unsigned long hversion);
@@ -135,12 +150,13 @@
 struct pci_dev;
 
 /* drivers.c: */
+extern struct bus_type parisc_bus_type;
 extern struct parisc_device *alloc_pa_dev(unsigned long hpa,
 		struct hardware_path *path);
 extern int register_parisc_device(struct parisc_device *dev);
 extern int register_parisc_driver(struct parisc_driver *driver);
 extern int count_parisc_driver(struct parisc_driver *driver);
-extern int unregister_parisc_driver(struct parisc_driver *driver);
+extern void unregister_parisc_driver(struct parisc_driver *driver);
 extern void walk_central_bus(void);
 extern void fixup_child_irqs(struct parisc_device *parent, int irqbase,
 		int (*choose)(struct parisc_device *parent));
diff -u -r linux-2.5.46/arch/parisc/kernel/drivers.c linux/arch/parisc/kernel/drivers.c
--- linux-2.5.46/arch/parisc/kernel/drivers.c	Mon Nov  4 21:35:43 2002
+++ linux/arch/parisc/kernel/drivers.c	Sat Nov  9 03:21:25 2002
@@ -27,17 +27,8 @@
 /* See comments in include/asm-parisc/pci.h */
 struct pci_dma_ops *hppa_dma_ops;
 
-static struct parisc_driver *pa_drivers;
 static struct parisc_device root;
 
-/* This lock protects the pa_drivers list _only_ since all parisc_devices
- * are registered before smp_init() is called.  If you wish to add devices
- * after that, this muct be serialised somehow.  I recommend a semaphore
- * rather than a spinlock since driver ->probe functions are allowed to
- * sleep (for example when allocating memory).
- */
-static spinlock_t pa_lock = SPIN_LOCK_UNLOCKED;
-
 #define for_each_padev(dev) \
 	for (dev = root.child; dev != NULL; dev = next_dev(dev))
 
@@ -53,20 +44,13 @@
  */
 struct parisc_device *next_dev(struct parisc_device *dev)
 {
-	if (dev->child) {
-		return check_dev(dev->child);
-	} else if (dev->sibling) {
-		return dev->sibling;
-	}
+	struct device *next =
+		list_entry(dev->device.next, struct device, g_list);
 
-	/* Exhausted tree at this level, time to go up. */
-	do {
-		dev = dev->parent;
-		if (dev && dev->sibling)
-			return dev->sibling;
-	} while (dev != &root);
-
-	return NULL;
+	if (next->parent == NULL) /* end of list.  back at root */
+		return NULL;
+	else
+		return container_of(next, struct parisc_device, device);
 }
 
 /**
@@ -74,8 +58,11 @@
  * @driver: the PA-RISC driver to try
  * @dev: the PA-RISC device to try
  */
-static int match_device(struct parisc_driver *driver, struct parisc_device *dev)
+static int
+parisc_match_device(struct device_driver *gendrv, struct device *gendev)
 {
+	struct parisc_driver *driver = to_parisc_driver(gendrv);
+	struct parisc_device *dev = to_parisc_device(gendev);
 	const struct parisc_device_id *ids;
 
 	for (ids = driver->id_table; ids->sversion; ids++) {
@@ -96,10 +83,17 @@
 	return 0;
 }
 
-static void claim_device(struct parisc_driver *driver, struct parisc_device *dev)
+static int parisc_device_probe(struct device *dev)
 {
-	dev->driver = driver;
-	request_mem_region(dev->hpa, 0x1000, driver->name);
+	struct parisc_device *pa_dev = to_parisc_device(dev);
+	struct device_driver *drv = dev->driver;
+	struct parisc_driver *pa_drv = to_parisc_driver(drv);
+	int err = (*pa_drv->probe)(pa_dev);
+
+	if (!err)
+		request_mem_region(dev->hpa, 0x1000, drv->name);
+
+	return err;
 }
 
 /**
@@ -108,37 +102,22 @@
  */
 int register_parisc_driver(struct parisc_driver *driver)
 {
-	struct parisc_device *device;
-
-	if (driver->next) {
-		printk(KERN_WARNING 
-		       "BUG: Skipping previously registered driver: %s\n",
-		       driver->name);
-		return 1;
-	}
-
-	for_each_padev(device) {
-		if (device->driver)
-			continue;
-		if (!match_device(driver, device))
-			continue;
-
-		if (driver->probe(device) < 0)
-			continue;
-		claim_device(driver, device);
-	}
+	driver->driver.bus_type = &parisc_bus_type;
+	driver->driver.probe = parisc_device_probe;
+	return driver_register(&driver->driver);
+}
 
-	/* Note that the list is in reverse order of registration.  This
-	 * may be significant if we ever actually support hotplug and have
-	 * multiple drivers capable of claiming the same chip.
-	 */
-
-	spin_lock(&pa_lock);
-	driver->next = pa_drivers;
-	pa_drivers = driver;
-	spin_unlock(&pa_lock);
+struct count_arg {
+	struct driver *driver;
+	int count;
+};
 
-	return 0;
+static void count_callback(struct device *dev, void *data)
+{
+	struct count_arg *arg = data;
+	
+	if (parisc_match_device(&arg->driver, dev))
+		(arg->count)++;
 }
 
 /**
@@ -148,17 +127,14 @@
  * Use by IOMMU support to "guess" the right size IOPdir.
  * Formula is something like memsize/(num_iommu * entry_size).
  */
-int count_parisc_driver(struct parisc_driver *driver)
+int count_parisc_driver(struct parisc_driver *pa_driver)
 {
-	struct parisc_device *device;
-	int cnt = 0;
-
-	for_each_padev(device) {
-		if (match_device(driver, device))
-			cnt++;
-	}
-
-	return cnt;
+	struct count_arg arg;
+  
+	arg.driver = &pa_driver->driver;
+	arg.count = 0;
+	bus_for_each_dev(arg.driver->bus, &arg, count_callback);
+	return arg.count;
 }
 
 
@@ -167,42 +143,9 @@
  * unregister_parisc_driver - Unregister this driver from the list of drivers
  * @driver: the PA-RISC driver to unregister
  */
-int unregister_parisc_driver(struct parisc_driver *driver)
+void unregister_parisc_driver(struct parisc_driver *driver)
 {
-	struct parisc_device *dev;
-
-	spin_lock(&pa_lock);
-
-	if (pa_drivers == driver) {
-		/* was head of list - update head */
-		pa_drivers = driver->next;
-	} else {
-		struct parisc_driver *prev = pa_drivers;
-
-		while (prev && driver != prev->next) {
-			prev = prev->next;
-		}
-
-		if (!prev) {
-			printk(KERN_WARNING "unregister_parisc_driver: %s wasn't registered\n", driver->name);
-		} else {
-			/* Drop driver from list */
-			prev->next = driver->next;
-			driver->next = NULL;
-		}
-
-	}
-
-	spin_unlock(&pa_lock);
-
-	for_each_padev(dev) {
-		if (dev->driver != driver)
-			continue;
-		dev->driver = NULL;
-		release_mem_region(dev->hpa, 0x1000);
-	}
-
-	return 0;
+	driver_unregister(&driver->driver);
 }
 
 static struct parisc_device *find_device_by_addr(unsigned long hpa)
@@ -335,7 +278,8 @@
 	memset(dev, 0, sizeof(*dev));
 	dev->hw_path = id;
 	dev->id.hw_type = HPHW_FAULTY;
-	dev->parent = parent;
+	dev->device.bus_type = &parisc_bus_type;
+	dev->device.parent = &parent->device;
 	dev->sibling = *insert;
 	*insert = dev;
 	return dev;
@@ -417,7 +361,7 @@
 	dev->hpa = hpa;
 	name = parisc_hardware_description(&dev->id);
 	if (name) {
-		strncpy(dev->name, name, sizeof(dev->name)-1);
+		strncpy(dev->device.name, name, sizeof(dev->device.name)-1);
 	}
 
 	return dev;
@@ -429,32 +373,11 @@
  *
  * Search the driver list for a driver that is willing to manage
  * this device.
+ * WARNING: This routine now returns 0 on success. --Adam J. Richter 2002.11.08
  */
 int register_parisc_device(struct parisc_device *dev)
 {
-	struct parisc_driver *driver;
-
-	if (!dev)
-		return 0;
-
-	if (dev->driver)
-		return 1;
-	
-	spin_lock(&pa_lock);
-
-	/* Locate a driver which agrees to manage this device.  */
-	for (driver = pa_drivers; driver; driver = driver->next) {
-		if (!match_device(driver,dev))
-			continue;
-		if (driver->probe(dev) == 0)
-			break;
-	}
-
-	if (driver != NULL) {
-		claim_device(driver, dev);
-	}
-	spin_unlock(&pa_lock);
-	return driver != NULL;
+	return device_register(&dev->device);
 }
 
 #define BC_PORT_MASK 0x8
@@ -588,3 +511,8 @@
 		print_parisc_device(dev);
 	}
 }
+
+struct bus_type parisc_bus_type = {
+	.name =		"parisc",
+	.match =	parisc_match_device,
+};
diff -u -r linux-2.5.46/arch/parisc/kernel/perf.c linux/arch/parisc/kernel/perf.c
--- linux-2.5.46/arch/parisc/kernel/perf.c	Mon Nov  4 21:35:43 2002
+++ linux/arch/parisc/kernel/perf.c	Sat Nov  9 03:13:23 2002
@@ -527,7 +527,7 @@
 	/* TODO: this only lets us access the first cpu.. what to do for SMP? */
 	cpu_device = cpu_data[0].dev;
 	printk("Performance monitoring counters enabled for %s\n",
-		cpu_data[0].dev->name);
+		cpu_data[0].dev->device.name);
 
 	return 0;
 }
diff -u -r linux-2.5.46/arch/parisc/kernel/processor.c linux/arch/parisc/kernel/processor.c
--- linux-2.5.46/arch/parisc/kernel/processor.c	Mon Nov  4 21:35:44 2002
+++ linux/arch/parisc/kernel/processor.c	Sat Nov  9 03:13:04 2002
@@ -344,7 +344,7 @@
 				"model name\t: %s\n",
 				 boot_cpu_data.pdc.sys_model_name,
 				 cpu_data[n].dev ? 
-				 cpu_data[n].dev->name : "Unknown" );
+				 cpu_data[n].dev->device.name : "Unknown" );
 
 		seq_printf(m, "hversion\t: 0x%08x\n"
 			        "sversion\t: 0x%08x\n",
@@ -370,9 +370,11 @@
 };
 
 static struct parisc_driver cpu_driver = {
-	name:		"CPU",
-	id_table:	processor_tbl,
-	probe:		processor_probe
+	.id_table =	processor_tbl,
+	.probe =	processor_probe
+	.driver = {
+		.name =		"CPU",
+	},
 };
 
 /**
diff -u -r linux-2.5.46/arch/parisc/kernel/setup.c linux/arch/parisc/kernel/setup.c
--- linux-2.5.46/arch/parisc/kernel/setup.c	Mon Nov  4 21:35:44 2002
+++ linux/arch/parisc/kernel/setup.c	Sat Nov  9 03:10:26 2002
@@ -216,24 +216,24 @@
 }
 
 static struct resource central_bus = {
-	name:	"Central Bus",
-	start:	(unsigned long)0xfffffffffff80000,
-	end:    (unsigned long)0xfffffffffffaffff,
-	flags:	IORESOURCE_MEM,
+	.name =		"Central Bus",
+	.start =	(unsigned long)0xfffffffffff80000,
+	.end =		(unsigned long)0xfffffffffffaffff,
+	.flags =	IORESOURCE_MEM,
 };
 
 static struct resource local_broadcast = {
-	name:	"Local Broadcast",
-	start:	(unsigned long)0xfffffffffffb0000,
-	end:	(unsigned long)0xfffffffffffdffff,
-	flags:	IORESOURCE_MEM,
+	.name =		"Local Broadcast",
+	.start =	(unsigned long)0xfffffffffffb0000,
+	.end =		(unsigned long)0xfffffffffffdffff,
+	.flags =	IORESOURCE_MEM,
 };
 
 static struct resource global_broadcast = {
-	name:	"Global Broadcast",
-	start:	(unsigned long)0xfffffffffffe0000,
-	end:	(unsigned long)0xffffffffffffffff,
-	flags:	IORESOURCE_MEM,
+	.name =		"Global Broadcast",
+	.start =	(unsigned long)0xfffffffffffe0000,
+	.end =		(unsigned long)0xffffffffffffffff,
+	.flags =	IORESOURCE_MEM,
 };
 
 int __init parisc_init_resources(void)
diff -u -r linux-2.5.46/drivers/parisc/sba_iommu.c linux/drivers/parisc/sba_iommu.c
--- linux-2.5.46/drivers/parisc/sba_iommu.c	Fri Nov  8 01:28:15 2002
+++ linux/drivers/parisc/sba_iommu.c	Fri Nov  8 01:43:35 2002
@@ -1988,7 +1988,7 @@
 		return(1);
 	}
 
-	dev->sysdata = (void *) sba_dev;
+	parisc_set_drvdata(dev, sba_dev);
 	memset(sba_dev, 0, sizeof(struct sba_device));
 
 	for(i=0; i<MAX_IOC; i++)
@@ -2041,7 +2041,7 @@
  */
 void * sba_get_iommu(struct parisc_device *pci_hba)
 {
-	struct sba_device *sba = (struct sba_device *) pci_hba->parent->sysdata;
+	struct sba_device *sba = parisc_get_drvdata(pci_hba->parent);
 	char t = pci_hba->parent->id.hw_type;
 	int iocnum = (pci_hba->hw_path >> 3);	/* rope # */
 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [parisc-linux] Untested port of parisc_device to generic device interface
       [not found] <200211090128.RAA31693@adam.yggdrasil.com>
@ 2002-11-09  3:37 ` Matthew Wilcox
  0 siblings, 0 replies; 71+ messages in thread
From: Matthew Wilcox @ 2002-11-09  3:37 UTC (permalink / raw)
  To: Adam J. Richter
  Cc: James.Bottomley, andmike, hch, parisc-linux, Patrick Mochel,
	linux-kernel

On Fri, Nov 08, 2002 at 05:28:05PM -0800, Adam J. Richter wrote:
> 	To start with, here is another completely untested patch that
> I haven't even tried to compile that attempts to port the parisc device
> type to the generic device model.  I have deliberately not included any
> changes that would make it dependent on my generic DMA routine facility.

Actually I think the generic device model is crap.  It's failed to live
up to its promise of removing common fields from structs, it's introduced
a new composite filesystem and it's not helped in any concrete way.

Just look at pci_dev:

struct pci_dev {
        struct list_head global_list;   /* node in list of all PCI devices */
        struct list_head bus_list;      /* node in per-bus list */
        struct pci_bus  *bus;           /* bus this device is on */
        struct pci_bus  *subordinate;   /* bus this device bridges to */
        struct proc_dir_entry *procent; /* device entry in /proc/bus/pci */

        unsigned int    devfn;          /* encoded device & function index */
        struct pci_driver *driver;      /* which driver has allocated this devic
e */
        void            *driver_data;   /* data private to the driver */
        struct  device  dev;            /* Generic device interface */
        struct resource resource[DEVICE_COUNT_RESOURCE]; /* I/O and memory regio
ns + expansion ROMs */
        struct resource dma_resource[DEVICE_COUNT_DMA];
        struct resource irq_resource[DEVICE_COUNT_IRQ];
        char            name[90];       /* device name */
}

(there may be some more duplicate fields i've missed)

here's the duplicate fields in struct device:

struct device {
        struct list_head g_list;        /* node in depth-first order list */
        struct list_head node;          /* node in sibling list */
        struct list_head bus_list;      /* node in bus's list */
        struct list_head driver_list;
        struct list_head children;
        struct list_head intf_list;
        struct device   * parent;
        char    name[DEVICE_NAME_SIZE]; /* descriptive ascii string */
        char    bus_id[BUS_ID_SIZE];    /* position on parent bus */

        spinlock_t      lock;           /* lock for the device to ensure two
                                           different layers don't access it at
                                           the same time. */
        atomic_t        refcount;       /* refcount to make sure the device
                                         * persists for the right amount of time
 */

        struct bus_type * bus;          /* type of bus device is on */
        struct device_driver *driver;   /* which driver has allocated this
                                           device */
        void            *driver_data;   /* data private to the driver */
}

Oh, and _that_ embeds a struct kobject:

struct kobject {
        char                    name[KOBJ_NAME_LEN];
        atomic_t                refcount;
        struct list_head        entry;
        struct kobject          * parent;
        struct subsystem        * subsys;
        struct dentry           * dentry;
};

For fucks sake, this is ridiculous.  I haven't dared compare the relative
sizes of struct pci_dev between 2.5, 2.4 and 2.2, but this is sheer bloat.

I was hoping for something _incredibly_ simple from struct device.
Something to replace pci_alloc_consistent with device_alloc_consistent.
Something where I could look through the ancestors of a device to find
out whether it was under a CCIO or just a processor.  Something I could
query to find out whether it was an EISA, a GSC or a PCI device.

I'm disappointed this is trying to serve the needs of USB over the needs
of busses in the box.  I don't think it was even remotely smart to unify
USB with other busses.  And I think the PCI system has suffered the most.
I guess I'm so annoyed because I thought it might solve problems instead
of increasing the amount of user eyecandy.

> 	One question about the machine that has no consistent memory
> option: does it take PCI cards?  If so, then all PCI device drivers
> should theoretically use something like wback_fake_consistent.
> If not, then it sounds like the facility needs only to apply to
> generic DMA operations for "parisc" bus cards.

The only machines which can take any kind of PCI devices that don't
have consistent memory available to them are the T-class machines.
We have no plans to support these machines.  What you do need to watch
out for are machines such as the 735/755 which can take an NCR720 chip
in a non-coherent memory machine.  It is of course also used in machines
which are perfectly capable of allocating consistent memory (whether
through uncached mappings or a cache-coherent IO TLB).

-- 
Revolutions do not require corporate support.

^ permalink raw reply	[flat|nested] 71+ messages in thread

end of thread, other threads:[~2002-12-28 18:11 UTC | newest]

Thread overview: 71+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-11-09  4:51 [parisc-linux] Untested port of parisc_device to generic device interface Adam J. Richter
2002-11-09  5:21 ` Matthew Wilcox
2002-11-09  6:03   ` Greg KH
2002-11-09 15:33     ` J.E.J. Bottomley
2002-11-13  6:13       ` Greg KH
2002-11-13  7:46         ` Miles Bader
2002-11-13  7:52           ` Greg KH
2002-11-13  8:02             ` Miles Bader
2002-11-13  8:10               ` Greg KH
2002-11-13  8:26                 ` Miles Bader
2002-11-13  8:25                   ` Greg KH
2002-11-13  9:05                     ` Miles Bader
     [not found]               ` <miles@lsi.nec.co.jp>
2002-11-13 20:13                 ` Grant Grundler
2002-11-13 20:21                   ` J.E.J. Bottomley
2002-11-13 20:37                     ` Grant Grundler
2002-11-13 11:59             ` Ivan Kokshaysky
2002-11-13 12:36               ` Marc Zyngier
2002-11-13 16:32             ` Bjorn Helgaas
2002-11-13 17:23               ` J.E.J. Bottomley
2002-11-13 20:33                 ` Grant Grundler
2002-11-13 20:44                   ` J.E.J. Bottomley
2002-11-13 21:42                     ` Grant Grundler
2002-11-13 20:12             ` Grant Grundler
2002-11-09  7:58   ` Marc Zyngier
2002-11-09 18:04 ` Grant Grundler
  -- strict thread matches above, loose matches on Subject: below --
2002-12-18  3:01 [RFT][PATCH] generic device DMA implementation James Bottomley
2002-12-18  3:13 ` David Mosberger
2002-12-28 18:14 ` Russell King
2002-12-28 18:19   ` James Bottomley
2002-12-04 17:47 [RFC] " James Bottomley
2002-12-04 18:27 ` Jeff Garzik
2002-12-04 19:36   ` James Bottomley
2002-12-04 21:19 ` Miles Bader
2002-12-04 21:21 ` Miles Bader
2002-12-04 21:42   ` James Bottomley
2002-12-05  5:44     ` Miles Bader
2002-12-04 21:46   ` James Bottomley
2002-12-05  2:31     ` Miles Bader
2002-12-05  3:06       ` James Bottomley
2002-12-05  5:02       ` David Gibson
2002-12-05 11:15     ` Benjamin Herrenschmidt
2002-12-05 11:16       ` William Lee Irwin III
2002-12-05 15:12       ` James Bottomley
2002-12-05  0:47 ` David Gibson
2002-12-05  0:54   ` Jeff Garzik
2002-12-05  1:44   ` James Bottomley
2002-12-05  2:38     ` David Gibson
2002-12-05  3:13       ` James Bottomley
2002-12-05  5:05         ` David Gibson
2002-12-05 15:03           ` James Bottomley
2002-12-05 23:54             ` David Gibson
2002-12-05  3:17       ` Miles Bader
2002-12-05  6:06         ` David Gibson
2002-12-05  6:43           ` Miles Bader
2002-12-05 23:44             ` David Gibson
2002-12-06  2:23               ` Miles Bader
2002-12-05  3:41       ` Jeff Garzik
2002-12-05  6:04         ` David Gibson
2002-12-05 16:29           ` Jeff Garzik
2002-12-05 23:59             ` David Gibson
2002-12-05 11:08   ` Benjamin Herrenschmidt
2002-12-05 11:35     ` Russell King
2002-12-05 15:24       ` James Bottomley
2002-12-06  0:01     ` David Gibson
2002-11-10  5:20 [parisc-linux] Untested port of parisc_device to generic device interface Adam J. Richter
2002-11-10  1:50 Adam J. Richter
2002-11-10  0:23 Adam J. Richter
2002-11-10  2:01 ` J.E.J. Bottomley
2002-11-10  2:15   ` Matthew Wilcox
2002-11-09 12:22 Adam J. Richter
     [not found] <200211090128.RAA31693@adam.yggdrasil.com>
2002-11-09  3:37 ` Matthew Wilcox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).