DMA from user space buffer/VIPT cache flushing wows (was: Minutes: 21 Sept,09 RMK meeting)

* DMA from user space buffer/VIPT cache flushing wows (was: Minutes: 21 Sept,09 RMK meeting)
       [not found]               ` <20091109101056.GA29621@flint.arm.linux.org.uk>
@ 2009-11-10 16:03                 ` Imre Deak
  2009-11-11 19:26                   ` Russell King
  2009-11-11 19:29                   ` Russell King
  0 siblings, 2 replies; 4+ messages in thread
From: Imre Deak @ 2009-11-10 16:03 UTC
  To: Russell King
  Cc: Paul Mundt, linux-arch, Doyu Hiroshi (Nokia-D/Helsinki),
	vikram.pandita, tony, rak, rhishi, r-woodruff2, laurent.pinchart,
	Syrjala Ville (Nokia-D/Helsinki)

On Mon, Nov 09, 2009 at 11:10:56AM +0100, ext Russell King wrote:
> On Mon, Nov 09, 2009 at 02:15:09AM +0200, Imre Deak wrote:
> > The problem with mlock is that in case of shared memory it needs to
> > be called in the context of each process that does flushing. This
> > I think complicates unnecessarily the quota management as we'd have
> > to increase the mlock quota for each such process.
> 
> We have to deal with the cache lines associated with the user addresses,
> otherwise we're not solving anything and userspace can't do any DMA.
> The easiest all-round solution is to operate on the user addresses.
> However, if those user PTEs can vanish beneath us, that's bad news.
> We have to have some way to lock them in while the cache operation
> occurs.

Yes, but this is purely an ARM VIPT architecture-specific issue, so any
solution should be contained in the kernel if at all possible. In this
case it is possible by using kernel addresses, as you also stated.

> The only way to do that would be to hold the spinlock for the page table,
> but spinlocks are no-ops on UP (so the cache flush could be preempted,
> and the page swapped out).
> 
> > As far as I know
> > no other architectures require mlock for DMA'ing so it wouldn't be
> > nice to introduce arch specific tweaks in the process management
> > level.
> 
> Let me be totally clear about this: The Linux Kernel does *not* support
> user-driven DMA operations on any architecture.

By user-driven DMA, do you mean DMA'ing directly from an _arbitrary_ user
space buffer? The V4L2_MEMORY_USERPTR method supports this, which at least
contradicts your statement.

> That's why no other architectures require mlock for DMA from userspace -
> the problem does not exist elsewhere because there is no one else doing
> this.  Everyone else writes proper kernel-side drivers, even if they're
> just a message passing API.

What do you mean by proper? Does the kernel support only the following two
DMA methods:

- directly from a buffer allocated by the driver and mapped by user space
- from an arbitrary user space buffer by first copying it to a secondary
  buffer allocated by the driver

If this is true, it's not possible to DMA from, for example, an SHM buffer,
something often done for shared 3D pixel buffers.

> > I don't understand why can't we flush through the kernel address of
> > each page. I know you mentioned the aliasing issue before, but that
> > needs to be solved at other places too that flush through kernel
> > addresses, for example __flush_anon_page, couldn't this also work in
> > a similar way?
> 
> For __flush_anon_page, we only flush the user mapping if we have VIVT
> caches.  VIVT caches don't care about whether there's a mapping present
> and so don't oops the kernel if there isn't a page present.
> 
> For aliasing VIPT caches, we can get away with re-mapping a page at an
> address with the same cache colour as the user mapping, and flushing
> it there to get rid of user data - and so this avoids the problem of
> the user mapping disappearing beneath us.  This 'trick' is specific to
> aliasing VIPT caches only.
> 
> So, yes, we could do it this way, conditional on the cache type, and
> for VIPT, map each page into a high kernel address, operate on it, and
> unmap it, thereby eating through additional TLB entries for each page.
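
To make the cache colour idea concrete, here is a small sketch of how a
same-colour kernel alias could be computed. The geometry (4 KiB pages,
16 KiB cache way, hence 4 colours) and the window base passed to
alias_address() are assumptions for illustration, not values taken from
any particular CPU or from the kernel sources.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry, for illustration only. */
#define PAGE_SHIFT   12u
#define PAGE_SIZE    (1u << PAGE_SHIFT)
#define WAY_SIZE     (16u * 1024u)
#define NUM_COLOURS  (WAY_SIZE / PAGE_SIZE)

/*
 * Colour of a virtual address: the low page-number bits that still take
 * part in cache indexing on an aliasing VIPT cache.
 */
static unsigned cache_colour(uintptr_t vaddr)
{
    return (unsigned)((vaddr >> PAGE_SHIFT) & (NUM_COLOURS - 1u));
}

/*
 * Pick the slot in a hypothetical kernel alias window that has the same
 * colour as the user address. The window must be WAY_SIZE aligned so
 * that slot N has colour N.
 */
static uintptr_t alias_address(uintptr_t window_base, uintptr_t user_vaddr)
{
    assert((window_base & (WAY_SIZE - 1u)) == 0);
    return window_base + ((uintptr_t)cache_colour(user_vaddr) << PAGE_SHIFT);
}
```

A page mapped at the returned address indexes the same cache lines as
the user mapping, so flushing it there cleans the user data without
touching the user PTEs.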

To me this still seems to be a much better solution than the mlock way.
With mlocking you have to eat through additional TLB entries anyway,
since mlock will call __get_user_pages internally, which does cache
flushing on ARM for each page through its kernel address.

Additionally, as I said, we would need a kernel interface for flushing
user space buffers, and mlock is not exposed to drivers. For that we
would also need to add reference counting for mlock.
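
To illustrate the quota concern: from user space, mlock is a per-process
call checked against that process's own RLIMIT_MEMLOCK, so every process
sharing the buffer would need its own headroom. A rough sketch follows;
the helper names fits_mlock_quota and with_locked_buffer are made up for
this example, and the check ignores pages the process has already locked.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>

/*
 * Returns 1 if len bytes fit under this process's soft RLIMIT_MEMLOCK,
 * 0 if they do not, and -1 if the limit cannot be read.
 */
static int fits_mlock_quota(size_t len)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_MEMLOCK, &lim) != 0)
        return -1;
    return lim.rlim_cur == RLIM_INFINITY || len <= lim.rlim_cur;
}

/*
 * Pin the buffer, run op on it (for example a cache maintenance
 * request), then unpin it. Returns 0 on success, -1 on error.
 */
static int with_locked_buffer(void *buf, size_t len,
                              void (*op)(void *, size_t))
{
    assert(buf != NULL && op != NULL);
    if (mlock(buf, len) != 0)
        return -1;
    op(buf, len);
    return munlock(buf, len);
}
```

Nothing here is visible to a driver, which is the point: the locking
state belongs to the calling process, not to the pages.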

--Imre
