* please revert "drm/msm: Use the correct dma_sync calls in msm_gem" @ 2019-08-02 21:37 Christoph Hellwig 2019-08-02 22:06 ` Rob Clark 0 siblings, 1 reply; 4+ messages in thread From: Christoph Hellwig @ 2019-08-02 21:37 UTC (permalink / raw) To: Rob Clark; +Cc: Stephan Gerhold, Rob Clark, Sean Paul, linux-kernel, torvalds Hi Rob, I just saw your above commit in Linus' tree, which is completely bogus and misunderstand the DMA API. Next time you have any issues please Cc the relevant maintainers and mailing lists. But even more importantly get_dma_ops is an completely internal API, and whatevet your helpers are trying to do is bad - if the dma wasn't mapped at the time you call the new sync_for_device helper, this is broken because you can't call dma_sync_sg_for_device on unmapped address. If it was mapped it is bogus as well as you can't double map it. Please describe what you're actually trying to fix and we're going to work on a proper solution, but this shot from the hip is just going to make things worse. ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: please revert "drm/msm: Use the correct dma_sync calls in msm_gem" 2019-08-02 21:37 please revert "drm/msm: Use the correct dma_sync calls in msm_gem" Christoph Hellwig @ 2019-08-02 22:06 ` Rob Clark 2019-08-03 6:19 ` Christoph Hellwig 0 siblings, 1 reply; 4+ messages in thread From: Rob Clark @ 2019-08-02 22:06 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Stephan Gerhold, Sean Paul, LKML, torvalds On Fri, Aug 2, 2019 at 2:37 PM Christoph Hellwig <hch@lst.de> wrote: > > Hi Rob, > > I just saw your above commit in Linus' tree, which is completely > bogus and misunderstand the DMA API. Next time you have any issues > please Cc the relevant maintainers and mailing lists. But even > more importantly get_dma_ops is an completely internal API, and > whatevet your helpers are trying to do is bad - if the dma wasn't > mapped at the time you call the new sync_for_device helper, this > is broken because you can't call dma_sync_sg_for_device on unmapped > address. If it was mapped it is bogus as well as you can't double > map it. Please describe what you're actually trying to fix and we're > going to work on a proper solution, but this shot from the hip is just > going to make things worse. Sorry, this is just a temporary band-aid for v5.3 to get things working again. Yes, I realize it is a complete hack. The root problem is that I'm using the DMA API in the first place. I don't actually use the DMA API to map buffers, for various reasons, but instead manage the iommu_domain directly. Because arm/arm64 cache ops are not exported to modules, so currently I need to abuse the DMA API for cache operations (really just to clean pages if I need to mmap them uncached/writecombine). Originally I was doing that w/ dma_{map,unmap}_sg. But to avoid debug splats I switched that to dma_sync_sg (see 0036bc73ccbe7e600a3468bf8e8879b122252274). But now it seems the dma-direct ops are unhappy w/ dma_sync without a dma_map (AFAICT). (On some generations of hw, the iommu is attached to the device node that maps to the drm device, which is passed to dma_map/dma_sync. On other generations the iommu is attached to a sub-device. Changing this would break dtb compatibility.. so for now I need to handle both iommu-ops and direct-ops cases.) For v5.4, my plan was to revisit (at least for arm64) exposing cache ops to drivers, so I can use drm_clflush_* instead, and stop abusing DMA API. BR, -R ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: please revert "drm/msm: Use the correct dma_sync calls in msm_gem" 2019-08-02 22:06 ` Rob Clark @ 2019-08-03 6:19 ` Christoph Hellwig 2019-08-03 14:47 ` Rob Clark 0 siblings, 1 reply; 4+ messages in thread From: Christoph Hellwig @ 2019-08-03 6:19 UTC (permalink / raw) To: Rob Clark; +Cc: Christoph Hellwig, Stephan Gerhold, Sean Paul, LKML, torvalds On Fri, Aug 02, 2019 at 03:06:10PM -0700, Rob Clark wrote: > Sorry, this is just a temporary band-aid for v5.3 to get things > working again. Yes, I realize it is a complete hack. My main problem is here that you badly hack a around a problem without talking to the relevant maintainers, and by abusing even more internal APIs. As said get_dma_ops isn't really for driver use (although we have a few that use it for not quite as bad reason we are trying to get rid off). And as also said your abuse of the DMA API will blow up with dma-debug use quite badly. You might also corrupt the dma_address in the scatterlist in ways that aren't intended - as the call to dma_map_sg will allocate new iova space you are getting different results from whatever you expect to actually get from your iommu API usage. This might or might no matter in the end, but you really should consult the maintainers first. > The root problem is that I'm using the DMA API in the first place. I > don't actually use the DMA API to map buffers, for various reasons, > but instead manage the iommu_domain directly. Yes, and this has been going on for years, without any obvious attempt to address it at the API level before.. > Because arm/arm64 cache ops are not exported to modules, so currently > I need to abuse the DMA API for cache operations (really just to clean > pages if I need to mmap them uncached/writecombine). Originally I was > doing that w/ dma_{map,unmap}_sg. But to avoid debug splats I > switched that to dma_sync_sg (see > 0036bc73ccbe7e600a3468bf8e8879b122252274). But now it seems the > dma-direct ops are unhappy w/ dma_sync without a dma_map (AFAICT). Russell has been very strict about not exporting the cache ops, and all for the right reasons. Cache maintainance for not dma coherent devices is hard, and without a proper API that has arch input for even which calls are used for cache flushing chances of bugs are extremely high. I see two proper ways out of this mess: either we actually make msm use the DMA API, so I'd be curious of what is missing that forces you to use the low-level iommu API. Or we need to enhance the iommu API with a similar ownership concepts as the DMA API. Which probably is a good thing even if we move msm over to the DMA API. > (On some generations of hw, the iommu is attached to the device node > that maps to the drm device, which is passed to dma_map/dma_sync. On > other generations the iommu is attached to a sub-device. Changing > this would break dtb compatibility.. so for now I need to handle both > iommu-ops and direct-ops cases.) Or you always call call on a struct device that has the iommu, that is match on the generic, and pick a different device. That would in many ways seem preferably over the current hack, even if that also is just a horribly band-aid. ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: please revert "drm/msm: Use the correct dma_sync calls in msm_gem" 2019-08-03 6:19 ` Christoph Hellwig @ 2019-08-03 14:47 ` Rob Clark 0 siblings, 0 replies; 4+ messages in thread From: Rob Clark @ 2019-08-03 14:47 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Stephan Gerhold, Sean Paul, LKML, torvalds On Fri, Aug 2, 2019 at 11:19 PM Christoph Hellwig <hch@lst.de> wrote: > > On Fri, Aug 02, 2019 at 03:06:10PM -0700, Rob Clark wrote: > > Sorry, this is just a temporary band-aid for v5.3 to get things > > working again. Yes, I realize it is a complete hack. > > My main problem is here that you badly hack a around a problem without > talking to the relevant maintainers, and by abusing even more internal > APIs. As said get_dma_ops isn't really for driver use (although we > have a few that use it for not quite as bad reason we are trying to get > rid off). And as also said your abuse of the DMA API will blow up > with dma-debug use quite badly. You might also corrupt the dma_address > in the scatterlist in ways that aren't intended - as the call to > dma_map_sg will allocate new iova space you are getting different > results from whatever you expect to actually get from your iommu API > usage. This might or might no matter in the end, but you really should > consult the maintainers first. > > > The root problem is that I'm using the DMA API in the first place. I > > don't actually use the DMA API to map buffers, for various reasons, > > but instead manage the iommu_domain directly. > > Yes, and this has been going on for years, without any obvious attempt > to address it at the API level before.. 99% of my time goes to mesa and r/e, so having the argument about dealing w/ cache directly simply wasn't a big enough fire to deal with until now, unfortunately. (Admittedly, there is room here for someone with more bandwidth to take on drm/msm maintainer role.. but someone needs to do it. Sean has been pitching in on the display side more recently, which has been a big help.) > > Because arm/arm64 cache ops are not exported to modules, so currently > > I need to abuse the DMA API for cache operations (really just to clean > > pages if I need to mmap them uncached/writecombine). Originally I was > > doing that w/ dma_{map,unmap}_sg. But to avoid debug splats I > > switched that to dma_sync_sg (see > > 0036bc73ccbe7e600a3468bf8e8879b122252274). But now it seems the > > dma-direct ops are unhappy w/ dma_sync without a dma_map (AFAICT). > > Russell has been very strict about not exporting the cache ops, and all > for the right reasons. Cache maintainance for not dma coherent devices > is hard, and without a proper API that has arch input for even which > calls are used for cache flushing chances of bugs are extremely high. > > I see two proper ways out of this mess: either we actually make msm > use the DMA API, so I'd be curious of what is missing that forces you > to use the low-level iommu API. Or we need to enhance the iommu API > with a similar ownership concepts as the DMA API. Which probably is > a good thing even if we move msm over to the DMA API. There are a few reasons we need to manage the GPU's address space directly, most of which are stalled on some iommu changes, that I wish I had more time to push for. In particular, we need to move to per-context pagetables (so each GL context has it's own GPU address space). Once we get there, we'd also like to enable SVM/SVA so GPU can share address space with the userspace process (which requires stall/resume and iommu fault handler to run in a context that can sleep). Fitting that into the DMA API doesn't really make sense to me. I'm not entirely sure that fitting cache maintenance into the IOMMU makes a huge amount of sense either, since the issue is really about the CPU cache. And I doubt the GPU is going to fit into a nice policy of page ownership. There are a number of cases where both CPU and GPU are accessing the same buffers, and unmapping/remapping to iommu is either not possible (because GPU is actively accessing another part of the buffer) or prohibitive from a performance standpoint. As far as cache maintenance being hard, I'm not really sure I buy that.. at least not for arm64. (And probably not even w/ the limited # of armv7 cores that can be paired with drm/msm.) > > (On some generations of hw, the iommu is attached to the device node > > that maps to the drm device, which is passed to dma_map/dma_sync. On > > other generations the iommu is attached to a sub-device. Changing > > this would break dtb compatibility.. so for now I need to handle both > > iommu-ops and direct-ops cases.) > > Or you always call call on a struct device that has the iommu, that > is match on the generic, and pick a different device. That would in > many ways seem preferably over the current hack, even if that also is > just a horribly band-aid. I suppose picking a device w/ iommu would *mostly* work, except for a2xx (which has no IOMMU, but has it's own internal GPUMMU instead.. so there is no device with iommu ops) We could perhaps, based on compatible, set a flag to use either dma_sync_* or dma_map_* which would avoid using get_dma_ops() (but otherwise doesn't seem like much of an improvement, I guess) BR, -R ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2019-08-03 14:47 UTC | newest] Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-08-02 21:37 please revert "drm/msm: Use the correct dma_sync calls in msm_gem" Christoph Hellwig 2019-08-02 22:06 ` Rob Clark 2019-08-03 6:19 ` Christoph Hellwig 2019-08-03 14:47 ` Rob Clark
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).