On 2/16/21 4:39 AM, Nicolas Dufresne wrote:
> On Monday, 15 February 2021 at 09:58 +0100, Christian König wrote:
>> Hi guys,
>>
>> we are currently working on Freesync and direct scan out from system
>> memory on AMD APUs in A+A laptops.
>>
>> One problem we stumbled over is that our display hardware needs to scan
>> out from uncached system memory and we currently don't have a way to
>> communicate that through DMA-buf.
>>
>> For our specific use case at hand we are going to implement something
>> driver specific, but the question is should we have something more
>> generic for this?
>
> Hopefully I'm getting this right, but this makes me think of a long standing
> issue I've met with Intel DRM and UVC driver. If I let the UVC driver allocate
> the buffer, and import the resulting DMABuf (cacheable memory written with a cpu
> copy in the kernel) into DRM, we can see cache artifacts being displayed. While
> if I use the DRM driver memory (dumb buffer in that case) it's clean because
> there is a driver specific solution to that.
>
> There is no obvious way for a userspace application to know what is the
> right/wrong way, and in fact it feels like the kernel could solve this somehow
> without having to inform userspace (perhaps).
>
>>
>> After all the system memory access pattern is a PCIe extension and as
>> such something generic.
>>
>> Regards,
>> Christian.
>
>

Hi All,

We also encountered the UVC cache issue on an ARMv8 CPU in a Mediatek SoC when
exporting dmabufs from UVC and feeding them to the DRM display with the
following GStreamer command:

# gst-launch-1.0 v4l2src device=/dev/video0 io-mode=dmabuf ! kmssink

The UVC driver uses videobuf2-vmalloc to allocate buffers and is able to export
them as dmabufs. But UVC fills the frame buffer with a CPU memcpy() and does not
flush the cache afterwards. So if the display hardware reads the buffer
directly, the image shown on the screen has cache artifacts.

Here are some experiments:

1. Doing some other memory operations (e.g. devmem) while streaming from UVC
   mitigates the issue. I guess the cache lines get evicted more quickly.
2. Replacing memcpy() with memcpy_flushcache() in the UVC driver makes the
   issue disappear.
3. Adding a .finish callback in videobuf2-vmalloc.c that flushes the cache
   before returning the buffer makes the issue disappear.

A cache flush stage seems to be missing in either UVC or the display driver. We
may also need some communication between the producer and the consumer, so that
they can decide who is responsible for the flush. That would avoid flushing the
cache unconditionally, which would hurt performance.

Regards,
Andy Hsieh
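
To make the producer/consumer idea above a bit more concrete, here is a rough
userspace model of how the two sides might decide who flushes. This is only a
sketch: every name in it (producer_write, consumer_read, buffer_share,
decide_flush_owner) is hypothetical and is not an existing kernel or DMA-buf
interface. It assumes each side can describe how it touches the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: how the exporter (producer) fills the buffer. */
enum producer_write {
    PRODUCER_WRITES_DEVICE_DMA,  /* a device writes memory directly */
    PRODUCER_WRITES_CPU_CACHED,  /* CPU memcpy() via a cacheable mapping */
    PRODUCER_WRITES_CPU_FLUSHED, /* CPU copy that already writes back,
                                    e.g. memcpy_flushcache() */
};

/* Hypothetical: how the importer (consumer) reads the buffer. */
enum consumer_read {
    CONSUMER_SNOOPS_CPU_CACHE,   /* reads are coherent with the CPU cache */
    CONSUMER_READS_MEMORY_ONLY,  /* bypasses the CPU cache, e.g. scan-out */
};

enum flush_owner {
    FLUSH_NOT_NEEDED,
    FLUSH_BY_PRODUCER,
    FLUSH_BY_CONSUMER,
};

/* Hypothetical description of one producer/consumer pairing. */
struct buffer_share {
    enum producer_write write_path;
    enum consumer_read read_path;
    bool producer_can_flush; /* e.g. exporter has a hook like .finish */
};

/*
 * Decide who must flush before the consumer reads.
 * A flush is only required when data may still sit in the CPU cache
 * and the consumer cannot see it there.
 */
enum flush_owner decide_flush_owner(const struct buffer_share *s)
{
    assert(s != NULL);

    if (s->write_path != PRODUCER_WRITES_CPU_CACHED)
        return FLUSH_NOT_NEEDED;
    if (s->read_path == CONSUMER_SNOOPS_CPU_CACHE)
        return FLUSH_NOT_NEEDED;

    return s->producer_can_flush ? FLUSH_BY_PRODUCER : FLUSH_BY_CONSUMER;
}
```

In this model the UVC-to-display case (cached CPU copy, consumer that bypasses
the cache) is the only combination that pays for a flush; a coherent consumer
or a producer that already writes back costs nothing extra.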