On Fri, Jul 5, 2019 at 1:53 AM Gerd Hoffmann wrote:
>
> On Thu, Jul 04, 2019 at 12:17:48PM -0700, Chia-I Wu wrote:
> > On Thu, Jul 4, 2019 at 4:10 AM Gerd Hoffmann wrote:
> > >
> > > Hi,
> > >
> > > > > - r = ttm_bo_reserve(&bo->tbo, true, false, NULL);
> > > > > + r = reservation_object_lock_interruptible(bo->gem_base.resv, NULL);
> > > > Can you elaborate a bit about how TTM keeps the BOs alive in, for
> > > > example, virtio_gpu_transfer_from_host_ioctl? In that function, only
> > > > three TTM functions are called: ttm_bo_reserve, ttm_bo_validate, and
> > > > ttm_bo_unreserve. I am curious how they keep the BO alive.
> > >
> > > It can't go away between reserve and unreserve, and I think it also
> > > can't be evicted then. Havn't checked how ttm implements that.
> > Hm, but the vbuf using the BO outlives the reserve/unreserve section.
> > The NO_EVICT flag applies only when the BO is still alive. Someone
> > needs to hold a reference to the BO to keep it alive, otherwise the BO
> > can go away before the vbuf is retired.
>
> Note that patches 14+15 rework virtio_gpu_transfer_*_ioctl to keep
> gem reference until the command is finished and patch 17 drops
> virtio_gpu_object_{reserve,unreserve} altogether.
>
> Maybe I should try to reorder the series, then squash 6+17 to reduce
> confusion. I suspect that'll cause quite a few conflicts though ...
This may be well-known and may be what you meant by "the fence keeps
the bo alive", but I finally realized that ttm_bo_put delays the
deletion of a BO when it is busy.

In the current design, vbuf does not hold references to its BOs. Nor
do fences. It is possible for a BO to lose all its references and
get virtio_gpu_gem_free_object()ed while it is still busy. The key
is ttm_bo_put.

ttm_bo_put calls ttm_bo_cleanup_refs_or_queue to decide whether to
delete the BO immediately (when the BO is already idle) or to queue
the BO to a delayed delete list (when the BO is still busy). If a BO
is queued to the delayed delete list, ttm_bo_delayed_delete is called
every 10ms (HZ/100 to be exact) to scan through the list and delete
the BOs that have become idle.

I wrote a simple test (attached) and added a bunch of printk's to
confirm this.

Anyway, I believe the culprit is patch 11, where we switch from
ttm_bo_put to drm_gem_shmem_free_object to free a BO whose last
reference is gone. After the switch, the deletion becomes immediate,
even when the BO is still busy. We need to fix vbuf to refcount its
BOs before we can do the switch.

>
> cheers,
> Gerd
>
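
To make the difference between the two free paths concrete, here is a
minimal userspace sketch in plain C. It is not kernel code and not the
attached test: every toy_* name is hypothetical, the busy flag stands in
for an unsignaled fence, and the periodic worker is modeled as a plain
function call. It only illustrates the behavior described above, under
those assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a buffer object; none of these names are kernel APIs. */
struct toy_bo {
	int refcount;        /* references held by handles, vbufs, ... */
	bool busy;           /* true while the device still uses the BO */
	bool freed;          /* set once the backing storage is released */
	struct toy_bo *next; /* link in the delayed-delete list */
};

/* BOs that lost their last reference while still busy. */
static struct toy_bo *toy_delayed_list;

void toy_bo_get(struct toy_bo *bo)
{
	bo->refcount++;
}

/*
 * TTM-like put (assumed behavior): when the last reference goes away,
 * an idle BO is freed at once, a busy BO is queued for later.
 */
void toy_put_delayed(struct toy_bo *bo)
{
	assert(bo->refcount > 0);
	if (--bo->refcount > 0)
		return;
	if (!bo->busy) {
		bo->freed = true;
		return;
	}
	bo->next = toy_delayed_list;
	toy_delayed_list = bo;
}

/*
 * Stand-in for the periodic worker: free every queued BO that has
 * become idle and return how many were freed in this pass.
 */
int toy_delayed_delete(void)
{
	struct toy_bo **pp = &toy_delayed_list;
	int freed = 0;

	while (*pp) {
		struct toy_bo *bo = *pp;

		if (bo->busy) {
			pp = &bo->next; /* still in use, keep it queued */
			continue;
		}
		*pp = bo->next;         /* unlink, then release */
		bo->next = NULL;
		bo->freed = true;
		freed++;
	}
	return freed;
}

/* Immediate put: frees on the last reference, busy or not. */
void toy_put_immediate(struct toy_bo *bo)
{
	assert(bo->refcount > 0);
	if (--bo->refcount == 0)
		bo->freed = true;
}
```

With a busy BO holding a single reference, toy_put_delayed() leaves it
unfreed until busy is cleared and toy_delayed_delete() runs, while
toy_put_immediate() marks it freed while it is still busy. If the vbuf
takes its own reference with toy_bo_get() first, the immediate path is
safe again, because the last put only happens when the vbuf is retired.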