> how about just adding a wrapper for pin function as below? I considered this as well and don't think it will work reliable. We could use it as a band aid for this specific problem, but in general we need to improve the handling in TTM to resolve those kind of resource conflicts. Regards, Christian. Am 23.04.19 um 17:09 schrieb Zhou, David(ChunMing): > >3. If we have a ticket we grab a reference to the first BO on the > LRU, drop the LRU lock and try to grab the reservation lock with the > ticket. > > The BO on LRU is already locked by cs user, can it be dropped here by > DC user? and then DC user grab its lock with ticket, how does CS grab > it again? > > If you think waiting in ttm has this risk, how about just adding a > wrapper for pin function as below? > amdgpu_get_pin_bo_timeout() > { > do { > amdgpo_bo_reserve(); > r=amdgpu_bo_pin(); > > if(!r) >         break; > amdgpu_bo_unreserve(); > timeout--; > > } while(timeout>0); > > } > > -------- Original Message -------- > Subject: Re: [PATCH] ttm: wait mem space if user allow while gpu busy > From: Christian König > To: "Zhou, David(ChunMing)" ,"Koenig, Christian" ,"Liang, Prike" > ,dri-devel@lists.freedesktop.org > CC: > > Well that's not so easy of hand. > > The basic problem here is that when you busy wait at this place you > can easily run into situations where application A busy waits for B > while B busy waits for A -> deadlock. > > So what we need here is the deadlock detection logic of the ww_mutex. > To use this we at least need to do the following steps: > > 1. Reserve the BO in DC using a ww_mutex ticket (trivial). > > 2. If we then run into this EBUSY condition in TTM check if the BO we > need memory for (or rather the ww_mutex of its reservation object) has > a ticket assigned. > > 3. If we have a ticket we grab a reference to the first BO on the LRU, > drop the LRU lock and try to grab the reservation lock with the ticket. > > 4. If getting the reservation lock with the ticket succeeded we check > if the BO is still the first one on the LRU in question (the BO could > have moved). > > 5. If the BO is still the first one on the LRU in question we try to > evict it as we would evict any other BO. > > 6. If any of the "If's" above fail we just back off and return -EBUSY. > > Steps 2-5 are certainly not trivial, but doable as far as I can see. > > Have fun :) > Christian. > > Am 23.04.19 um 15:19 schrieb Zhou, David(ChunMing): >> How about adding more condition ctx->resv inline to address your >> concern? As well as don't wait from same user, shouldn't lead to >> deadlock. >> >> Otherwise, any other idea? >> >> -------- Original Message -------- >> Subject: Re: [PATCH] ttm: wait mem space if user allow while gpu busy >> From: Christian König >> To: "Liang, Prike" ,"Zhou, David(ChunMing)" >> ,dri-devel@lists.freedesktop.org >> CC: >> >> Well that is certainly a NAK because it can lead to deadlock in the >> memory management. >> >> You can't just busy wait with all those locks held. >> >> Regards, >> Christian. >> >> Am 23.04.19 um 03:45 schrieb Liang, Prike: >> > Acked-by: Prike Liang >> > >> > Thanks, >> > Prike >> > -----Original Message----- >> > From: Chunming Zhou >> > Sent: Monday, April 22, 2019 6:39 PM >> > To: dri-devel@lists.freedesktop.org >> > Cc: Liang, Prike ; Zhou, David(ChunMing) >> >> > Subject: [PATCH] ttm: wait mem space if user allow while gpu busy >> > >> > heavy gpu job could occupy memory long time, which could lead to >> other user fail to get memory. >> > >> > Change-Id: I0b322d98cd76e5ac32b00462bbae8008d76c5e11 >> > Signed-off-by: Chunming Zhou >> > --- >> >   drivers/gpu/drm/ttm/ttm_bo.c | 6 ++++-- >> >   1 file changed, 4 insertions(+), 2 deletions(-) >> > >> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c >> b/drivers/gpu/drm/ttm/ttm_bo.c index 7c484729f9b2..6c596cc24bec 100644 >> > --- a/drivers/gpu/drm/ttm/ttm_bo.c >> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c >> > @@ -830,8 +830,10 @@ static int ttm_bo_mem_force_space(struct >> ttm_buffer_object *bo, >> >                if (mem->mm_node) >> >                        break; >> >                ret = ttm_mem_evict_first(bdev, mem_type, place, ctx); >> > -             if (unlikely(ret != 0)) >> > -                     return ret; >> > +             if (unlikely(ret != 0)) { >> > +                     if (!ctx || ctx->no_wait_gpu || ret != -EBUSY) >> > +                             return ret; >> > +             } >> >        } while (1); >> >        mem->mem_type = mem_type; >> >        return ttm_bo_add_move_fence(bo, man, mem); >> > -- >> > 2.17.1 >> > >> > _______________________________________________ >> > dri-devel mailing list >> > dri-devel@lists.freedesktop.org >> > https://lists.freedesktop.org/mailman/listinfo/dri-devel >> >> >> _______________________________________________ >> dri-devel mailing list >> dri-devel@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/dri-devel > > > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel