On 2018-01-26 22:35, Christian König wrote:
> I just realized that a change I have been thinking about for a while
> would solve your problem as well, but keep concurrent allocation
> possible.
>
> See, ttm_mem_evict_first() unlocks the BO after evicting it:
>>         ttm_bo_del_from_lru(bo);
>>         spin_unlock(&glob->lru_lock);
>>
>>         ret = ttm_bo_evict(bo, ctx);
>>         if (locked) {
>>                 ttm_bo_unreserve(bo); <-------- here
>>         } else {
>>                 spin_lock(&glob->lru_lock);
>>                 ttm_bo_add_to_lru(bo);
>>                 spin_unlock(&glob->lru_lock);
>>         }
>>
>>         kref_put(&bo->list_kref, ttm_bo_release_list);
>
> The effect is that in your example process C can not only beat process
> B once, but many, many times, because we run into a ping/pong situation
> where B evicts resources while C moves them back in.

For the ping/pong case, I want to disable busy placement during the
allocation period and only enable it for CS BO validation.

> For a while now I have been thinking about dropping those reservations
> only after the original allocation succeeded.
>
> The effect would be that process C can still beat process B initially,
> but sooner or later process B would evict some resources from process C
> as well, and then it can succeed with its allocation.

If it comes from process C's CS validation, process B can still only
evict the resource after process C's command submission completes.

> The problem is that for this approach to work we need a core change to
> the ww_mutexes to be able to handle this efficiently.

Yes, ww_mutex doesn't support this kind of nested locking as-is;
without a ticket and a lock class it easily deadlocks.
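For reference, here is a rough sketch of the ticket-and-class pattern I
mean. This is only my illustration of the core ww_mutex API — the
bo_ww_class lock class and the reserve_pair() helper are made up, and
none of this is from the attached patch:

        #include <linux/kernel.h>       /* swap() */
        #include <linux/ww_mutex.h>

        static DEFINE_WW_CLASS(bo_ww_class);    /* hypothetical lock class */

        /* Take two reservations without deadlocking: the acquire context
         * is the "ticket" that orders us against concurrent acquirers of
         * the same class. */
        static void reserve_pair(struct ww_mutex *a, struct ww_mutex *b,
                                 struct ww_acquire_ctx *ctx)
        {
                ww_acquire_init(ctx, &bo_ww_class);     /* hands out the ticket */

                /* The first lock in a fresh context cannot hit -EDEADLK. */
                ww_mutex_lock(a, ctx);
                while (ww_mutex_lock(b, ctx) == -EDEADLK) {
                        /* A context with an older ticket needs one of our
                         * locks: back off, sleep on the contended lock,
                         * then retry the remaining one. */
                        ww_mutex_unlock(a);
                        ww_mutex_lock_slow(b, ctx);
                        swap(a, b);
                }
                ww_acquire_done(ctx);   /* we will take no further locks */
        }

The caller eventually releases everything with ww_mutex_unlock() on both
mutexes and ww_acquire_fini() on the context. Keeping the evicted BOs
reserved until the allocation succeeds would mean holding such a context
open across the whole eviction loop, which is why it needs core
ww_mutex changes to be efficient.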
So I think preventing validation on the same place is a simpler way:
process B's BO placement is fpfn~lpfn, so B only tries to evict LRU BOs
in that range. While it is evicting, we just block other validations
that target this range (fpfn~lpfn); if a request falls outside the
range, its allocation/validation can still proceed, as the sketch below
shows.
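The check itself is just an interval-intersection test. A minimal
standalone sketch (pfn_range and ranges_overlap are made up for
illustration; the real list walk under man->wait_lock is in the patch
quoted below, and TTM's convention that lpfn == 0 means "no upper
limit" is ignored here for simplicity):

        #include <stdbool.h>
        #include <stdint.h>

        struct pfn_range {
                uint32_t fpfn;  /* first page frame number of the placement */
                uint32_t lpfn;  /* last page frame number, inclusive */
        };

        /* A request collides with a registered waiter when the two closed
         * intervals [fpfn, lpfn] intersect.  This single test also covers
         * the case where one range completely contains the other. */
        static bool ranges_overlap(const struct pfn_range *req,
                                   const struct pfn_range *waiter)
        {
                return req->fpfn <= waiter->lpfn && waiter->fpfn <= req->lpfn;
        }

An allocation whose range overlaps an earlier waiter would back off
instead of grabbing the space that waiter just freed through eviction;
everything outside the range proceeds untouched.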
Any negatives?

Regards,
David Zhou

>
> Regards,
> Christian.
>
> On 26.01.2018 14:59, Christian König wrote:
>> I know, but this has the same effect. You prevent concurrent
>> allocation from happening.
>>
>> What we could do is pipeline the reuse of deleted memory as well;
>> that makes it less likely to cause the problem you are seeing,
>> because the evicting processes no longer need to block for deleted
>> BOs.
>>
>> But that other processes can grab memory during eviction is
>> intentional. Otherwise greedy processes would completely dominate
>> command submission.
>>
>> Regards,
>> Christian.
>>
>> On 26.01.2018 14:50, Zhou, David(ChunMing) wrote:
>>> I don't want to prevent all of it; my new approach is to prevent a
>>> later allocation from getting ahead of an earlier one and taking
>>> the memory space that the earlier one freed through eviction.
>>>
>>> Sent from Smartisan Pro
>>>
>>> On 2018-01-26 21:24, Christian König wrote:
>>>
>>> Yes, exactly that's the problem.
>>>
>>> See, when you want to prevent a process B from allocating the
>>> memory process A has evicted, you need to prevent all concurrent
>>> allocation.
>>>
>>> And we don't do that because it causes a major performance drop.
>>>
>>> Regards,
>>> Christian.
>>>
>>> On 26.01.2018 14:21, Zhou, David(ChunMing) wrote:
>>>> Your patch will prevent concurrent allocation, and will result in
>>>> a big drop in allocation performance.
>>>>
>>>> Sent from Smartisan Pro
>>>>
>>>> On 2018-01-26 21:04, Christian König wrote:
>>>>
>>>> Attached is what you actually want to do, cleanly implemented. But
>>>> as I said, this is a NO-GO.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> On 26.01.2018 13:43, Christian König wrote:
>>>>>> After my investigation, this issue should be a defect of the TTM
>>>>>> design itself, which breaks scheduling balance.
>>>>> Yeah, but again: this is intended design that we can't change
>>>>> easily.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> On 26.01.2018 13:36, Zhou, David(ChunMing) wrote:
>>>>>> I am off work, so I am replying by phone; the format may not be
>>>>>> plain text.
>>>>>>
>>>>>> Back to the topic itself:
>>>>>> the problem indeed happens on the amdgpu driver; someone
>>>>>> reported to me that when an application runs with two instances,
>>>>>> their performance differs.
>>>>>> I also reproduced the issue with a unit test (bo_eviction_test).
>>>>>> They always think our scheduler isn't working as expected.
>>>>>>
>>>>>> After my investigation, this issue should be a defect of the TTM
>>>>>> design itself, which breaks scheduling balance.
>>>>>>
>>>>>> Further, if we run containers on our GPU, container A could get
>>>>>> a high score while container B gets a low score on the same
>>>>>> benchmark.
>>>>>>
>>>>>> So this is a bug that we need to fix.
>>>>>>
>>>>>> Regards,
>>>>>> David Zhou
>>>>>>
>>>>>> Sent from Smartisan Pro
>>>>>>
>>>>>> On 2018-01-26 18:31, Christian König wrote:
>>>>>>
>>>>>> On 26.01.2018 11:22, Chunming Zhou wrote:
>>>>>> > there is a scheduling balance issue around get_node, like:
>>>>>> > a. process A allocates full memory and uses it for submission.
>>>>>> > b. process B tries to allocate memory and waits in eviction
>>>>>> > for process A's BO to become idle.
>>>>>> > c. process A completes its job and process B's eviction puts
>>>>>> > process A's BO node, but in the meantime process C comes in to
>>>>>> > allocate a BO, gets a node directly, succeeds and submits;
>>>>>> > process B again waits for process C's BO to become idle.
>>>>>> > d. with the above steps repeating, process B can be delayed
>>>>>> > much more.
>>>>>> >
>>>>>> > A later allocation must not get ahead of an earlier one for
>>>>>> > the same place.
>>>>>>
>>>>>> Again, NAK to the whole approach.
>>>>>>
>>>>>> At least with amdgpu the problem you described above never
>>>>>> occurs, because evictions are pipelined operations. We could
>>>>>> only block for deleted regions to become free.
>>>>>>
>>>>>> But independent of that, incoming memory requests while we make
>>>>>> room through eviction are intended to be served first.
>>>>>>
>>>>>> Changing that is certainly a no-go because it would favor
>>>>>> memory-hungry applications over small clients.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>> >
>>>>>> > Change-Id: I3daa892e50f82226c552cc008a29e55894a98f18
>>>>>> > Signed-off-by: Chunming Zhou
>>>>>> > ---
>>>>>> >   drivers/gpu/drm/ttm/ttm_bo.c    | 69 +++++++++++++++++++++++++++++++++++++++--
>>>>>> >   include/drm/ttm/ttm_bo_api.h    |  7 +++++
>>>>>> >   include/drm/ttm/ttm_bo_driver.h |  7 +++++
>>>>>> >   3 files changed, 80 insertions(+), 3 deletions(-)
>>>>>> >
>>>>>> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>> > index d33a6bb742a1..558ec2cf465d 100644
>>>>>> > --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>> > @@ -841,6 +841,58 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
>>>>>> >        return 0;
>>>>>> >   }
>>>>>> >
>>>>>> > +static void ttm_man_init_waiter(struct ttm_bo_waiter *waiter,
>>>>>> > +                             struct ttm_buffer_object *bo,
>>>>>> > +                             const struct ttm_place *place)
>>>>>> > +{
>>>>>> > +     waiter->tbo = bo;
>>>>>> > +     memcpy((void *)&waiter->place, (void *)place, sizeof(*place));
>>>>>> > +     INIT_LIST_HEAD(&waiter->list);
>>>>>> > +}
>>>>>> > +
>>>>>> > +static void ttm_man_add_waiter(struct ttm_mem_type_manager *man,
>>>>>> > +                            struct ttm_bo_waiter *waiter)
>>>>>> > +{
>>>>>> > +     if (!waiter)
>>>>>> > +             return;
>>>>>> > +     spin_lock(&man->wait_lock);
>>>>>> > +     list_add_tail(&waiter->list, &man->waiter_list);
>>>>>> > +     spin_unlock(&man->wait_lock);
>>>>>> > +}
>>>>>> > +
>>>>>> > +static void ttm_man_del_waiter(struct ttm_mem_type_manager *man,
>>>>>> > +                            struct ttm_bo_waiter *waiter)
>>>>>> > +{
>>>>>> > +     if (!waiter)
>>>>>> > +             return;
>>>>>> > +     spin_lock(&man->wait_lock);
>>>>>> > +     if (!list_empty(&waiter->list))
>>>>>> > +             list_del(&waiter->list);
>>>>>> > +     spin_unlock(&man->wait_lock);
>>>>>> > +     kfree(waiter);
>>>>>> > +}
>>>>>> > +
>>>>>> > +int ttm_man_check_bo(struct ttm_mem_type_manager *man,
>>>>>> > +                  struct ttm_buffer_object *bo,
>>>>>> > +                  const struct ttm_place *place)
>>>>>> > +{
>>>>>> > +     struct ttm_bo_waiter *waiter, *tmp;
>>>>>> > +
>>>>>> > +     spin_lock(&man->wait_lock);
>>>>>> > +     list_for_each_entry_safe(waiter, tmp, &man->waiter_list, list) {
>>>>>> > +             if ((bo != waiter->tbo) &&
>>>>>> > +                 ((place->fpfn >= waiter->place.fpfn &&
>>>>>> > +                   place->fpfn <= waiter->place.lpfn) ||
>>>>>> > +                  (place->lpfn <= waiter->place.lpfn &&
>>>>>> > +                   place->lpfn >= waiter->place.fpfn)))
>>>>>> > +                     goto later_bo;
>>>>>> > +     }
>>>>>> > +     spin_unlock(&man->wait_lock);
>>>>>> > +     return true;
>>>>>> > +later_bo:
>>>>>> > +     spin_unlock(&man->wait_lock);
>>>>>> > +     return false;
>>>>>> > +}
>>>>>> >   /**
>>>>>> >    * Repeatedly evict memory from the LRU for @mem_type until we create enough
>>>>>> >    * space, or we've evicted everything and there isn't enough space.
>>>>>> > @@ -853,17 +905,26 @@ static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
>>>>>> >   {
>>>>>> >        struct ttm_bo_device *bdev = bo->bdev;
>>>>>> >        struct ttm_mem_type_manager *man = &bdev->man[mem_type];
>>>>>> > +     struct ttm_bo_waiter waiter;
>>>>>> >        int ret;
>>>>>> >
>>>>>> > +     ttm_man_init_waiter(&waiter, bo, place);
>>>>>> > +     ttm_man_add_waiter(man, &waiter);
>>>>>> >        do {
>>>>>> >                ret = (*man->func->get_node)(man, bo, place, mem);
>>>>>> > -             if (unlikely(ret != 0))
>>>>>> > +             if (unlikely(ret != 0)) {
>>>>>> > +                     ttm_man_del_waiter(man, &waiter);
>>>>>> >                        return ret;
>>>>>> > -             if (mem->mm_node)
>>>>>> > +             }
>>>>>> > +             if (mem->mm_node) {
>>>>>> > +                     ttm_man_del_waiter(man, &waiter);
>>>>>> >                        break;
>>>>>> > +             }
>>>>>> >                ret = ttm_mem_evict_first(bdev, mem_type, place, ctx);
>>>>>> > -             if (unlikely(ret != 0))
>>>>>> > +             if (unlikely(ret != 0)) {
>>>>>> > +                     ttm_man_del_waiter(man, &waiter);
>>>>>> >                        return ret;
>>>>>> > +             }
>>>>>> >        } while (1);
>>>>>> >        mem->mem_type = mem_type;
>>>>>> >        return ttm_bo_add_move_fence(bo, man, mem);
>>>>>> > @@ -1450,6 +1511,8 @@ int ttm_bo_init_mm(struct ttm_bo_device *bdev, unsigned type,
>>>>>> >        man->use_io_reserve_lru = false;
>>>>>> >        mutex_init(&man->io_reserve_mutex);
>>>>>> >        spin_lock_init(&man->move_lock);
>>>>>> > +     spin_lock_init(&man->wait_lock);
>>>>>> > +     INIT_LIST_HEAD(&man->waiter_list);
>>>>>> >        INIT_LIST_HEAD(&man->io_reserve_lru);
>>>>>> >
>>>>>> >        ret = bdev->driver->init_mem_type(bdev, type, man);
>>>>>> > diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
>>>>>> > index 2cd025c2abe7..0fce4dbd02e7 100644
>>>>>> > --- a/include/drm/ttm/ttm_bo_api.h
>>>>>> > +++ b/include/drm/ttm/ttm_bo_api.h
>>>>>> > @@ -40,6 +40,7 @@
>>>>>> >   #include
>>>>>> >   #include
>>>>>> >   #include
>>>>>> > +#include
>>>>>> >
>>>>>> >   struct ttm_bo_device;
>>>>>> >
>>>>>> > @@ -232,6 +233,12 @@ struct ttm_buffer_object {
>>>>>> >        struct mutex wu_mutex;
>>>>>> >   };
>>>>>> >
>>>>>> > +struct ttm_bo_waiter {
>>>>>> > +     struct ttm_buffer_object *tbo;
>>>>>> > +     struct ttm_place place;
>>>>>> > +     struct list_head list;
>>>>>> > +};
>>>>>> > +
>>>>>> >   /**
>>>>>> >    * struct ttm_bo_kmap_obj
>>>>>> >    *
>>>>>> > diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
>>>>>> > index 9b417eb2df20..dc6b8b4c9e06 100644
>>>>>> > --- a/include/drm/ttm/ttm_bo_driver.h
>>>>>> > +++ b/include/drm/ttm/ttm_bo_driver.h
>>>>>> > @@ -293,6 +293,10 @@ struct ttm_mem_type_manager {
>>>>>> >        bool io_reserve_fastpath;
>>>>>> >        spinlock_t move_lock;
>>>>>> >
>>>>>> > +     /* waiters in list */
>>>>>> > +     spinlock_t wait_lock;
>>>>>> > +     struct list_head waiter_list;
>>>>>> > +
>>>>>> >        /*
>>>>>> >         * Protected by @io_reserve_mutex:
>>>>>> >         */
>>>>>> > @@ -748,6 +752,9 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
>>>>>> >                     struct ttm_mem_reg *mem,
>>>>>> >                     struct ttm_operation_ctx *ctx);
>>>>>> >
>>>>>> > +int ttm_man_check_bo(struct ttm_mem_type_manager *man,
>>>>>> > +                  struct ttm_buffer_object *bo,
>>>>>> > +                  const struct ttm_place *place);
>>>>>> >   void ttm_bo_mem_put(struct ttm_buffer_object *bo, struct ttm_mem_reg *mem);
>>>>>> >   void ttm_bo_mem_put_locked(struct ttm_buffer_object *bo,
>>>>>> >                           struct ttm_mem_reg *mem);