From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.2 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 63E2CC433B4 for ; Fri, 21 May 2021 08:43:57 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2EED661057 for ; Fri, 21 May 2021 08:43:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2EED661057 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=dri-devel-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 394FF6F5DF; Fri, 21 May 2021 08:43:56 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 286DB6F5DF; Fri, 21 May 2021 08:43:25 +0000 (UTC) IronPort-SDR: GjFIal3RKED61xO2PW3wQbg/y+jOIErsN169o2vpjZXpqOu/oDLBwv031WnXUyp7FaC1ARPE1V 1/X8KUL6losw== X-IronPort-AV: E=McAfee;i="6200,9189,9990"; a="181049937" X-IronPort-AV: E=Sophos;i="5.82,313,1613462400"; d="scan'208";a="181049937" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 May 2021 01:43:21 -0700 IronPort-SDR: zLfAr141lXNZBA/w7rTJ2TQCGV20LC2IqHA4hFEDSrFBf4yKmNjeUnUnTBgpOtsT8D0GzSQHxn vCA/i8IfldkA== X-IronPort-AV: E=Sophos;i="5.82,313,1613462400"; d="scan'208";a="440847238" Received: from imarinmo-mobl1.ger.corp.intel.com (HELO [10.249.254.34]) ([10.249.254.34]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 May 2021 01:43:19 -0700 Subject: Re: [RFC PATCH 4/5] drm/ttm: Document and optimize ttm_bo_pipeline_gutting() To: =?UTF-8?Q?Christian_K=c3=b6nig?= , intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org References: <20210520150947.803891-1-thomas.hellstrom@linux.intel.com> <20210520150947.803891-5-thomas.hellstrom@linux.intel.com> <96dd844a-6ef5-7502-7bb8-22ec21b9c15b@amd.com> From: =?UTF-8?Q?Thomas_Hellstr=c3=b6m?= Message-ID: Date: Fri, 21 May 2021 10:43:17 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.10.1 MIME-Version: 1.0 In-Reply-To: <96dd844a-6ef5-7502-7bb8-22ec21b9c15b@amd.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" On 5/21/21 10:21 AM, Christian König wrote: > Am 20.05.21 um 17:09 schrieb Thomas Hellström: >> If the bo is idle when calling ttm_bo_pipeline_gutting(), we >> unnecessarily >> create a ghost object and push it out to delayed destroy. >> Fix this by adding a path for idle, and document the function. >> >> Also avoid having the bo end up in a bad state vulnerable to user-space >> triggered kernel BUGs if the call to ttm_tt_create() fails. >> >> Finally reuse ttm_bo_pipeline_gutting() in ttm_bo_evict(). >> >> Cc: Christian König >> Signed-off-by: Thomas Hellström >> --- >>   drivers/gpu/drm/ttm/ttm_bo.c      | 20 +++++----- >>   drivers/gpu/drm/ttm/ttm_bo_util.c | 63 ++++++++++++++++++++++++------- >>   drivers/gpu/drm/ttm/ttm_tt.c      |  5 +++ >>   include/drm/ttm/ttm_tt.h          | 10 +++++ >>   4 files changed, 75 insertions(+), 23 deletions(-) >> >> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c >> index ca1b098b6a56..a8fa3375b8aa 100644 >> --- a/drivers/gpu/drm/ttm/ttm_bo.c >> +++ b/drivers/gpu/drm/ttm/ttm_bo.c >> @@ -501,10 +501,15 @@ static int ttm_bo_evict(struct >> ttm_buffer_object *bo, >>       bdev->funcs->evict_flags(bo, &placement); >>         if (!placement.num_placement && !placement.num_busy_placement) { >> -        ttm_bo_wait(bo, false, false); >> +        ret = ttm_bo_wait(bo, true, false); >> +        if (ret) >> +            return ret; >>   -        ttm_bo_cleanup_memtype_use(bo); >> -        return ttm_tt_create(bo, false); >> +        /* >> +         * Since we've already synced, this frees backing store >> +         * immediately. >> +         */ >> +        return ttm_bo_pipeline_gutting(bo); > > Yeah, we tried to avoid pipeline_gutting here because of eviction. But > I think when you wait before that should work. > >>       } >>         ret = ttm_bo_mem_space(bo, &placement, &evict_mem, ctx); >> @@ -974,13 +979,8 @@ int ttm_bo_validate(struct ttm_buffer_object *bo, >>       /* >>        * Remove the backing store if no placement is given. >>        */ >> -    if (!placement->num_placement && !placement->num_busy_placement) { >> -        ret = ttm_bo_pipeline_gutting(bo); >> -        if (ret) >> -            return ret; >> - >> -        return ttm_tt_create(bo, false); >> -    } >> +    if (!placement->num_placement && !placement->num_busy_placement) >> +        return ttm_bo_pipeline_gutting(bo); >>         /* >>        * Check whether we need to move buffer. >> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c >> b/drivers/gpu/drm/ttm/ttm_bo_util.c >> index 919ee03f7eb3..1860e2e7563f 100644 >> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c >> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c >> @@ -479,7 +479,8 @@ static void ttm_transfered_destroy(struct >> ttm_buffer_object *bo) >>    */ >>     static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo, >> -                      struct ttm_buffer_object **new_obj) >> +                      struct ttm_buffer_object **new_obj, >> +                      bool realloc_tt) >>   { >>       struct ttm_transfer_obj *fbo; >>       int ret; >> @@ -493,6 +494,17 @@ static int ttm_buffer_object_transfer(struct >> ttm_buffer_object *bo, >>       ttm_bo_get(bo); >>       fbo->bo = bo; >>   +    if (realloc_tt) { >> +        bo->ttm = NULL; >> +        ret = ttm_tt_create(bo, true); >> +        if (ret) { >> +            bo->ttm = fbo->base.ttm; >> +            kfree(fbo); >> +            ttm_bo_put(bo); >> +            return ret; >> +        } >> +    } >> + > > Can't we keep that logic in the caller? I think that would be cleaner. Indeed, let me see if we can do that without breaking anything. > >>       /** >>        * Fix up members that we shouldn't copy directly: >>        * TODO: Explicit member copy would probably be better here. >> @@ -763,7 +775,7 @@ static int ttm_bo_move_to_ghost(struct >> ttm_buffer_object *bo, >>       dma_fence_put(bo->moving); >>       bo->moving = dma_fence_get(fence); >>   -    ret = ttm_buffer_object_transfer(bo, &ghost_obj); >> +    ret = ttm_buffer_object_transfer(bo, &ghost_obj, false); >>       if (ret) >>           return ret; >>   @@ -836,26 +848,51 @@ int ttm_bo_move_accel_cleanup(struct >> ttm_buffer_object *bo, >>   } >>   EXPORT_SYMBOL(ttm_bo_move_accel_cleanup); >>   +/** >> + * ttm_bo_pipeline_gutting - purge the contents of a bo >> + * @bo: The buffer object >> + * >> + * Purge the contents of a bo, async if the bo is not idle. >> + * After a successful call, the bo is left unpopulated in >> + * system placement. The function may wait uninterruptible >> + * for idle on OOM. >> + * >> + * Return: 0 if successful, negative error code on failure. >> + */ >>   int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo) >>   { >>       static const struct ttm_place sys_mem = { .mem_type = >> TTM_PL_SYSTEM }; >>       struct ttm_buffer_object *ghost; >>       int ret; >>   -    ret = ttm_buffer_object_transfer(bo, &ghost); >> -    if (ret) >> -        return ret; >> +    /* If already idle, no need for ghost object dance. */ >> +    ret = ttm_bo_wait(bo, false, true); >> +    if (ret == -EBUSY) { >> +        ret = ttm_buffer_object_transfer(bo, &ghost, true); >> +        if (ret) >> +            return ret; > > When this is a shortcout to avoid work we should rather use the > inverse notation. > > In other words something like that: > > if (ret != -EBUSY) { >     ttm_resource_free(bo, &bo->mem); >     ttm_resource_alloc(bo, &sys_mem, &bo->mem); >     ttm_tt_create()... >     return ret; > } OK. > >>   -    ret = dma_resv_copy_fences(&ghost->base._resv, bo->base.resv); >> -    /* Last resort, wait for the BO to be idle when we are OOM */ >> -    if (ret) >> -        ttm_bo_wait(bo, false, false); >> +        ret = dma_resv_copy_fences(&ghost->base._resv, bo->base.resv); >> +        /* Last resort, wait for the BO to be idle when we are OOM */ >> +        if (ret) >> +            ttm_bo_wait(bo, false, false); >>   -    ttm_resource_alloc(bo, &sys_mem, &bo->mem); >> -    bo->ttm = NULL; >> +        dma_resv_unlock(&ghost->base._resv); >> +        ttm_bo_put(ghost); >> +    } else { >> +        if (!bo->ttm) { >> +            ret = ttm_tt_create(bo, true); >> +            if (ret) >> +                return ret; >> +        } else { >> +            ttm_tt_unpopulate(bo->bdev, bo->ttm); >> +            if (bo->type == ttm_bo_type_device) >> +                ttm_tt_mark_for_clear(bo->ttm); > > That's not legal, you can't unpopulate it when the BO is busy. > > Instead the TT object must be destroyed with the ghost and a new one > created. We've already verified that the bo is idle here, so we should be fine. /Thomas > > Christian. > >> +        } >> +        ttm_resource_free(bo, &bo->mem); >> +    } >>   -    dma_resv_unlock(&ghost->base._resv); >> -    ttm_bo_put(ghost); >> +    ttm_resource_alloc(bo, &sys_mem, &bo->mem); >>         return 0; >>   } >> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c >> index 539e0232cb3b..0b1053e93db2 100644 >> --- a/drivers/gpu/drm/ttm/ttm_tt.c >> +++ b/drivers/gpu/drm/ttm/ttm_tt.c >> @@ -134,6 +134,11 @@ void ttm_tt_destroy_common(struct ttm_device >> *bdev, struct ttm_tt *ttm) >>   } >>   EXPORT_SYMBOL(ttm_tt_destroy_common); >>   +void ttm_tt_mark_for_clear(struct ttm_tt *ttm) >> +{ >> +    ttm->page_flags |= TTM_PAGE_FLAG_ZERO_ALLOC; >> +} >> + >>   void ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt *ttm) >>   { >>       bdev->funcs->ttm_tt_destroy(bdev, ttm); >> diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h >> index 134d09ef7766..91552c83ac79 100644 >> --- a/include/drm/ttm/ttm_tt.h >> +++ b/include/drm/ttm/ttm_tt.h >> @@ -157,6 +157,16 @@ int ttm_tt_populate(struct ttm_device *bdev, >> struct ttm_tt *ttm, struct ttm_oper >>    */ >>   void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm); >>   +/** >> + * ttm_tt_mark_for_clear - Mark pages for clearing on populate. >> + * >> + * @ttm: Pointer to the ttm_tt structure >> + * >> + * Marks pages for clearing so that the next time the page vector is >> + * populated, the pages will be cleared. >> + */ >> +void ttm_tt_mark_for_clear(struct ttm_tt *ttm); >> + >>   void ttm_tt_mgr_init(unsigned long num_pages, unsigned long >> num_dma32_pages); >>     #if IS_ENABLED(CONFIG_AGP) >