From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.1 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5CA73C433F5 for ; Thu, 16 Sep 2021 17:25:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 45C9961206 for ; Thu, 16 Sep 2021 17:25:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348136AbhIPR1R (ORCPT ); Thu, 16 Sep 2021 13:27:17 -0400 Received: from mail.kernel.org ([198.145.29.99]:44344 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1351714AbhIPRT1 (ORCPT ); Thu, 16 Sep 2021 13:19:27 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 11E7960D07; Thu, 16 Sep 2021 16:41:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1631810493; bh=G8dpfV2PElFHwAWhg2+m5dwGxUJR7xrvlAx2BMa7kq0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=bkp+yiYtcnZJKrbVghDu0UCghkuQhVoegOy+jel8oqk0pfFn0j43AHk+EoSPHeKxy r+I8hOLHichUjr6zTG3XM0YBXCmZyK3Qio3ckBQwkHRNHEO8G4wnB1Qp31CbXA79kd NAFMMyrV4/X7bYXNXfPud/Em9mp1gaZz/oJYWoYs= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Andrey Grodzovsky , =?UTF-8?q?Christian=20K=C3=B6nig?= , Sasha Levin Subject: [PATCH 5.14 160/432] drm/ttm: Fix multihop assert on eviction. Date: Thu, 16 Sep 2021 17:58:29 +0200 Message-Id: <20210916155816.166139983@linuxfoundation.org> X-Mailer: git-send-email 2.33.0 In-Reply-To: <20210916155810.813340753@linuxfoundation.org> References: <20210916155810.813340753@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Andrey Grodzovsky [ Upstream commit 403797925768d9fa870f5b1ebcd20016b397083b ] Problem: Under memory pressure when GTT domain is almost full multihop assert will come up when trying to evict LRU BO from VRAM to SYSTEM. Fix: Don't assert on multihop error in evict code but rather do a retry as we do in ttm_bo_move_buffer Signed-off-by: Andrey Grodzovsky Reviewed-by: Christian König Link: https://patchwork.freedesktop.org/patch/msgid/20210622162339.761651-6-andrey.grodzovsky@amd.com Signed-off-by: Sasha Levin --- drivers/gpu/drm/ttm/ttm_bo.c | 63 +++++++++++++++++++----------------- 1 file changed, 34 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 8d7fd65ccced..32202385073a 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -488,6 +488,31 @@ void ttm_bo_unlock_delayed_workqueue(struct ttm_device *bdev, int resched) } EXPORT_SYMBOL(ttm_bo_unlock_delayed_workqueue); +static int ttm_bo_bounce_temp_buffer(struct ttm_buffer_object *bo, + struct ttm_resource **mem, + struct ttm_operation_ctx *ctx, + struct ttm_place *hop) +{ + struct ttm_placement hop_placement; + struct ttm_resource *hop_mem; + int ret; + + hop_placement.num_placement = hop_placement.num_busy_placement = 1; + hop_placement.placement = hop_placement.busy_placement = hop; + + /* find space in the bounce domain */ + ret = ttm_bo_mem_space(bo, &hop_placement, &hop_mem, ctx); + if (ret) + return ret; + /* move to the bounce domain */ + ret = ttm_bo_handle_move_mem(bo, hop_mem, false, ctx, NULL); + if (ret) { + ttm_resource_free(bo, &hop_mem); + return ret; + } + return 0; +} + static int ttm_bo_evict(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx) { @@ -527,12 +552,17 @@ static int ttm_bo_evict(struct ttm_buffer_object *bo, goto out; } +bounce: ret = ttm_bo_handle_move_mem(bo, evict_mem, true, ctx, &hop); - if (unlikely(ret)) { - WARN(ret == -EMULTIHOP, "Unexpected multihop in eviction - likely driver bug\n"); - if (ret != -ERESTARTSYS) + if (ret == -EMULTIHOP) { + ret = ttm_bo_bounce_temp_buffer(bo, &evict_mem, ctx, &hop); + if (ret) { pr_err("Buffer eviction failed\n"); - ttm_resource_free(bo, &evict_mem); + ttm_resource_free(bo, &evict_mem); + goto out; + } + /* try and move to final place now. */ + goto bounce; } out: return ret; @@ -847,31 +877,6 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo, } EXPORT_SYMBOL(ttm_bo_mem_space); -static int ttm_bo_bounce_temp_buffer(struct ttm_buffer_object *bo, - struct ttm_resource **mem, - struct ttm_operation_ctx *ctx, - struct ttm_place *hop) -{ - struct ttm_placement hop_placement; - struct ttm_resource *hop_mem; - int ret; - - hop_placement.num_placement = hop_placement.num_busy_placement = 1; - hop_placement.placement = hop_placement.busy_placement = hop; - - /* find space in the bounce domain */ - ret = ttm_bo_mem_space(bo, &hop_placement, &hop_mem, ctx); - if (ret) - return ret; - /* move to the bounce domain */ - ret = ttm_bo_handle_move_mem(bo, hop_mem, false, ctx, NULL); - if (ret) { - ttm_resource_free(bo, &hop_mem); - return ret; - } - return 0; -} - static int ttm_bo_move_buffer(struct ttm_buffer_object *bo, struct ttm_placement *placement, struct ttm_operation_ctx *ctx) -- 2.30.2