From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A0A28C761A6 for ; Mon, 20 Mar 2023 00:56:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230090AbjCTA4H (ORCPT ); Sun, 19 Mar 2023 20:56:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60610 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229972AbjCTAzi (ORCPT ); Sun, 19 Mar 2023 20:55:38 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2A8B112F27; Sun, 19 Mar 2023 17:54:29 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 7EAD6611C9; Mon, 20 Mar 2023 00:54:07 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 738E2C433EF; Mon, 20 Mar 2023 00:54:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1679273646; bh=obqkoNb62PsRjF2yf7nUVQjnq9ted1J5Oc+NeWU/zgo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=XRjq52pi7yYndNT7Y9bHlBvNY70lNC8wnMSdmfNsby+8DK2i1cvqEjRqEmLt0oLkI mtK+FBT2SHiuGsOIUoT18U8+fIuQUoHlVnJAjxf9OKQCfM8K/gLo0vlHhgQXlB7VbF 97RPtP6z5NIQxqhqEi0ScYWVVjU8yeq73RwmtZFe6FpJrrYu2iPjENiyUP7KJ23JIl 19BPIee0rqACD+M97xLHeaQgr5Q2nHjHoNUqD9gGSa/G/o9nHByNmWLC0nxUH7RJCf 48MyiA8JE0m9Wr2o2yJo7WAxi0mNlxKHR6vp/gsvJTioYYDlSb1vdv8045+DHCO4k6 TUbe5uVEAoiJw== From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: lyndonli , Guchun Chen , =?UTF-8?q?Christian=20K=C3=B6nig?= , Alex Deucher , Sasha Levin , Xinhui.Pan@amd.com, airlied@gmail.com, daniel@ffwll.ch, luben.tuikov@amd.com, Philip.Yang@amd.com, sunpeng.li@amd.com, amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Subject: [PATCH AUTOSEL 6.2 29/30] drm/amdgpu: Fix call trace warning and hang when removing amdgpu device Date: Sun, 19 Mar 2023 20:52:54 -0400 Message-Id: <20230320005258.1428043-29-sashal@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230320005258.1428043-1-sashal@kernel.org> References: <20230320005258.1428043-1-sashal@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 X-stable: review X-Patchwork-Hint: Ignore Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: lyndonli [ Upstream commit 93bb18d2a873d2fa9625c8ea927723660a868b95 ] On GPUs with RAS enabled, below call trace and hang are observed when shutting down device. v2: use DRM device unplugged flag instead of shutdown flag as the check to prevent memory wipe in shutdown stage. [ +0.000000] RIP: 0010:amdgpu_vram_mgr_fini+0x18d/0x1c0 [amdgpu] [ +0.000001] PKRU: 55555554 [ +0.000001] Call Trace: [ +0.000001] [ +0.000002] amdgpu_ttm_fini+0x140/0x1c0 [amdgpu] [ +0.000183] amdgpu_bo_fini+0x27/0xa0 [amdgpu] [ +0.000184] gmc_v11_0_sw_fini+0x2b/0x40 [amdgpu] [ +0.000163] amdgpu_device_fini_sw+0xb6/0x510 [amdgpu] [ +0.000152] amdgpu_driver_release_kms+0x16/0x30 [amdgpu] [ +0.000090] drm_dev_release+0x28/0x50 [drm] [ +0.000016] devm_drm_dev_init_release+0x38/0x60 [drm] [ +0.000011] devm_action_release+0x15/0x20 [ +0.000003] release_nodes+0x40/0xc0 [ +0.000001] devres_release_all+0x9e/0xe0 [ +0.000001] device_unbind_cleanup+0x12/0x80 [ +0.000003] device_release_driver_internal+0xff/0x160 [ +0.000001] driver_detach+0x4a/0x90 [ +0.000001] bus_remove_driver+0x6c/0xf0 [ +0.000001] driver_unregister+0x31/0x50 [ +0.000001] pci_unregister_driver+0x40/0x90 [ +0.000003] amdgpu_exit+0x15/0x120 [amdgpu] Signed-off-by: lyndonli Reviewed-by: Guchun Chen Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 25a68d8888e0d..5d4649b8bfd33 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -1315,7 +1315,7 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object *bo) if (!bo->resource || bo->resource->mem_type != TTM_PL_VRAM || !(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE) || - adev->in_suspend || adev->shutdown) + adev->in_suspend || drm_dev_is_unplugged(adev_to_drm(adev))) return; if (WARN_ON_ONCE(!dma_resv_trylock(bo->base.resv))) -- 2.39.2