From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.lore.kernel.org (Postfix) with ESMTPS id ABDD6C4332F
 for ; Mon, 5 Dec 2022 19:58:38 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
 by gabe.freedesktop.org (Postfix) with ESMTP id BB0AF10E044;
 Mon, 5 Dec 2022 19:58:32 +0000 (UTC)
Received: from mail-ej1-x636.google.com (mail-ej1-x636.google.com [IPv6:2a00:1450:4864:20::636])
 by gabe.freedesktop.org (Postfix) with ESMTPS id 9B55B10E044;
 Mon, 5 Dec 2022 19:58:28 +0000 (UTC)
Received: by mail-ej1-x636.google.com with SMTP id bj12so1044334ejb.13;
 Mon, 05 Dec 2022 11:58:28 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20210112;
 h=content-transfer-encoding:in-reply-to:from:references:cc:to
 :content-language:subject:user-agent:mime-version:date:message-id
 :from:to:cc:subject:date:message-id:reply-to;
 bh=fUBYd+/OIOUoGkMQoiRVXH5I+1PP1uRFia3yCcoalZ4=;
 b=XXrwY0YM3B7+X4pSxHygI+I1cq4Yw9xOXvPZGgoFmB8o+AOsZ5H8iTrKiEbQksUgFS
 K3lGp/fw9q3VRdhlvQn/ykYLcnp6AVOxK67OMAPfy9TwMu9AnHqZ5eSGhpzEMyxA8+yc
 J7RwnFBq2+JIscct+2UcHchG/pgC5G/utrP2gf1ELGrACG/xi5vAUQiAyVcto0pb6NLj
 fqBWbONKBxnqRTdMb6M+7YmgAYZI2iwrmQpjf5zoFtbZpuU9nYBqT1coztCi4Rveh1uK
 Y5H5KYKH9ewMRd+liTbt2BIJt9czH+iafqtbKAnN5TOVFcqIq2+JrZYP3WIfCN9friRd
 VMLg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=content-transfer-encoding:in-reply-to:from:references:cc:to
 :content-language:subject:user-agent:mime-version:date:message-id
 :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
 bh=fUBYd+/OIOUoGkMQoiRVXH5I+1PP1uRFia3yCcoalZ4=;
 b=MoF2d6BiEvWjWR170FiJy768/wjQQ94qindoB4gjC87XAeMur27j0XEBsgpvpurpBK
 r/P7Oe2WyOqMSL6l6gEQMXPdT1BcYlf0x/H93lb5Ygs97nnr+Bfv3eFnVwZUBQMC+N/P
 0PTMgJjPsfZRuagNDxb3NXBEhTbHUorsZQZ/TlXEuEdA9qbr31gSAAPaCgSwC828ztdY
 1nN/7q9DfF9j2v1MGpQLWbyF68a3lX6h0rLlFnLqH72baVog+6i1722crn3P2c7tA5UA
 FB6XC8shXiWvnkjPaGLNg7TN6Nx8KsqAnqDmgI9Q+FSfk3EIBCtvjDaFJsRXko0p1lj3
 5yDA==
X-Gm-Message-State: ANoB5pmOLG2U0mR0K0W41JEV1KUcYAjjhsNqVNrXvoV0jsoJXK0H4pGN
 fN0TNPYhGV7/6Cc+ERKeXs7ARyHe8k8=
X-Google-Smtp-Source: AA0mqf5dfmLWWpz8o4OinUDoOSU7iY32aAcmG20SvPQfankEuNet4/4vs/Inkm9eWS0M7bGWOrI94Q==
X-Received: by 2002:a17:906:b34a:b0:755:6595:cd34 with SMTP id cd10-20020a170906b34a00b007556595cd34mr58857884ejb.70.1670270307068;
 Mon, 05 Dec 2022 11:58:27 -0800 (PST)
Received: from ?IPV6:2a02:908:1256:79a0:7d4e:4122:f56a:39c2? ([2a02:908:1256:79a0:7d4e:4122:f56a:39c2])
 by smtp.gmail.com with ESMTPSA id e15-20020a170906c00f00b0078c213ad441sm6620351ejz.101.2022.12.05.11.58.25
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Mon, 05 Dec 2022 11:58:26 -0800 (PST)
Message-ID: <4514ca57-e39e-d684-3101-fddf57b0c89a@gmail.com>
Date: Mon, 5 Dec 2022 20:58:25 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2
Subject: Re: [Intel-gfx] [PATCH 7/9] drm/i915: stop using ttm_bo_wait
Content-Language: en-US
To: Daniel Vetter, Tvrtko Ursulin
References: <20221125102137.1801-1-christian.koenig@amd.com> <20221125102137.1801-7-christian.koenig@amd.com>
From: =?UTF-8?Q?Christian_K=c3=b6nig?=
In-Reply-To:
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-BeenThere: dri-devel@lists.freedesktop.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Direct Rendering Infrastructure - Development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Intel Graphics Development, Matthew Auld, dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org, Matthew Auld
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"

On 30.11.22 at 15:06, Daniel Vetter wrote:
> On Wed, 30 Nov 2022 at 14:03, Tvrtko Ursulin wrote:
>> On 29/11/2022 18:05, Matthew Auld wrote:
>>> On Fri, 25 Nov 2022 at 11:14, Tvrtko Ursulin wrote:
>>>>
>>>> + Matt
>>>>
>>>> On 25/11/2022 10:21, Christian König wrote:
>>>>> TTM is just wrapping core DMA functionality here, remove the mid-layer.
>>>>> No functional change.
>>>>>
>>>>> Signed-off-by: Christian König
>>>>> ---
>>>>>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 9 ++++++---
>>>>>  1 file changed, 6 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>>> index 5247d88b3c13..d409a77449a3 100644
>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>>>> @@ -599,13 +599,16 @@ i915_ttm_resource_get_st(struct drm_i915_gem_object *obj,
>>>>>  static int i915_ttm_truncate(struct drm_i915_gem_object *obj)
>>>>>  {
>>>>>  	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
>>>>> -	int err;
>>>>> +	long err;
>>>>>
>>>>>  	WARN_ON_ONCE(obj->mm.madv == I915_MADV_WILLNEED);
>>>>>
>>>>> -	err = ttm_bo_wait(bo, true, false);
>>>>> -	if (err)
>>>>> +	err = dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_BOOKKEEP,
>>>>> +				    true, 15 * HZ);
>>>> This 15 second stuck out a bit for me and then on a slightly deeper look
>>>> it seems this timeout will "leak" into a few of i915 code paths. If we
>>>> look at the difference between the legacy shmem and ttm backend I am not
>>>> sure if the legacy one is blocking or not - but if it can block I don't
>>>> think it would have an arbitrary timeout like this. Matt your thoughts?
>>> Not sure what is meant by leak here, but the legacy shmem must also
>>> wait/block when unbinding each VMA, before calling truncate. It's the
>> By "leak" I meant if 15s timeout propagates into some code paths visible
>> from userspace which with a legacy backend instead have an indefinite
>> wait. If we have that it's probably not very good to have this
>> inconsistency, or to apply an arbitrary timeout to those path to start with.
>>
>>> same story for the ttm backend, except slightly more complicated in
>>> that there might be no currently bound VMA, and yet the GPU could
>>> still be accessing the pages due to async unbinds, kernel moves etc,
>>> which the wait here (and in i915_ttm_shrink) is meant to protect
>>> against. If the wait times out it should just fail gracefully. I guess
>>> we could just use MAX_SCHEDULE_TIMEOUT here? Not sure if it really
>>> matters though.
>> Right, depends if it can leak or not to userspace and diverge between
>> backends.
> Generally lock_timeout() is a design bug. It's either
> lock_interruptible (or maybe lock_killable) or try_lock, but
> lock_timeout is just duct-tape. I haven't dug in to figure out what
> should be here, but it smells fishy.

Independent of this discussion could I get an rb for removing
ttm_bo_wait() from i915?

Exactly hiding this timeout inside TTM is what always made me quite
nervous here.

Regards,
Christian.

> -Daniel
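A note on the `int err` to `long err` change in the quoted diff: `dma_resv_wait_timeout()` returns a `long`, and its sign separates three outcomes (a negative errno, 0 when the timeout expired, and a positive remaining timeout in jiffies on success). The sketch below is a plain userspace C model of that convention. It is not kernel code and not the code from the patch. The helper name `wait_result_to_errno()` and the choice of `-EBUSY` for the timeout case are assumptions made for illustration.

```c
#include <errno.h>

/*
 * Hypothetical helper (illustration only, userspace model).
 *
 * Folds a dma_resv_wait_timeout()-style result into a plain 0 / -errno code:
 *   ret <  0  the wait failed or was interrupted; ret is a negative errno
 *   ret == 0  the timeout expired before all fences signaled
 *   ret >  0  all fences signaled; ret is the remaining timeout in jiffies
 */
static int wait_result_to_errno(long ret)
{
	if (ret < 0)
		return (int)ret;	/* pass the error through unchanged */
	if (ret == 0)
		return -EBUSY;		/* assumed mapping: still busy, fail gracefully */
	return 0;			/* success; the remaining time is not needed */
}
```

With a finite timeout such as the `15 * HZ` in the diff, the zero branch is the one that could become visible to callers, which is the "leak" discussed above. With `MAX_SCHEDULE_TIMEOUT`, as Matthew suggests, that branch should not be reached in practice, because the wait then ends only on completion or on an error such as a signal.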