From: Lyude Paul <lyude@redhat.com>
To: dri-devel@lists.freedesktop.org
Cc: "Christian König" <christian.koenig@amd.com>,
	"Dave Airlie" <airlied@redhat.com>,
	"Huang Rui" <ray.huang@amd.com>,
	"David Airlie" <airlied@linux.ie>,
	"Daniel Vetter" <daniel@ffwll.ch>,
	"Andrey Grodzovsky" <andrey.grodzovsky@amd.com>,
	linux-kernel@vger.kernel.org (open list)
Subject: [PATCH] drm/ttm: Remove pinned bos from LRU in ttm_bo_move_to_lru_tail()
Date: Mon, 4 Jan 2021 18:13:58 -0500
Message-ID: <20210104231358.154521-1-lyude@redhat.com>

Recently a regression was introduced which caused TTM's buffer eviction
to attempt to evict already-pinned BOs, causing issues with buffer
eviction under memory pressure along with suspend/resume:

  nouveau 0000:1f:00.0: DRM: evicting buffers...
  nouveau 0000:1f:00.0: DRM: Moving pinned object 00000000c428c3ff!
  nouveau 0000:1f:00.0: fifo: fault 00 [READ] at 0000000000200000 engine 04 [BAR1] client 07 [HUB/HOST_CPU] reason 02 [PTE] on channel -1 [00ffeaa000 unknown]
  nouveau 0000:1f:00.0: fifo: DROPPED_MMU_FAULT 00001000
  nouveau 0000:1f:00.0: fifo: fault 01 [WRITE] at 0000000000020000 engine 0c [HOST6] client 07 [HUB/HOST_CPU] reason 02 [PTE] on channel 1 [00ffb28000 DRM]
  nouveau 0000:1f:00.0: fifo: channel 1: killed
  nouveau 0000:1f:00.0: fifo: runlist 0: scheduled for recovery
  [TTM] Buffer eviction failed
  nouveau 0000:1f:00.0: DRM: waiting for kernel channels to go idle...
  nouveau 0000:1f:00.0: DRM: failed to idle channel 1 [DRM]
  nouveau 0000:1f:00.0: DRM: resuming display...

After some bisection and investigation, it appears this resulted from
the recent changes to ttm_bo_move_to_lru_tail(). Previously, when a
buffer was pinned, it would be removed from the LRU once
ttm_bo_unreserve() was called, to maintain the LRU list when pinning or
unpinning BOs. However, since:

commit 3d1a88e1051f ("drm/ttm: cleanup LRU handling further")

We've been exiting from ttm_bo_move_to_lru_tail() at the very beginning
of the function if the bo we're looking at is pinned, resulting in the
pinned BO never getting removed from the LRU and, as a result, causing
issues when it eventually becomes time for eviction.

So, let's fix this by calling ttm_bo_del_from_lru() from
ttm_bo_move_to_lru_tail() in the event that we're dealing with a pinned
buffer. As well, add back the hunks in ttm_bo_del_from_lru() that were
removed, which checked whether we want to call
bdev->driver->del_from_lru_notify() or not. We do this last part to
avoid calling the hook when the bo in question was already removed from
the LRU.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 3d1a88e1051f ("drm/ttm: cleanup LRU handling further")
Cc: Christian König <christian.koenig@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 31e8b3da5563..0f373b78e7fa 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -113,11 +113,18 @@ static struct kobj_type ttm_bo_glob_kobj_type = {
 static void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
 {
 	struct ttm_bo_device *bdev = bo->bdev;
+	bool notify = false;
 
-	list_del_init(&bo->swap);
-	list_del_init(&bo->lru);
+	if (!list_empty(&bo->swap)) {
+		notify = true;
+		list_del_init(&bo->swap);
+	}
+	if (!list_empty(&bo->lru)) {
+		notify = true;
+		list_del_init(&bo->lru);
+	}
 
-	if (bdev->driver->del_from_lru_notify)
+	if (notify && bdev->driver->del_from_lru_notify)
 		bdev->driver->del_from_lru_notify(bo);
 }
 
@@ -138,8 +145,13 @@ void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo,
 
 	dma_resv_assert_held(bo->base.resv);
 
-	if (bo->pin_count)
+	/* Pinned bos will have been added to the LRU before they were pinned, so make sure we
+	 * always remove them here
+	 */
+	if (bo->pin_count) {
+		ttm_bo_del_from_lru(bo);
 		return;
+	}
 
 	man = ttm_manager_type(bdev, mem->mem_type);
 	list_move_tail(&bo->lru, &man->lru[bo->priority]);
-- 
2.29.2
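For readers who want to poke at the intended behaviour outside the kernel, here is a rough user-space sketch of the two changes in the patch above. It is not TTM code: struct toy_bo, the toy_* functions, the notify_calls counter and the small list helpers are all made up for illustration and only imitate the kernel's list_head semantics. The swap list is left out to keep it short. It models a BO that is put on an LRU, then pinned, and then passed through the "move to LRU tail" path twice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal user-space stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Unlink an entry and leave it pointing at itself, like list_del_init(). */
static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);
}

static void list_add_tail(struct list_head *e, struct list_head *head)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

/* Hypothetical, heavily reduced buffer object (not struct ttm_buffer_object). */
struct toy_bo {
	struct list_head lru;
	unsigned int pin_count;
	unsigned int notify_calls; /* stands in for del_from_lru_notify() calls */
};

/* Only "notify" when the BO was actually on a list, as in the first hunk. */
static void toy_bo_del_from_lru(struct toy_bo *bo)
{
	bool notify = false;

	if (!list_empty(&bo->lru)) {
		notify = true;
		list_del_init(&bo->lru);
	}
	if (notify)
		bo->notify_calls++;
}

/* Pinned BOs are taken off the LRU instead of being skipped, as in the second hunk. */
static void toy_bo_move_to_lru_tail(struct toy_bo *bo, struct list_head *lru)
{
	if (bo->pin_count) {
		toy_bo_del_from_lru(bo);
		return;
	}
	list_del_init(&bo->lru);      /* safe even if the entry is not on a list */
	list_add_tail(&bo->lru, lru); /* together: the effect of list_move_tail() */
}

/* Returns how many times the notify counter fired for one pinned BO. */
static unsigned int toy_demo(void)
{
	struct list_head lru;
	struct toy_bo bo = { .pin_count = 0, .notify_calls = 0 };

	init_list_head(&lru);
	init_list_head(&bo.lru);

	toy_bo_move_to_lru_tail(&bo, &lru); /* unpinned: BO lands on the LRU */
	assert(!list_empty(&lru));

	bo.pin_count = 1;
	toy_bo_move_to_lru_tail(&bo, &lru); /* pinned: removed, counter bumps once */
	toy_bo_move_to_lru_tail(&bo, &lru); /* already off the list: no second bump */

	assert(list_empty(&lru));
	assert(list_empty(&bo.lru));
	return bo.notify_calls;
}
```

Calling toy_demo() from a small main() should leave the toy LRU empty with the counter at 1. Without the list_empty() check in toy_bo_del_from_lru(), the second call on the already-removed BO would bump the counter again, which is the double notification the restored hunk is meant to avoid.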