From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B80D9C433F5 for ; Thu, 9 Dec 2021 16:58:27 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 50B2810E16D; Thu, 9 Dec 2021 16:53:48 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4DC7789F45; Thu, 9 Dec 2021 13:25:25 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10192"; a="324355905" X-IronPort-AV: E=Sophos;i="5.88,192,1635231600"; d="scan'208";a="324355905" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2021 05:25:24 -0800 X-IronPort-AV: E=Sophos;i="5.88,192,1635231600"; d="scan'208";a="503486596" Received: from shrehore-mobl2.ger.corp.intel.com (HELO [10.252.51.183]) ([10.252.51.183]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2021 05:25:22 -0800 Message-ID: Date: Thu, 9 Dec 2021 14:25:19 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0 Thunderbird/91.3.2 Content-Language: en-US To: Matthew Auld References: <20211129134735.628712-1-maarten.lankhorst@linux.intel.com> <20211129134735.628712-13-maarten.lankhorst@linux.intel.com> From: Maarten Lankhorst In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Subject: Re: [Intel-gfx] [PATCH v2 12/16] drm/i915: Add i915_vma_unbind_unlocked, and take obj lock for i915_vma_unbind X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel Graphics Development , ML dri-devel Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" On 09-12-2021 14:05, Matthew Auld wrote: > On Mon, 29 Nov 2021 at 13:58, Maarten Lankhorst > wrote: >> We want to remove more members of i915_vma, which requires the locking to be >> held more often. >> >> Start requiring gem object lock for i915_vma_unbind, as it's one of the >> callers that may unpin pages. >> >> Some special care is needed when evicting, because the last reference to the >> object may be held by the VMA, so after __i915_vma_unbind, vma may be garbage, >> and we need to cache vma->obj before unlocking. >> >> Signed-off-by: Maarten Lankhorst >> --- > > >> @@ -129,22 +129,47 @@ void i915_ggtt_suspend_vm(struct i915_address_space *vm) >> >> drm_WARN_ON(&vm->i915->drm, !vm->is_ggtt && !vm->is_dpt); >> >> +retry: >> + i915_gem_drain_freed_objects(vm->i915); >> + >> mutex_lock(&vm->mutex); >> >> /* Skip rewriting PTE on VMA unbind. */ >> open = atomic_xchg(&vm->open, 0); >> >> list_for_each_entry_safe(vma, vn, &vm->bound_list, vm_link) { >> + struct drm_i915_gem_object *obj = vma->obj; >> + >> GEM_BUG_ON(!drm_mm_node_allocated(&vma->node)); >> + >> i915_vma_wait_for_bind(vma); >> >> - if (i915_vma_is_pinned(vma)) >> + if (i915_vma_is_pinned(vma) || !i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND)) >> continue; >> >> - if (!i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND)) { >> - __i915_vma_evict(vma); >> - drm_mm_remove_node(&vma->node); >> + /* unlikely to race when GPU is idle, so no worry about slowpath.. */ >> + if (!i915_gem_object_trylock(obj, NULL)) { >> + atomic_set(&vm->open, open); > Does this need a comment about barriers? Not sure, it's guarded by vm->mutex. >> + >> + i915_gem_object_get(obj); > Should this not be kref_get_unless_zero? Assuming the vm->mutex is the > only thing keeping the object alive here, won't this lead to potential > uaf/double-free or something? Also should we not plonk this before the > trylock? Or maybe I'm missing something here? Normally you're correct, this is normally the case, but we drain freed objects and this path should only be run during s/r, at which point userspace should be dead, GPU idle, and we just drained all freed objects above. It would be a bug if we still found a dead object, as nothing should be running. >> + mutex_unlock(&vm->mutex); >> + >> + i915_gem_object_lock(obj, NULL); >> + open = i915_vma_unbind(vma); >> + i915_gem_object_unlock(obj); >> + >> + GEM_WARN_ON(open); >> + >> + i915_gem_object_put(obj); >> + goto retry; >> } >> + >> + i915_vma_wait_for_bind(vma); > We also call wait_for_bind above, is that intentional? Should be harmless, but first one should probably be removed. :) From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B22AAC433EF for ; Thu, 9 Dec 2021 17:01:01 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 87E8910EBB7; Thu, 9 Dec 2021 16:54:33 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4DC7789F45; Thu, 9 Dec 2021 13:25:25 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10192"; a="324355905" X-IronPort-AV: E=Sophos;i="5.88,192,1635231600"; d="scan'208";a="324355905" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2021 05:25:24 -0800 X-IronPort-AV: E=Sophos;i="5.88,192,1635231600"; d="scan'208";a="503486596" Received: from shrehore-mobl2.ger.corp.intel.com (HELO [10.252.51.183]) ([10.252.51.183]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Dec 2021 05:25:22 -0800 Message-ID: Date: Thu, 9 Dec 2021 14:25:19 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0 Thunderbird/91.3.2 Subject: Re: [PATCH v2 12/16] drm/i915: Add i915_vma_unbind_unlocked, and take obj lock for i915_vma_unbind Content-Language: en-US To: Matthew Auld References: <20211129134735.628712-1-maarten.lankhorst@linux.intel.com> <20211129134735.628712-13-maarten.lankhorst@linux.intel.com> From: Maarten Lankhorst In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel Graphics Development , ML dri-devel Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" On 09-12-2021 14:05, Matthew Auld wrote: > On Mon, 29 Nov 2021 at 13:58, Maarten Lankhorst > wrote: >> We want to remove more members of i915_vma, which requires the locking to be >> held more often. >> >> Start requiring gem object lock for i915_vma_unbind, as it's one of the >> callers that may unpin pages. >> >> Some special care is needed when evicting, because the last reference to the >> object may be held by the VMA, so after __i915_vma_unbind, vma may be garbage, >> and we need to cache vma->obj before unlocking. >> >> Signed-off-by: Maarten Lankhorst >> --- > > >> @@ -129,22 +129,47 @@ void i915_ggtt_suspend_vm(struct i915_address_space *vm) >> >> drm_WARN_ON(&vm->i915->drm, !vm->is_ggtt && !vm->is_dpt); >> >> +retry: >> + i915_gem_drain_freed_objects(vm->i915); >> + >> mutex_lock(&vm->mutex); >> >> /* Skip rewriting PTE on VMA unbind. */ >> open = atomic_xchg(&vm->open, 0); >> >> list_for_each_entry_safe(vma, vn, &vm->bound_list, vm_link) { >> + struct drm_i915_gem_object *obj = vma->obj; >> + >> GEM_BUG_ON(!drm_mm_node_allocated(&vma->node)); >> + >> i915_vma_wait_for_bind(vma); >> >> - if (i915_vma_is_pinned(vma)) >> + if (i915_vma_is_pinned(vma) || !i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND)) >> continue; >> >> - if (!i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND)) { >> - __i915_vma_evict(vma); >> - drm_mm_remove_node(&vma->node); >> + /* unlikely to race when GPU is idle, so no worry about slowpath.. */ >> + if (!i915_gem_object_trylock(obj, NULL)) { >> + atomic_set(&vm->open, open); > Does this need a comment about barriers? Not sure, it's guarded by vm->mutex. >> + >> + i915_gem_object_get(obj); > Should this not be kref_get_unless_zero? Assuming the vm->mutex is the > only thing keeping the object alive here, won't this lead to potential > uaf/double-free or something? Also should we not plonk this before the > trylock? Or maybe I'm missing something here? Normally you're correct, this is normally the case, but we drain freed objects and this path should only be run during s/r, at which point userspace should be dead, GPU idle, and we just drained all freed objects above. It would be a bug if we still found a dead object, as nothing should be running. >> + mutex_unlock(&vm->mutex); >> + >> + i915_gem_object_lock(obj, NULL); >> + open = i915_vma_unbind(vma); >> + i915_gem_object_unlock(obj); >> + >> + GEM_WARN_ON(open); >> + >> + i915_gem_object_put(obj); >> + goto retry; >> } >> + >> + i915_vma_wait_for_bind(vma); > We also call wait_for_bind above, is that intentional? Should be harmless, but first one should probably be removed. :)