From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, HK_RANDOM_FROM,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE, SPF_PASS,URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95781C7618F for ; Tue, 16 Jul 2019 15:25:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7510E21743 for ; Tue, 16 Jul 2019 15:25:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387880AbfGPPZY (ORCPT ); Tue, 16 Jul 2019 11:25:24 -0400 Received: from mga05.intel.com ([192.55.52.43]:1675 "EHLO mga05.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728421AbfGPPZY (ORCPT ); Tue, 16 Jul 2019 11:25:24 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 16 Jul 2019 08:25:25 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.63,498,1557212400"; d="scan'208";a="158167369" Received: from esulliva-mobl.ger.corp.intel.com (HELO [10.251.94.109]) ([10.251.94.109]) by orsmga007.jf.intel.com with ESMTP; 16 Jul 2019 08:25:22 -0700 Subject: Re: [Intel-gfx] [PATCH 1/5] drm/i915/userptr: Beware recursive lock_page() To: Chris Wilson , intel-gfx@lists.freedesktop.org Cc: stable@vger.kernel.org References: <20190716124931.5870-1-chris@chris-wilson.co.uk> From: Tvrtko Ursulin Organization: Intel Corporation UK Plc Message-ID: Date: Tue, 16 Jul 2019 16:25:22 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.2 MIME-Version: 1.0 In-Reply-To: <20190716124931.5870-1-chris@chris-wilson.co.uk> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org On 16/07/2019 13:49, Chris Wilson wrote: > Following a try_to_unmap() we may want to remove the userptr and so call > put_pages(). However, try_to_unmap() acquires the page lock and so we > must avoid recursively locking the pages ourselves -- which means that > we cannot safely acquire the lock around set_page_dirty(). Since we > can't be sure of the lock, we have to risk skip dirtying the page, or > else risk calling set_page_dirty() without a lock and so risk fs > corruption. So if trylock randomly fail we get data corruption in whatever data set application is working on, which is what the original patch was trying to avoid? Are we able to detect the backing store type so at least we don't risk skipping set_page_dirty with anonymous/shmemfs? Regards, Tvrtko > Reported-by: Lionel Landwerlin > Fixes: cb6d7c7dc7ff ("drm/i915/userptr: Acquire the page lock around set_page_dirty()") > Signed-off-by: Chris Wilson > Cc: Lionel Landwerlin > Cc: Tvrtko Ursulin > Cc: stable@vger.kernel.org > --- > drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 16 ++++++++++++++-- > 1 file changed, 14 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c > index b9d2bb15e4a6..1ad2047a6dbd 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c > @@ -672,7 +672,7 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj, > obj->mm.dirty = false; > > for_each_sgt_page(page, sgt_iter, pages) { > - if (obj->mm.dirty) > + if (obj->mm.dirty && trylock_page(page)) { > /* > * As this may not be anonymous memory (e.g. shmem) > * but exist on a real mapping, we have to lock > @@ -680,8 +680,20 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj, > * the page reference is not sufficient to > * prevent the inode from being truncated. > * Play safe and take the lock. > + * > + * However...! > + * > + * The mmu-notifier can be invalidated for a > + * migrate_page, that is alreadying holding the lock > + * on the page. Such a try_to_unmap() will result > + * in us calling put_pages() and so recursively try > + * to lock the page. We avoid that deadlock with > + * a trylock_page() and in exchange we risk missing > + * some page dirtying. > */ > - set_page_dirty_lock(page); > + set_page_dirty(page); > + unlock_page(page); > + } > > mark_page_accessed(page); > put_page(page); >