From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Andrey Grodzovsky, Christian König, Emily Deng, Sasha Levin,
 dri-devel@lists.freedesktop.org
Subject: [PATCH AUTOSEL 5.4 085/330] drm/scheduler: Avoid accessing freed bad job.
Date: Thu, 17 Sep 2020 21:57:05 -0400
Message-Id: <20200918020110.2063155-85-sashal@kernel.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20200918020110.2063155-1-sashal@kernel.org>
References: <20200918020110.2063155-1-sashal@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-stable: review
X-Mailing-List: linux-kernel@vger.kernel.org

From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

[ Upstream commit 135517d3565b48f4def3b1b82008bc17eb5d1c90 ]

Problem:
Due to a race between drm_sched_cleanup_jobs in the sched thread and
drm_sched_job_timedout in the timeout work, there is a possibility that
the bad job was already freed while still being accessed from the
timeout thread.

Fix:
Instead of just peeking at the bad job in the mirror list, remove it
from the list under lock and put it back later, when we are guaranteed
that no race with the main sched thread is possible, which is after the
thread is parked.

v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.

v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
drm_sched_get_cleanup_job already has a lock there.

v4: Fix comments to reflect latest code in drm-misc.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Tested-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/342356
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/scheduler/sched_main.c | 27 ++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 30c5ddd6d081c..134e9106ebac1 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
 	unsigned long flags;
 
 	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
+
+	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
+	spin_lock_irqsave(&sched->job_list_lock, flags);
 	job = list_first_entry_or_null(&sched->ring_mirror_list,
 				       struct drm_sched_job, node);
 
 	if (job) {
+		/*
+		 * Remove the bad job so it cannot be freed by concurrent
+		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
+		 * is parked at which point it's safe.
+		 */
+		list_del_init(&job->node);
+		spin_unlock_irqrestore(&sched->job_list_lock, flags);
+
 		job->sched->ops->timedout_job(job);
 
 		/*
@@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
 			job->sched->ops->free_job(job);
 			sched->free_guilty = false;
 		}
+	} else {
+		spin_unlock_irqrestore(&sched->job_list_lock, flags);
 	}
 
 	spin_lock_irqsave(&sched->job_list_lock, flags);
@@ -369,6 +382,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 
 	kthread_park(sched->thread);
 
+	/*
+	 * Reinsert back the bad job here - now it's safe as
+	 * drm_sched_get_cleanup_job cannot race against us and release the
+	 * bad job at this point - we parked (waited for) any in progress
+	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
+	 * now until the scheduler thread is unparked.
+	 */
+	if (bad && bad->sched == sched)
+		/*
+		 * Add at the head of the queue to reflect it was the earliest
+		 * job extracted.
+		 */
+		list_add(&bad->node, &sched->ring_mirror_list);
+
 	/*
 	 * Iterate the job list from later to  earlier one and either deactive
 	 * their HW callbacks or remove them from mirror list if they already
-- 
2.25.1
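
The idea behind the fix in this patch (detach the bad job from the list
under the lock, then put it back at the head once the cleanup side can no
longer run) can be illustrated outside the kernel. What follows is only a
minimal userspace sketch, not the scheduler code: struct job, struct sched
and every function name in it are hypothetical stand-ins, a pthread mutex
plays the role of job_list_lock, and cleanup_first() frees whatever job is
oldest on the list, whereas the real cleanup path only releases finished
jobs.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct drm_sched_job; not the kernel type. */
struct job {
	int id;
	struct job *prev, *next;	/* circular doubly linked list links */
};

/* Hypothetical stand-in for the scheduler: a list head plus its lock. */
struct sched {
	pthread_mutex_t lock;		/* plays the role of job_list_lock */
	struct job head;		/* sentinel, like ring_mirror_list */
};

static void sched_init(struct sched *s)
{
	pthread_mutex_init(&s->lock, NULL);
	s->head.prev = s->head.next = &s->head;
}

/* Append a new job at the tail; returns NULL on allocation failure. */
static struct job *job_submit(struct sched *s, int id)
{
	struct job *j = malloc(sizeof(*j));

	if (!j)
		return NULL;
	j->id = id;
	pthread_mutex_lock(&s->lock);
	j->prev = s->head.prev;
	j->next = &s->head;
	s->head.prev->next = j;
	s->head.prev = j;
	pthread_mutex_unlock(&s->lock);
	return j;
}

/* Timeout path: detach the oldest job under the lock instead of peeking. */
static struct job *detach_first(struct sched *s)
{
	struct job *j = NULL;

	pthread_mutex_lock(&s->lock);
	if (s->head.next != &s->head) {
		j = s->head.next;
		j->prev->next = j->next;
		j->next->prev = j->prev;
		j->prev = j->next = j;	/* self-linked, as list_del_init() does */
	}
	pthread_mutex_unlock(&s->lock);
	return j;
}

/* Cleanup path: free the oldest job still on the list; returns its id or -1. */
static int cleanup_first(struct sched *s)
{
	struct job *j = detach_first(s);
	int id = -1;

	if (j) {
		id = j->id;
		free(j);
	}
	return id;
}

/*
 * Put the bad job back at the head of the list. Call this only once the
 * cleanup side is known to be stopped. The sketch still takes the lock,
 * which is an extra precaution of this example.
 */
static void reinsert_head(struct sched *s, struct job *bad)
{
	assert(bad->next == bad);	/* must have been detached first */
	pthread_mutex_lock(&s->lock);
	bad->next = s->head.next;
	bad->prev = &s->head;
	s->head.next->prev = bad;
	s->head.next = bad;
	pthread_mutex_unlock(&s->lock);
}
```

With two jobs submitted, a detach_first() call hands the first job to the
caller, and a subsequent cleanup_first() can only reach and free the second
one. The detached job stays valid until reinsert_head() makes it visible
again, which mirrors why the patch reinserts the job only after
kthread_park().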