From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <1491555922.3493.18.camel@linux.intel.com>
Subject: Re: [Intel-gfx] [PATCH 1/5] i915: avoid kernel hang caused by synchronize rcu struct_mutex deadlock
From: Joonas Lahtinen
To: Andrea Arcangeli, Martin Kepplinger, Thorsten Leemhuis, daniel.vetter@intel.com, Dave Airlie, Chris Wilson
Cc: intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org
Date: Fri, 07 Apr 2017 12:05:22 +0300
In-Reply-To: <20170406232347.988-2-aarcange@redhat.com>
References: <87pogtplxr.fsf@intel.com> <20170406232347.988-1-aarcange@redhat.com> <20170406232347.988-2-aarcange@redhat.com>
Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2017-04-07 at 01:23 +0200, Andrea Arcangeli wrote:
> synchronize_rcu/synchronize_sched/synchronize_rcu_expedited() will
> hang until its own workqueues are run. The i915 gem workqueues will
> wait on the struct_mutex to be released. So we cannot wait for a
> quiescent state using those rcu primitives while holding the
> struct_mutex or it creates a circular lock dependency resulting in
> kernel hangs (which is reproducible but goes undetected by lockdep).
>
> This started in commit 3d3d18f086cdda72ee18a454db70ca72c6e3246c and
> lockdep didn't detect it apparently.

The right format is:

Fixes: 3d3d18f086cd ("drm/i915: Avoid rcu_barrier() from reclaim paths (shrinker)")

> @@ -324,6 +320,16 @@ i915_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
>  	if (unlock)
>  		mutex_unlock(&dev->struct_mutex);
>
> +	if (likely(__mutex_owner(&dev->struct_mutex) != current))

This check can be dropped and synchronize_rcu_expedited() should be
embedded directly in the if (unlock) branch, as it's functionally
equivalent. This can be applied to all the unlock cases, not just this
one. That should be the correct action to avoid the deadlock. I've sent
a patch to do this (Cc'd you); can you verify that it gets rid of the
problem for you?

> +		/*
> +		 * If reclaim was invoked by an allocation done while
> +		 * holding the struct mutex, we cannot call
> +		 * synchronize_rcu_expedited() as it depends on
> +		 * workqueues to run but the running workqueue may be
> +		 * blocked waiting on us to release struct_mutex.
> +		 */
> +		synchronize_rcu_expedited();
> +
>  	return freed;
>  }

--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
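
For readers following the thread, the restructuring suggested in the reply (drop the __mutex_owner() check and call synchronize_rcu_expedited() inside the if (unlock) branch) can be sketched roughly as below. This is a userspace model of the control flow only, not the patch that was sent. The struct mutex, mutex_unlock() and synchronize_rcu_expedited() definitions are hypothetical stand-ins for the kernel primitives, shrinker_scan_tail() and rcu_sync_calls are illustrative names, and placing the RCU wait after the unlock is an assumption taken from the ordering in the quoted hunk.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins for kernel primitives (assumptions,
 * not the real kernel definitions). */
struct mutex {
	bool locked;
};

/* Counts how many times the stand-in RCU wait was invoked. */
static int rcu_sync_calls;

static void mutex_unlock(struct mutex *m)
{
	m->locked = false;
}

static void synchronize_rcu_expedited(void)
{
	rcu_sync_calls++;
}

/*
 * Illustrative model of the tail of the shrinker scan path.
 * 'unlock' is true only when the shrinker itself took struct_mutex;
 * it is false when reclaim was entered from an allocation made by a
 * caller that already holds struct_mutex.
 */
static unsigned long shrinker_scan_tail(struct mutex *struct_mutex,
					bool unlock, unsigned long freed)
{
	if (unlock) {
		mutex_unlock(struct_mutex);
		/*
		 * struct_mutex is no longer held by this path, so waiting
		 * here cannot block workers that need struct_mutex.
		 */
		synchronize_rcu_expedited();
	}

	/* When 'unlock' is false the caller still holds struct_mutex,
	 * so the RCU wait is skipped entirely. */
	return freed;
}
```

In this model the RCU wait runs exactly when the shrinker released the lock it took itself, which is the case the __mutex_owner() comparison in the quoted hunk was trying to detect.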