From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752860AbdLGU5D (ORCPT ); Thu, 7 Dec 2017 15:57:03 -0500 Received: from mail-wm0-f53.google.com ([74.125.82.53]:39474 "EHLO mail-wm0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750779AbdLGU5B (ORCPT ); Thu, 7 Dec 2017 15:57:01 -0500 X-Google-Smtp-Source: AGs4zMbxKda0OnI4o5K+QPF78Udvnk2mDOUadgb/3nXigO2qOjo3aNKbAoEcCs5ljk8fLpnqUiKOdw== Date: Thu, 7 Dec 2017 21:56:57 +0100 From: Daniel Vetter To: Peter Zijlstra Cc: LKML , DRI Development , Intel Graphics Development , Tvrtko Ursulin , Marta Lofstedt , Byungchul Park , Ingo Molnar , Tejun Heo , Kees Cook , Thomas Gleixner , Shaohua Li , Andrew Morton , Jens Axboe , Greg Kroah-Hartman , Jonathan Corbet , Oleg Nesterov , Daniel Vetter Subject: Re: [Intel-gfx] [PATCH] kthread: finer-grained lockdep/cross-release completion Message-ID: <20171207205657.wkt6xj3e2opegqeq@phenom.ffwll.local> Mail-Followup-To: Peter Zijlstra , LKML , DRI Development , Intel Graphics Development , Tvrtko Ursulin , Marta Lofstedt , Byungchul Park , Ingo Molnar , Tejun Heo , Kees Cook , Thomas Gleixner , Shaohua Li , Andrew Morton , Jens Axboe , Greg Kroah-Hartman , Jonathan Corbet , Oleg Nesterov , Daniel Vetter References: <20171207100849.407-1-daniel.vetter@ffwll.ch> <20171207122255.zi5ishny24k66ik7@hirez.programming.kicks-ass.net> <20171207145828.z52kjbrrlcl75m7m@phenom.ffwll.local> <20171207195709.cxcil57boc6czdot@hirez.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20171207195709.cxcil57boc6czdot@hirez.programming.kicks-ass.net> X-Operating-System: Linux phenom 4.13.0-1-amd64 User-Agent: NeoMutt/20170609 (1.8.3) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Dec 07, 2017 at 08:57:09PM +0100, Peter Zijlstra wrote: > On Thu, Dec 07, 2017 at 03:58:28PM +0100, Daniel Vetter wrote: > > > [ 85.069417] gem_exec_captur/2810 is trying to acquire lock: > > [ 85.069419] ((completion)&self->parked){+.+.}, at: [] kthread_park+0x3d/0x50 > > [ 85.069426] > > but task is already holding lock: > > [ 85.069428] (&dev->struct_mutex){+.+.}, at: [] i915_reset_device+0x1bd/0x230 [i915] > > [ 85.069448] > > which lock already depends on the new lock. > > > > [ 85.069451] > > the existing dependency chain (in reverse order) is: > > [ 85.069454] > > -> #3 (&dev->struct_mutex){+.+.}: > > [ 85.069460] __mutex_lock+0x81/0x9b0 > > [ 85.069481] i915_mutex_lock_interruptible+0x47/0x130 [i915] > > [ 85.069502] i915_gem_fault+0x201/0x760 [i915] > > [ 85.069507] __do_fault+0x15/0x70 > > [ 85.069509] __handle_mm_fault+0x7bf/0xda0 > > [ 85.069512] handle_mm_fault+0x14f/0x2f0 > > [ 85.069515] __do_page_fault+0x2d1/0x560 > > [ 85.069518] page_fault+0x22/0x30 > > [ 85.069520] > > -> #2 (&mm->mmap_sem){++++}: > > [ 85.069525] __might_fault+0x63/0x90 > > [ 85.069529] _copy_to_user+0x1e/0x70 > > [ 85.069532] perf_read+0x21d/0x290 > > [ 85.069534] __vfs_read+0x1e/0x120 > > [ 85.069536] vfs_read+0xa1/0x150 > > [ 85.069539] SyS_read+0x40/0xa0 > > [ 85.069541] entry_SYSCALL_64_fastpath+0x1c/0x89 > > > -> #0 ((completion)&self->parked){+.+.}: > > > [ 85.069692] Chain exists of: > > (completion)&self->parked --> &mm->mmap_sem --> &dev->struct_mutex > > > [ 85.069718] 3 locks held by gem_exec_captur/2810: > > > [ 85.069732] #2: (&dev->struct_mutex){+.+.}, at: [] i915_reset_device+0x1bd/0x230 [i915] > > > stack backtrace: > > > [ 85.069788] lock_acquire+0xaf/0x200 > > [ 85.069793] wait_for_common+0x54/0x210 > > [ 85.069807] kthread_park+0x3d/0x50 > > [ 85.069827] i915_gem_reset_prepare_engine+0x1d/0x90 [i915] > > [ 85.069849] i915_gem_reset_prepare+0x2c/0x60 [i915] > > [ 85.069865] i915_reset+0x66/0x230 [i915] > > [ 85.069881] i915_reset_device+0x1cb/0x230 [i915] > > [ 85.069919] i915_handle_error+0x2d3/0x430 [i915] > > [ 85.069951] i915_wedged_set+0x79/0xc0 [i915] > > [ 85.069955] simple_attr_write+0xab/0xc0 > > [ 85.069959] full_proxy_write+0x4b/0x70 > > [ 85.069961] __vfs_write+0x1e/0x130 > > [ 85.069976] vfs_write+0xc0/0x1b0 > > [ 85.069979] SyS_write+0x40/0xa0 > > [ 85.069982] entry_SYSCALL_64_fastpath+0x1c/0x89 > > > Thread-A k-Thread > > i915_reset_device > #3 mutex_lock(&dev->struct_mutex) > i915_reset() > i915_gem_reset_prepare() > i915_gem_reset_prepare_engine() > kthread_park() > > __do_page_fault() > #2 down_read(&mm->mmap_sem); > handle_mm_fault() > __handle_mm_fault() > __do_fault() > i915_gem_fault() > i915_mutex_lock_interruptible() > #3 mutex_lock(&dev->struct_mutex) > > /* twiddles thumbs forever more */ > > #0 wait_for_common() > > #0 complete() > > > Is what it says I suppose. Now I don't know enough about that i915 code > to say if that breadcrumbs_signal thread can ever trigger a fault or > not. I got properly lost in that dma_fence callback maze. > > You're saying not? Our own kthread, no. At least a tons of run on our CI with the kthread patch applied shut up lockdep splats for good. And since we have all the i915 kthreads still with the same lockdep_map even with the patch applied, since they are all created in the same function, I think that's pretty solid evidence. [There's also really no reasonable reason for it to fault, but I trust automated tools more to check this stuff than my own brain. The test suite we're running is fairly nasty and does all kinds of corner case thrashing. Note that the dma_fence callbacks can be provideded by any other driver (think multi-gpu desktops and stuff), but the contract is that they must be able to handle hardirq context. Faulting's definitely not on the table.] The problem lockdep seems to complain about is that some random other kthread could fault, end up in the i915 fault handler, and get stuck until i915_reset_device is done doing what it needs to do. But as long as that kthread is in turn not providing a service that i915_reset_device needs, I don't see how that can deadlock. And if we have that case (there was definitely plenty of that stuff that cross-release uncovered in our code, we had to shuffle a bunch of allocations and things out from under dev->struct_mutex), then there should be another lock or completion that closes the loop again. Or I'm not understanding what your asking, this stuff is a bit complicated :-) > (also, that comment near need_resched() doesn't make sense to me) I assume you mean the one in intel_breadcrumbs_signaler(). The hw design is somewhat screwed up and depends upon ridiculously low interrupt servicing time. We get there by essentially implementing something like netdev polled mode, from irq context. Like net polling if the scheduler gets pissed at us we stop and dump it all into a kthread. From a latency and ops/sec pov a gpu is pretty close to networking sometimes. [Note: I just have a rough idea what the code is supposed to do, I didn't write/review/debug that one.] Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch