From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wr1-f69.google.com (mail-wr1-f69.google.com [209.85.221.69])
	by kanga.kvack.org (Postfix) with ESMTP id 530C86B0006
	for ; Mon, 12 Nov 2018 00:15:42 -0500 (EST)
Received: by mail-wr1-f69.google.com with SMTP id i1-v6so8241908wrr.18
	for ; Sun, 11 Nov 2018 21:15:42 -0800 (PST)
Received: from mail-sor-f65.google.com (mail-sor-f65.google.com. [209.85.220.65])
	by mx.google.com with SMTPS id h15-v6sor2028808wrq.23.2018.11.11.21.15.40
	for (Google Transport Security);
	Sun, 11 Nov 2018 21:15:40 -0800 (PST)
Date: Mon, 12 Nov 2018 06:15:37 +0100
From: Ingo Molnar
Subject: Re: [RFC PATCH 00/12] locking/lockdep: Add a new class of terminal locks
Message-ID: <20181112051537.GB123204@gmail.com>
References: <1541709268-3766-1-git-send-email-longman@redhat.com>
	<20181109080412.GC86700@gmail.com>
	<1fcaa330-a4be-0f8a-7974-7b17f0ce01ad@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1fcaa330-a4be-0f8a-7974-7b17f0ce01ad@redhat.com>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Waiman Long
Cc: Josh Poimboeuf, Peter Zijlstra, Ingo Molnar, Will Deacon,
	Thomas Gleixner, linux-kernel@vger.kernel.org,
	kasan-dev@googlegroups.com, linux-mm@kvack.org, Petr Mladek,
	Sergey Senozhatsky, Andrey Ryabinin, Tejun Heo, Andrew Morton,
	Linus Torvalds

* Waiman Long wrote:

> > Could you please measure a locking intense workload instead, such as:
> >
> >    $ perf stat --null --sync --repeat 10 perf bench sched messaging
> >
> > and profile which locks used there could be marked terminal, and measure
> > the before/after performance impact?
>
> I will run the test. It will probably be done after the LPC next week.

Thanks!

> >> Below were selected output lines from the lockdep_stats files of the
> >> patched and unpatched kernels after bootup and running parallel kernel
> >> builds.
> >>
> >>   Item                     Unpatched kernel  Patched kernel  % Change
> >>   ----                     ----------------  --------------  --------
> >>   direct dependencies             9732            8994         -7.6%
> >>   dependency chains              18776           17033         -9.3%
> >>   dependency chain hlocks        76044           68419        -10.0%
> >>   stack-trace entries           110403          104341         -5.5%
> >
> > That's pretty impressive!
> >
> >> There were some reductions in the size of the lockdep tables. They were
> >> not significant, but it is still a good start to rein in the number of
> >> entries in those tables to make it harder to overflow them.
> >
> > Agreed.
> >
> > BTW., if you are interested in more radical approaches to optimize
> > lockdep, we could also add a static checker via objtool driven call graph
> > analysis, and mark those locks terminal that we can prove are terminal.
> >
> > This would require the unified call graph of the kernel image and of all
> > modules to be examined in a final pass, but that's within the principal
> > scope of objtool. (This 'final pass' could also be done during bootup, at
> > least in initial versions.)
> >
> > Note that beyond marking it 'terminal' such a static analysis pass would
> > also allow the detection of obvious locking bugs at the build (or boot)
> > stage already - plus it would allow the disabling of lockdep for
> > self-contained locks that don't interact with anything else.
> >
> > I.e. the static analysis pass would 'augment' lockdep and leave only
> > those locks active for runtime lockdep tracking whose dependencies it
> > cannot prove to be correct yet.
>
> It is a pretty interesting idea to use objtool to scan for locks. The
> list of locks that I marked as terminal in this patch was found by
> looking at /proc/lockdep for those that only have backward dependencies,
> but no forward dependency. I focused on those with a large number of BDs
> and checked the code to see if they could be marked as terminal. This is a
> rather labor intensive process and is subject to error.

Yeah.

> [...] It would be nice if it can be done by an automated tool. So I am
> going to look into that, but it won't be part of this initial patchset,
> though.

Of course!

> I sent this patchset out to see if anyone has any objection to it. It
> seems you don't have any objection to that. So I am going to move ahead
> to do more testing and performance analysis.

The one worry I have is that this interim solution removes the benefit of
a proper static analysis method.

But if you promise to make a serious effort on the static analysis
tooling as well (which should have awesome performance results and
automate the manual markup), then I have no fundamental objections to the
interim approach either.

If static analysis works as well as I expect it to then in principle we
might even be able to have lockdep enabled in production kernels: it
would only add overhead to locks that are overly complex - which would
create incentives to improve those dependencies.

Thanks,

	Ingo
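
For illustration only: the selection criterion described in the quoted text (lock classes that have backward dependencies but no forward dependency, ranked by how many BDs they have) can be sketched over an abstract dependency graph. This is a minimal Python sketch, not the patchset's method or an objtool pass. It assumes the graph is already available as a mapping from each lock class to the set of classes acquired while it is held; the function name, the `min_backward` parameter, and the lock names in the example are all hypothetical, and no particular /proc/lockdep file format is assumed.

```python
from collections import defaultdict


def terminal_candidates(forward_deps, min_backward=1):
    """Return (lock, backward_count) pairs for locks with no forward dependency.

    forward_deps maps a lock class name to the set of lock classes that are
    acquired while it is held (its direct forward dependencies). This input
    shape is an assumption made for the sketch.
    """
    backward = defaultdict(set)  # lock -> set of locks held when it is taken
    locks = set(forward_deps)
    for holder, acquired in forward_deps.items():
        for lock in acquired:
            locks.add(lock)
            backward[lock].add(holder)

    # A candidate never has another lock taken under it, and is itself
    # taken under at least min_backward other locks.
    candidates = [
        (lock, len(backward[lock]))
        for lock in locks
        if not forward_deps.get(lock) and len(backward[lock]) >= min_backward
    ]
    # Most backward dependencies first, then by name for a stable order.
    return sorted(candidates, key=lambda item: (-item[1], item[0]))


# Hypothetical lock classes, for illustration only.
example = {
    "rq_lock": {"stats_lock"},
    "inode_lock": {"stats_lock", "rq_lock"},
    "stats_lock": set(),
    "lone_lock": set(),
}

if __name__ == "__main__":
    for name, count in terminal_candidates(example):
        print(name, count)
```

A lock reported here is only a candidate: the graph reflects dependencies that were actually observed (or extracted), so the code still has to be checked by hand or by a sound static analysis before a lock is marked terminal.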