From: Byungchul Park <byungchul.park@lge.com> To: peterz@infradead.org, mingo@kernel.org Cc: tglx@linutronix.de, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kernel-team@lge.com Subject: [PATCH v2 1/3] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default Date: Thu, 19 Oct 2017 14:55:29 +0900 [thread overview] Message-ID: <1508392531-11284-2-git-send-email-byungchul.park@lge.com> (raw) In-Reply-To: <1508392531-11284-1-git-send-email-byungchul.park@lge.com> Johan Hovold reported a performance regression by crossrelease like: > Boot time (from "Linux version" to login prompt) had in fact doubled > since 4.13 where it took 17 seconds (with my current config) compared to > the 35 seconds I now see with 4.14-rc4. > > I quick bisect pointed to lockdep and specifically the following commit: > > 28a903f63ec0 ("locking/lockdep: Handle non(or multi)-acquisition > of a crosslock") > > which I've verified is the commit which doubled the boot time (compared > to 28a903f63ec0^) (added by lockdep crossrelease series [1]). Currently crossrelease performs unwind on every acquisition. But, that overloads systems too much. So this patch makes unwind optional and set it to N as default. Instead, it records only acquire_ip normally. Of course, unwind is sometimes required for full analysis. In that case, we can set CROSSRELEASE_STACK_TRACE to Y and use it. In my qemu ubuntu machin (x86_64, 4 cores, 512M), the regression was fixed like, measuring booting times with 'perf stat --null --repeat 10 $QEMU', where $QEMU launchs a kernel with init=/bin/true: 1. No lockdep enabled Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs): 2.756558155 seconds time elapsed ( +- 0.09% ) 2. Lockdep enabled Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs): 2.968710420 seconds time elapsed ( +- 0.12% ) 3. Lockdep enabled + crossrelease enabled Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs): 3.153839636 seconds time elapsed ( +- 0.31% ) 4. Lockdep enabled + crossrelease enabled + this patch applied Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs): 2.963669551 seconds time elapsed ( +- 0.11% ) Signed-off-by: Byungchul Park <byungchul.park@lge.com> --- include/linux/lockdep.h | 4 ++++ kernel/locking/lockdep.c | 5 +++++ lib/Kconfig.debug | 15 +++++++++++++++ 3 files changed, 24 insertions(+) diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h index bfa8e0b..70358b5 100644 --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -278,7 +278,11 @@ struct held_lock { }; #ifdef CONFIG_LOCKDEP_CROSSRELEASE +#ifdef CONFIG_CROSSRELEASE_STACK_TRACE #define MAX_XHLOCK_TRACE_ENTRIES 5 +#else +#define MAX_XHLOCK_TRACE_ENTRIES 1 +#endif /* * This is for keeping locks waiting for commit so that true dependencies diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c index e36e652..5c2ddf2 100644 --- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c @@ -4863,8 +4863,13 @@ static void add_xhlock(struct held_lock *hlock) xhlock->trace.nr_entries = 0; xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES; xhlock->trace.entries = xhlock->trace_entries; +#ifdef CONFIG_CROSSRELEASE_STACK_TRACE xhlock->trace.skip = 3; save_stack_trace(&xhlock->trace); +#else + xhlock->trace.nr_entries = 1; + xhlock->trace.entries[0] = hlock->acquire_ip; +#endif } static inline int same_context_xhlock(struct hist_lock *xhlock) diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 3db9167..90ea784 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1225,6 +1225,21 @@ config LOCKDEP_COMPLETIONS A deadlock caused by wait_for_completion() and complete() can be detected by lockdep using crossrelease feature. +config CROSSRELEASE_STACK_TRACE + bool "Record more than one entity of stack trace in crossrelease" + depends on LOCKDEP_CROSSRELEASE + default n + help + The lockdep "cross-release" feature needs to record stack traces + (of calling functions) for all acquisitions, for eventual later + use during analysis. By default only a single caller is recorded, + because the unwind operation can be very expensive with deeper + stack chains. However, sometimes deeper traces are required for + full analysis. This option turns on the saving of the full stack + trace entries. + + If unsure, say N. + config DEBUG_LOCKDEP bool "Lock dependency engine debugging" depends on DEBUG_KERNEL && LOCKDEP -- 1.9.1
WARNING: multiple messages have this Message-ID (diff)
From: Byungchul Park <byungchul.park@lge.com> To: peterz@infradead.org, mingo@kernel.org Cc: tglx@linutronix.de, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kernel-team@lge.com Subject: [PATCH v2 1/3] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default Date: Thu, 19 Oct 2017 14:55:29 +0900 [thread overview] Message-ID: <1508392531-11284-2-git-send-email-byungchul.park@lge.com> (raw) In-Reply-To: <1508392531-11284-1-git-send-email-byungchul.park@lge.com> Johan Hovold reported a performance regression by crossrelease like: > Boot time (from "Linux version" to login prompt) had in fact doubled > since 4.13 where it took 17 seconds (with my current config) compared to > the 35 seconds I now see with 4.14-rc4. > > I quick bisect pointed to lockdep and specifically the following commit: > > 28a903f63ec0 ("locking/lockdep: Handle non(or multi)-acquisition > of a crosslock") > > which I've verified is the commit which doubled the boot time (compared > to 28a903f63ec0^) (added by lockdep crossrelease series [1]). Currently crossrelease performs unwind on every acquisition. But, that overloads systems too much. So this patch makes unwind optional and set it to N as default. Instead, it records only acquire_ip normally. Of course, unwind is sometimes required for full analysis. In that case, we can set CROSSRELEASE_STACK_TRACE to Y and use it. In my qemu ubuntu machin (x86_64, 4 cores, 512M), the regression was fixed like, measuring booting times with 'perf stat --null --repeat 10 $QEMU', where $QEMU launchs a kernel with init=/bin/true: 1. No lockdep enabled Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs): 2.756558155 seconds time elapsed ( +- 0.09% ) 2. Lockdep enabled Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs): 2.968710420 seconds time elapsed ( +- 0.12% ) 3. Lockdep enabled + crossrelease enabled Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs): 3.153839636 seconds time elapsed ( +- 0.31% ) 4. Lockdep enabled + crossrelease enabled + this patch applied Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs): 2.963669551 seconds time elapsed ( +- 0.11% ) Signed-off-by: Byungchul Park <byungchul.park@lge.com> --- include/linux/lockdep.h | 4 ++++ kernel/locking/lockdep.c | 5 +++++ lib/Kconfig.debug | 15 +++++++++++++++ 3 files changed, 24 insertions(+) diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h index bfa8e0b..70358b5 100644 --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -278,7 +278,11 @@ struct held_lock { }; #ifdef CONFIG_LOCKDEP_CROSSRELEASE +#ifdef CONFIG_CROSSRELEASE_STACK_TRACE #define MAX_XHLOCK_TRACE_ENTRIES 5 +#else +#define MAX_XHLOCK_TRACE_ENTRIES 1 +#endif /* * This is for keeping locks waiting for commit so that true dependencies diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c index e36e652..5c2ddf2 100644 --- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c @@ -4863,8 +4863,13 @@ static void add_xhlock(struct held_lock *hlock) xhlock->trace.nr_entries = 0; xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES; xhlock->trace.entries = xhlock->trace_entries; +#ifdef CONFIG_CROSSRELEASE_STACK_TRACE xhlock->trace.skip = 3; save_stack_trace(&xhlock->trace); +#else + xhlock->trace.nr_entries = 1; + xhlock->trace.entries[0] = hlock->acquire_ip; +#endif } static inline int same_context_xhlock(struct hist_lock *xhlock) diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 3db9167..90ea784 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1225,6 +1225,21 @@ config LOCKDEP_COMPLETIONS A deadlock caused by wait_for_completion() and complete() can be detected by lockdep using crossrelease feature. +config CROSSRELEASE_STACK_TRACE + bool "Record more than one entity of stack trace in crossrelease" + depends on LOCKDEP_CROSSRELEASE + default n + help + The lockdep "cross-release" feature needs to record stack traces + (of calling functions) for all acquisitions, for eventual later + use during analysis. By default only a single caller is recorded, + because the unwind operation can be very expensive with deeper + stack chains. However, sometimes deeper traces are required for + full analysis. This option turns on the saving of the full stack + trace entries. + + If unsure, say N. + config DEBUG_LOCKDEP bool "Lock dependency engine debugging" depends on DEBUG_KERNEL && LOCKDEP -- 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2017-10-19 5:55 UTC|newest] Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top 2017-10-19 5:55 [PATCH v2 0/3] crossrelease: make it not unwind by default Byungchul Park 2017-10-19 5:55 ` Byungchul Park 2017-10-19 5:55 ` Byungchul Park [this message] 2017-10-19 5:55 ` [PATCH v2 1/3] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default Byungchul Park 2017-10-19 5:55 ` [PATCH v2 2/3] lockdep: Remove BROKEN flag of LOCKDEP_CROSSRELEASE Byungchul Park 2017-10-19 5:55 ` Byungchul Park 2017-10-19 15:05 ` Bart Van Assche 2017-10-19 15:05 ` Bart Van Assche 2017-10-19 15:34 ` Thomas Gleixner 2017-10-19 15:34 ` Thomas Gleixner 2017-10-19 15:47 ` Bart Van Assche 2017-10-19 19:04 ` Thomas Gleixner 2017-10-19 19:04 ` Thomas Gleixner 2017-10-19 19:12 ` Thomas Gleixner 2017-10-19 19:12 ` Thomas Gleixner 2017-10-19 20:21 ` Bart Van Assche 2017-10-19 20:21 ` Bart Van Assche 2017-10-19 20:33 ` Matthew Wilcox 2017-10-19 20:33 ` Matthew Wilcox 2017-10-19 20:41 ` Bart Van Assche 2017-10-19 20:53 ` Thomas Gleixner 2017-10-19 20:53 ` Thomas Gleixner 2017-10-19 20:49 ` Thomas Gleixner 2017-10-19 20:49 ` Thomas Gleixner 2017-10-20 7:30 ` Ingo Molnar 2017-10-20 7:30 ` Ingo Molnar 2017-10-20 6:03 ` Byungchul Park 2017-10-20 6:03 ` Byungchul Park 2017-10-19 5:55 ` [PATCH v2 3/3] lockdep: Add a kernel parameter, crossrelease_fullstack Byungchul Park 2017-10-19 5:55 ` Byungchul Park 2017-10-19 7:03 ` [PATCH v2 0/4] Fix false positives by cross-release feature Byungchul Park 2017-10-19 7:03 ` Byungchul Park 2017-10-19 7:03 ` [PATCH v2 1/4] completion: Add support for initializing completion with lockdep_map Byungchul Park 2017-10-19 7:03 ` Byungchul Park 2017-10-19 7:03 ` [PATCH v2 2/4] lockdep: Remove unnecessary acquisitions wrt workqueue flush Byungchul Park 2017-10-19 7:03 ` Byungchul Park 2017-10-19 7:03 ` [PATCH v2 3/4] genhd.h: Remove trailing white space Byungchul Park 2017-10-19 7:03 ` Byungchul Park 2017-10-19 7:03 ` [PATCH v2 4/4] lockdep: Assign a lock_class per gendisk used for wait_for_completion() Byungchul Park 2017-10-19 7:03 ` Byungchul Park 2017-10-20 14:44 ` Christoph Hellwig 2017-10-20 14:44 ` Christoph Hellwig 2017-10-20 14:44 ` Christoph Hellwig 2017-10-22 23:53 ` Byungchul Park 2017-10-22 23:53 ` Byungchul Park 2017-10-23 6:36 ` Christoph Hellwig 2017-10-23 6:36 ` Christoph Hellwig 2017-10-23 7:04 ` Byungchul Park 2017-10-23 7:04 ` Byungchul Park 2017-10-21 19:17 ` kbuild test robot
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=1508392531-11284-2-git-send-email-byungchul.park@lge.com \ --to=byungchul.park@lge.com \ --cc=kernel-team@lge.com \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=mingo@kernel.org \ --cc=peterz@infradead.org \ --cc=tglx@linutronix.de \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.