All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/8] cross-release: enhence performance and fix false positives
@ 2017-10-24  9:38 ` Byungchul Park
  0 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team

Changes from v2
- Combine 2 serises, fixing false positives and enhance performance
- Add Christoph Hellwig's patch simplifying submit_bio_wait() code
- Add 2 more 'init with lockdep map' macros for completionm
- Rename init_completion_with_map() to init_completion_map()

Changes from v1
- Fix kconfig description as Ingo suggested
- Fix commit message writing out CONFIG_ variable
- Introduce a new kernel parameter, crossrelease_fullstack
- Replace the number with the output of *perf*
- Separate a patch removing white space

Byungchul Park (7):
  lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as
    default
  lockdep: Remove BROKEN flag of LOCKDEP_CROSSRELEASE
  lockdep: Add a kernel parameter, crossrelease_fullstack
  completion: Add support for initializing completion with lockdep_map
  lockdep: Remove unnecessary acquisitions wrt workqueue flush
  genhd.h: Remove trailing white space
  block: Assign a lock_class per gendisk used for wait_for_completion()

Christoph Hellwig (1):
  block: use DECLARE_COMPLETION_ONSTACK in submit_bio_wait

 Documentation/admin-guide/kernel-parameters.txt |  3 +++
 block/bio.c                                     | 19 +++++-------------
 block/genhd.c                                   | 13 +++++--------
 include/linux/completion.h                      | 14 +++++++++++++
 include/linux/genhd.h                           | 26 +++++++++++++++++++++----
 include/linux/lockdep.h                         |  4 ++++
 include/linux/workqueue.h                       |  4 ++--
 kernel/locking/lockdep.c                        | 23 ++++++++++++++++++++--
 kernel/workqueue.c                              | 19 +++---------------
 lib/Kconfig.debug                               | 20 +++++++++++++++++--
 10 files changed, 97 insertions(+), 48 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v3 0/8] cross-release: enhence performance and fix false positives
@ 2017-10-24  9:38 ` Byungchul Park
  0 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team

Changes from v2
- Combine 2 serises, fixing false positives and enhance performance
- Add Christoph Hellwig's patch simplifying submit_bio_wait() code
- Add 2 more 'init with lockdep map' macros for completionm
- Rename init_completion_with_map() to init_completion_map()

Changes from v1
- Fix kconfig description as Ingo suggested
- Fix commit message writing out CONFIG_ variable
- Introduce a new kernel parameter, crossrelease_fullstack
- Replace the number with the output of *perf*
- Separate a patch removing white space

Byungchul Park (7):
  lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as
    default
  lockdep: Remove BROKEN flag of LOCKDEP_CROSSRELEASE
  lockdep: Add a kernel parameter, crossrelease_fullstack
  completion: Add support for initializing completion with lockdep_map
  lockdep: Remove unnecessary acquisitions wrt workqueue flush
  genhd.h: Remove trailing white space
  block: Assign a lock_class per gendisk used for wait_for_completion()

Christoph Hellwig (1):
  block: use DECLARE_COMPLETION_ONSTACK in submit_bio_wait

 Documentation/admin-guide/kernel-parameters.txt |  3 +++
 block/bio.c                                     | 19 +++++-------------
 block/genhd.c                                   | 13 +++++--------
 include/linux/completion.h                      | 14 +++++++++++++
 include/linux/genhd.h                           | 26 +++++++++++++++++++++----
 include/linux/lockdep.h                         |  4 ++++
 include/linux/workqueue.h                       |  4 ++--
 kernel/locking/lockdep.c                        | 23 ++++++++++++++++++++--
 kernel/workqueue.c                              | 19 +++---------------
 lib/Kconfig.debug                               | 20 +++++++++++++++++--
 10 files changed, 97 insertions(+), 48 deletions(-)

-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v3 1/8] block: use DECLARE_COMPLETION_ONSTACK in submit_bio_wait
  2017-10-24  9:38 ` Byungchul Park
@ 2017-10-24  9:38   ` Byungchul Park
  -1 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team, Christoph Hellwig

From: Christoph Hellwig <hch@lst.de>

Simplify the code by getting rid of the submit_bio_ret structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/bio.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 101c2a9..5e901bf 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -917,17 +917,9 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 }
 EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages);
 
-struct submit_bio_ret {
-	struct completion event;
-	int error;
-};
-
 static void submit_bio_wait_endio(struct bio *bio)
 {
-	struct submit_bio_ret *ret = bio->bi_private;
-
-	ret->error = blk_status_to_errno(bio->bi_status);
-	complete(&ret->event);
+	complete(bio->bi_private);
 }
 
 /**
@@ -943,16 +935,15 @@ static void submit_bio_wait_endio(struct bio *bio)
  */
 int submit_bio_wait(struct bio *bio)
 {
-	struct submit_bio_ret ret;
+	DECLARE_COMPLETION_ONSTACK(done);
 
-	init_completion(&ret.event);
-	bio->bi_private = &ret;
+	bio->bi_private = &done;
 	bio->bi_end_io = submit_bio_wait_endio;
 	bio->bi_opf |= REQ_SYNC;
 	submit_bio(bio);
-	wait_for_completion_io(&ret.event);
+	wait_for_completion_io(&done);
 
-	return ret.error;
+	return blk_status_to_errno(bio->bi_status);
 }
 EXPORT_SYMBOL(submit_bio_wait);
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 1/8] block: use DECLARE_COMPLETION_ONSTACK in submit_bio_wait
@ 2017-10-24  9:38   ` Byungchul Park
  0 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team, Christoph Hellwig

From: Christoph Hellwig <hch@lst.de>

Simplify the code by getting rid of the submit_bio_ret structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/bio.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 101c2a9..5e901bf 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -917,17 +917,9 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 }
 EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages);
 
-struct submit_bio_ret {
-	struct completion event;
-	int error;
-};
-
 static void submit_bio_wait_endio(struct bio *bio)
 {
-	struct submit_bio_ret *ret = bio->bi_private;
-
-	ret->error = blk_status_to_errno(bio->bi_status);
-	complete(&ret->event);
+	complete(bio->bi_private);
 }
 
 /**
@@ -943,16 +935,15 @@ static void submit_bio_wait_endio(struct bio *bio)
  */
 int submit_bio_wait(struct bio *bio)
 {
-	struct submit_bio_ret ret;
+	DECLARE_COMPLETION_ONSTACK(done);
 
-	init_completion(&ret.event);
-	bio->bi_private = &ret;
+	bio->bi_private = &done;
 	bio->bi_end_io = submit_bio_wait_endio;
 	bio->bi_opf |= REQ_SYNC;
 	submit_bio(bio);
-	wait_for_completion_io(&ret.event);
+	wait_for_completion_io(&done);
 
-	return ret.error;
+	return blk_status_to_errno(bio->bi_status);
 }
 EXPORT_SYMBOL(submit_bio_wait);
 
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 2/8] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default
  2017-10-24  9:38 ` Byungchul Park
@ 2017-10-24  9:38   ` Byungchul Park
  -1 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team

Johan Hovold reported a performance regression by crossrelease like:

> Boot time (from "Linux version" to login prompt) had in fact doubled
> since 4.13 where it took 17 seconds (with my current config) compared to
> the 35 seconds I now see with 4.14-rc4.
>
> I quick bisect pointed to lockdep and specifically the following commit:
>
> 	28a903f63ec0 ("locking/lockdep: Handle non(or multi)-acquisition
> 	               of a crosslock")
>
> which I've verified is the commit which doubled the boot time (compared
> to 28a903f63ec0^) (added by lockdep crossrelease series [1]).

Currently crossrelease performs unwind on every acquisition. But, that
overloads systems too much. So this patch makes unwind optional and set
it to N as default. Instead, it records only acquire_ip normally. Of
course, unwind is sometimes required for full analysis. In that case, we
can set CROSSRELEASE_STACK_TRACE to Y and use it.

In my qemu ubuntu machin (x86_64, 4 cores, 512M), the regression was
fixed like, measuring booting times with 'perf stat --null --repeat 10
$QEMU', where $QEMU launchs a kernel with init=/bin/true:

1. No lockdep enabled

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.756558155 seconds time elapsed                    ( +-  0.09% )

2. Lockdep enabled

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.968710420 seconds time elapsed                    ( +-  0.12% )

3. Lockdep enabled + crossrelease enabled

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       3.153839636 seconds time elapsed                    ( +-  0.31% )

4. Lockdep enabled + crossrelease enabled + this patch applied

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.963669551 seconds time elapsed                    ( +-  0.11% )

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 include/linux/lockdep.h  |  4 ++++
 kernel/locking/lockdep.c |  5 +++++
 lib/Kconfig.debug        | 15 +++++++++++++++
 3 files changed, 24 insertions(+)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index bfa8e0b..70358b5 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -278,7 +278,11 @@ struct held_lock {
 };
 
 #ifdef CONFIG_LOCKDEP_CROSSRELEASE
+#ifdef CONFIG_CROSSRELEASE_STACK_TRACE
 #define MAX_XHLOCK_TRACE_ENTRIES 5
+#else
+#define MAX_XHLOCK_TRACE_ENTRIES 1
+#endif
 
 /*
  * This is for keeping locks waiting for commit so that true dependencies
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index e36e652..5c2ddf2 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -4863,8 +4863,13 @@ static void add_xhlock(struct held_lock *hlock)
 	xhlock->trace.nr_entries = 0;
 	xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
 	xhlock->trace.entries = xhlock->trace_entries;
+#ifdef CONFIG_CROSSRELEASE_STACK_TRACE
 	xhlock->trace.skip = 3;
 	save_stack_trace(&xhlock->trace);
+#else
+	xhlock->trace.nr_entries = 1;
+	xhlock->trace.entries[0] = hlock->acquire_ip;
+#endif
 }
 
 static inline int same_context_xhlock(struct hist_lock *xhlock)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 3db9167..90ea784 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1225,6 +1225,21 @@ config LOCKDEP_COMPLETIONS
 	 A deadlock caused by wait_for_completion() and complete() can be
 	 detected by lockdep using crossrelease feature.
 
+config CROSSRELEASE_STACK_TRACE
+	bool "Record more than one entity of stack trace in crossrelease"
+	depends on LOCKDEP_CROSSRELEASE
+	default n
+	help
+	 The lockdep "cross-release" feature needs to record stack traces
+	 (of calling functions) for all acquisitions, for eventual later
+	 use during analysis. By default only a single caller is recorded,
+	 because the unwind operation can be very expensive with deeper
+	 stack chains. However, sometimes deeper traces are required for
+	 full analysis. This option turns on the saving of the full stack
+	 trace entries.
+
+	 If unsure, say N.
+
 config DEBUG_LOCKDEP
 	bool "Lock dependency engine debugging"
 	depends on DEBUG_KERNEL && LOCKDEP
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 2/8] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default
@ 2017-10-24  9:38   ` Byungchul Park
  0 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team

Johan Hovold reported a performance regression by crossrelease like:

> Boot time (from "Linux version" to login prompt) had in fact doubled
> since 4.13 where it took 17 seconds (with my current config) compared to
> the 35 seconds I now see with 4.14-rc4.
>
> I quick bisect pointed to lockdep and specifically the following commit:
>
> 	28a903f63ec0 ("locking/lockdep: Handle non(or multi)-acquisition
> 	               of a crosslock")
>
> which I've verified is the commit which doubled the boot time (compared
> to 28a903f63ec0^) (added by lockdep crossrelease series [1]).

Currently crossrelease performs unwind on every acquisition. But, that
overloads systems too much. So this patch makes unwind optional and set
it to N as default. Instead, it records only acquire_ip normally. Of
course, unwind is sometimes required for full analysis. In that case, we
can set CROSSRELEASE_STACK_TRACE to Y and use it.

In my qemu ubuntu machin (x86_64, 4 cores, 512M), the regression was
fixed like, measuring booting times with 'perf stat --null --repeat 10
$QEMU', where $QEMU launchs a kernel with init=/bin/true:

1. No lockdep enabled

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.756558155 seconds time elapsed                    ( +-  0.09% )

2. Lockdep enabled

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.968710420 seconds time elapsed                    ( +-  0.12% )

3. Lockdep enabled + crossrelease enabled

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       3.153839636 seconds time elapsed                    ( +-  0.31% )

4. Lockdep enabled + crossrelease enabled + this patch applied

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.963669551 seconds time elapsed                    ( +-  0.11% )

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 include/linux/lockdep.h  |  4 ++++
 kernel/locking/lockdep.c |  5 +++++
 lib/Kconfig.debug        | 15 +++++++++++++++
 3 files changed, 24 insertions(+)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index bfa8e0b..70358b5 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -278,7 +278,11 @@ struct held_lock {
 };
 
 #ifdef CONFIG_LOCKDEP_CROSSRELEASE
+#ifdef CONFIG_CROSSRELEASE_STACK_TRACE
 #define MAX_XHLOCK_TRACE_ENTRIES 5
+#else
+#define MAX_XHLOCK_TRACE_ENTRIES 1
+#endif
 
 /*
  * This is for keeping locks waiting for commit so that true dependencies
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index e36e652..5c2ddf2 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -4863,8 +4863,13 @@ static void add_xhlock(struct held_lock *hlock)
 	xhlock->trace.nr_entries = 0;
 	xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
 	xhlock->trace.entries = xhlock->trace_entries;
+#ifdef CONFIG_CROSSRELEASE_STACK_TRACE
 	xhlock->trace.skip = 3;
 	save_stack_trace(&xhlock->trace);
+#else
+	xhlock->trace.nr_entries = 1;
+	xhlock->trace.entries[0] = hlock->acquire_ip;
+#endif
 }
 
 static inline int same_context_xhlock(struct hist_lock *xhlock)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 3db9167..90ea784 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1225,6 +1225,21 @@ config LOCKDEP_COMPLETIONS
 	 A deadlock caused by wait_for_completion() and complete() can be
 	 detected by lockdep using crossrelease feature.
 
+config CROSSRELEASE_STACK_TRACE
+	bool "Record more than one entity of stack trace in crossrelease"
+	depends on LOCKDEP_CROSSRELEASE
+	default n
+	help
+	 The lockdep "cross-release" feature needs to record stack traces
+	 (of calling functions) for all acquisitions, for eventual later
+	 use during analysis. By default only a single caller is recorded,
+	 because the unwind operation can be very expensive with deeper
+	 stack chains. However, sometimes deeper traces are required for
+	 full analysis. This option turns on the saving of the full stack
+	 trace entries.
+
+	 If unsure, say N.
+
 config DEBUG_LOCKDEP
 	bool "Lock dependency engine debugging"
 	depends on DEBUG_KERNEL && LOCKDEP
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 3/8] lockdep: Remove BROKEN flag of LOCKDEP_CROSSRELEASE
  2017-10-24  9:38 ` Byungchul Park
@ 2017-10-24  9:38   ` Byungchul Park
  -1 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team

Now the performance regression was fixed, re-enable
CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 lib/Kconfig.debug | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 90ea784..fe8fceb 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1138,8 +1138,8 @@ config PROVE_LOCKING
 	select DEBUG_MUTEXES
 	select DEBUG_RT_MUTEXES if RT_MUTEXES
 	select DEBUG_LOCK_ALLOC
-	select LOCKDEP_CROSSRELEASE if BROKEN
-	select LOCKDEP_COMPLETIONS if BROKEN
+	select LOCKDEP_CROSSRELEASE
+	select LOCKDEP_COMPLETIONS
 	select TRACE_IRQFLAGS
 	default n
 	help
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 3/8] lockdep: Remove BROKEN flag of LOCKDEP_CROSSRELEASE
@ 2017-10-24  9:38   ` Byungchul Park
  0 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team

Now the performance regression was fixed, re-enable
CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 lib/Kconfig.debug | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 90ea784..fe8fceb 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1138,8 +1138,8 @@ config PROVE_LOCKING
 	select DEBUG_MUTEXES
 	select DEBUG_RT_MUTEXES if RT_MUTEXES
 	select DEBUG_LOCK_ALLOC
-	select LOCKDEP_CROSSRELEASE if BROKEN
-	select LOCKDEP_COMPLETIONS if BROKEN
+	select LOCKDEP_CROSSRELEASE
+	select LOCKDEP_COMPLETIONS
 	select TRACE_IRQFLAGS
 	default n
 	help
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 4/8] lockdep: Add a kernel parameter, crossrelease_fullstack
  2017-10-24  9:38 ` Byungchul Park
@ 2017-10-24  9:38   ` Byungchul Park
  -1 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team

Make whether to allow recording full stack, in cross-release feature,
switchable at boot time via a kernel parameter, 'crossrelease_fullstack'.
In case of a splat with no stack trace, one could just reboot and set
the kernel parameter to get the full data without having to recompile
the kernel.

Change CONFIG_CROSSRELEASE_STACK_TRACE default from N to Y, and
introduce the new kernel parameter.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 Documentation/admin-guide/kernel-parameters.txt |  3 +++
 kernel/locking/lockdep.c                        | 18 ++++++++++++++++--
 lib/Kconfig.debug                               |  5 +++--
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index ead7f40..4107b01 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -709,6 +709,9 @@
 			It will be ignored when crashkernel=X,high is not used
 			or memory reserved is below 4G.
 
+	crossrelease_fullstack
+			[KNL] Allow to record full stack trace in cross-release
+
 	cryptomgr.notests
                         [KNL] Disable crypto self-tests
 
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 5c2ddf2..feba887 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -76,6 +76,15 @@
 #define lock_stat 0
 #endif
 
+static int crossrelease_fullstack;
+static int __init allow_crossrelease_fullstack(char *str)
+{
+	crossrelease_fullstack = 1;
+	return 0;
+}
+
+early_param("crossrelease_fullstack", allow_crossrelease_fullstack);
+
 /*
  * lockdep_lock: protects the lockdep graph, the hashes and the
  *               class/list/hash allocators.
@@ -4864,8 +4873,13 @@ static void add_xhlock(struct held_lock *hlock)
 	xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
 	xhlock->trace.entries = xhlock->trace_entries;
 #ifdef CONFIG_CROSSRELEASE_STACK_TRACE
-	xhlock->trace.skip = 3;
-	save_stack_trace(&xhlock->trace);
+	if (crossrelease_fullstack) {
+		xhlock->trace.skip = 3;
+		save_stack_trace(&xhlock->trace);
+	} else {
+		xhlock->trace.nr_entries = 1;
+		xhlock->trace.entries[0] = hlock->acquire_ip;
+	}
 #else
 	xhlock->trace.nr_entries = 1;
 	xhlock->trace.entries[0] = hlock->acquire_ip;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index fe8fceb..132536d 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1228,7 +1228,7 @@ config LOCKDEP_COMPLETIONS
 config CROSSRELEASE_STACK_TRACE
 	bool "Record more than one entity of stack trace in crossrelease"
 	depends on LOCKDEP_CROSSRELEASE
-	default n
+	default y
 	help
 	 The lockdep "cross-release" feature needs to record stack traces
 	 (of calling functions) for all acquisitions, for eventual later
@@ -1238,7 +1238,8 @@ config CROSSRELEASE_STACK_TRACE
 	 full analysis. This option turns on the saving of the full stack
 	 trace entries.
 
-	 If unsure, say N.
+	 To make the feature actually on, set "crossrelease_fullstack"
+	 kernel parameter, too.
 
 config DEBUG_LOCKDEP
 	bool "Lock dependency engine debugging"
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 4/8] lockdep: Add a kernel parameter, crossrelease_fullstack
@ 2017-10-24  9:38   ` Byungchul Park
  0 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team

Make whether to allow recording full stack, in cross-release feature,
switchable at boot time via a kernel parameter, 'crossrelease_fullstack'.
In case of a splat with no stack trace, one could just reboot and set
the kernel parameter to get the full data without having to recompile
the kernel.

Change CONFIG_CROSSRELEASE_STACK_TRACE default from N to Y, and
introduce the new kernel parameter.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 Documentation/admin-guide/kernel-parameters.txt |  3 +++
 kernel/locking/lockdep.c                        | 18 ++++++++++++++++--
 lib/Kconfig.debug                               |  5 +++--
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index ead7f40..4107b01 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -709,6 +709,9 @@
 			It will be ignored when crashkernel=X,high is not used
 			or memory reserved is below 4G.
 
+	crossrelease_fullstack
+			[KNL] Allow to record full stack trace in cross-release
+
 	cryptomgr.notests
                         [KNL] Disable crypto self-tests
 
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 5c2ddf2..feba887 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -76,6 +76,15 @@
 #define lock_stat 0
 #endif
 
+static int crossrelease_fullstack;
+static int __init allow_crossrelease_fullstack(char *str)
+{
+	crossrelease_fullstack = 1;
+	return 0;
+}
+
+early_param("crossrelease_fullstack", allow_crossrelease_fullstack);
+
 /*
  * lockdep_lock: protects the lockdep graph, the hashes and the
  *               class/list/hash allocators.
@@ -4864,8 +4873,13 @@ static void add_xhlock(struct held_lock *hlock)
 	xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
 	xhlock->trace.entries = xhlock->trace_entries;
 #ifdef CONFIG_CROSSRELEASE_STACK_TRACE
-	xhlock->trace.skip = 3;
-	save_stack_trace(&xhlock->trace);
+	if (crossrelease_fullstack) {
+		xhlock->trace.skip = 3;
+		save_stack_trace(&xhlock->trace);
+	} else {
+		xhlock->trace.nr_entries = 1;
+		xhlock->trace.entries[0] = hlock->acquire_ip;
+	}
 #else
 	xhlock->trace.nr_entries = 1;
 	xhlock->trace.entries[0] = hlock->acquire_ip;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index fe8fceb..132536d 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1228,7 +1228,7 @@ config LOCKDEP_COMPLETIONS
 config CROSSRELEASE_STACK_TRACE
 	bool "Record more than one entity of stack trace in crossrelease"
 	depends on LOCKDEP_CROSSRELEASE
-	default n
+	default y
 	help
 	 The lockdep "cross-release" feature needs to record stack traces
 	 (of calling functions) for all acquisitions, for eventual later
@@ -1238,7 +1238,8 @@ config CROSSRELEASE_STACK_TRACE
 	 full analysis. This option turns on the saving of the full stack
 	 trace entries.
 
-	 If unsure, say N.
+	 To make the feature actually on, set "crossrelease_fullstack"
+	 kernel parameter, too.
 
 config DEBUG_LOCKDEP
 	bool "Lock dependency engine debugging"
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 5/8] completion: Add support for initializing completion with lockdep_map
  2017-10-24  9:38 ` Byungchul Park
@ 2017-10-24  9:38   ` Byungchul Park
  -1 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team

Sometimes, we want to initialize completions with sparate lockdep maps
to assign lock classes as desired. For example, the workqueue code
needs to directly manage lockdep maps, since only the code is aware of
how to classify lockdep maps properly.

Provide additional macros initializing completions in that way.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 include/linux/completion.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/completion.h b/include/linux/completion.h
index cae5400..02f8cde 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -49,6 +49,13 @@ static inline void complete_release_commit(struct completion *x)
 	lock_commit_crosslock((struct lockdep_map *)&x->map);
 }
 
+#define init_completion_map(x, m)					\
+do {									\
+	lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map,	\
+			(m)->name, (m)->key, 0);				\
+	__init_completion(x);						\
+} while (0)
+
 #define init_completion(x)						\
 do {									\
 	static struct lock_class_key __key;				\
@@ -58,6 +65,7 @@ static inline void complete_release_commit(struct completion *x)
 	__init_completion(x);						\
 } while (0)
 #else
+#define init_completion_map(x, m) __init_completion(x)
 #define init_completion(x) __init_completion(x)
 static inline void complete_acquire(struct completion *x) {}
 static inline void complete_release(struct completion *x) {}
@@ -73,6 +81,9 @@ static inline void complete_release_commit(struct completion *x) {}
 	{ 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait) }
 #endif
 
+#define COMPLETION_INITIALIZER_ONSTACK_MAP(work, map) \
+	(*({ init_completion_map(&(work), &(map)); &(work); }))
+
 #define COMPLETION_INITIALIZER_ONSTACK(work) \
 	(*({ init_completion(&work); &work; }))
 
@@ -102,8 +113,11 @@ static inline void complete_release_commit(struct completion *x) {}
 #ifdef CONFIG_LOCKDEP
 # define DECLARE_COMPLETION_ONSTACK(work) \
 	struct completion work = COMPLETION_INITIALIZER_ONSTACK(work)
+# define DECLARE_COMPLETION_ONSTACK_MAP(work, map) \
+	struct completion work = COMPLETION_INITIALIZER_ONSTACK_MAP(work, map)
 #else
 # define DECLARE_COMPLETION_ONSTACK(work) DECLARE_COMPLETION(work)
+# define DECLARE_COMPLETION_ONSTACK_MAP(work, map) DECLARE_COMPLETION(work)
 #endif
 
 /**
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 5/8] completion: Add support for initializing completion with lockdep_map
@ 2017-10-24  9:38   ` Byungchul Park
  0 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team

Sometimes, we want to initialize completions with sparate lockdep maps
to assign lock classes as desired. For example, the workqueue code
needs to directly manage lockdep maps, since only the code is aware of
how to classify lockdep maps properly.

Provide additional macros initializing completions in that way.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 include/linux/completion.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/completion.h b/include/linux/completion.h
index cae5400..02f8cde 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -49,6 +49,13 @@ static inline void complete_release_commit(struct completion *x)
 	lock_commit_crosslock((struct lockdep_map *)&x->map);
 }
 
+#define init_completion_map(x, m)					\
+do {									\
+	lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map,	\
+			(m)->name, (m)->key, 0);				\
+	__init_completion(x);						\
+} while (0)
+
 #define init_completion(x)						\
 do {									\
 	static struct lock_class_key __key;				\
@@ -58,6 +65,7 @@ static inline void complete_release_commit(struct completion *x)
 	__init_completion(x);						\
 } while (0)
 #else
+#define init_completion_map(x, m) __init_completion(x)
 #define init_completion(x) __init_completion(x)
 static inline void complete_acquire(struct completion *x) {}
 static inline void complete_release(struct completion *x) {}
@@ -73,6 +81,9 @@ static inline void complete_release_commit(struct completion *x) {}
 	{ 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait) }
 #endif
 
+#define COMPLETION_INITIALIZER_ONSTACK_MAP(work, map) \
+	(*({ init_completion_map(&(work), &(map)); &(work); }))
+
 #define COMPLETION_INITIALIZER_ONSTACK(work) \
 	(*({ init_completion(&work); &work; }))
 
@@ -102,8 +113,11 @@ static inline void complete_release_commit(struct completion *x) {}
 #ifdef CONFIG_LOCKDEP
 # define DECLARE_COMPLETION_ONSTACK(work) \
 	struct completion work = COMPLETION_INITIALIZER_ONSTACK(work)
+# define DECLARE_COMPLETION_ONSTACK_MAP(work, map) \
+	struct completion work = COMPLETION_INITIALIZER_ONSTACK_MAP(work, map)
 #else
 # define DECLARE_COMPLETION_ONSTACK(work) DECLARE_COMPLETION(work)
+# define DECLARE_COMPLETION_ONSTACK_MAP(work, map) DECLARE_COMPLETION(work)
 #endif
 
 /**
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 6/8] lockdep: Remove unnecessary acquisitions wrt workqueue flush
  2017-10-24  9:38 ` Byungchul Park
@ 2017-10-24  9:38   ` Byungchul Park
  -1 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team

The workqueue added manual acquisitions to catch deadlock cases.
Now crossrelease was introduced, some of those are redundant, since
wait_for_completion() already includes the acquisition for itself.
Removed it.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 include/linux/workqueue.h |  4 ++--
 kernel/workqueue.c        | 19 +++----------------
 2 files changed, 5 insertions(+), 18 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index f3c47a0..1455b5e 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -218,7 +218,7 @@ static inline void destroy_delayed_work_on_stack(struct delayed_work *work) { }
 									\
 		__init_work((_work), _onstack);				\
 		(_work)->data = (atomic_long_t) WORK_DATA_INIT();	\
-		lockdep_init_map(&(_work)->lockdep_map, #_work, &__key, 0); \
+		lockdep_init_map(&(_work)->lockdep_map, "(complete)"#_work, &__key, 0); \
 		INIT_LIST_HEAD(&(_work)->entry);			\
 		(_work)->func = (_func);				\
 	} while (0)
@@ -399,7 +399,7 @@ enum {
 	static struct lock_class_key __key;				\
 	const char *__lock_name;					\
 									\
-	__lock_name = #fmt#args;					\
+	__lock_name = "(complete)"#fmt#args;				\
 									\
 	__alloc_workqueue_key((fmt), (flags), (max_active),		\
 			      &__key, __lock_name, ##args);		\
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index c77fdf6..ee05d19 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2496,15 +2496,8 @@ static void insert_wq_barrier(struct pool_workqueue *pwq,
 	INIT_WORK_ONSTACK(&barr->work, wq_barrier_func);
 	__set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(&barr->work));
 
-	/*
-	 * Explicitly init the crosslock for wq_barrier::done, make its lock
-	 * key a subkey of the corresponding work. As a result we won't
-	 * build a dependency between wq_barrier::done and unrelated work.
-	 */
-	lockdep_init_map_crosslock((struct lockdep_map *)&barr->done.map,
-				   "(complete)wq_barr::done",
-				   target->lockdep_map.key, 1);
-	__init_completion(&barr->done);
+	init_completion_map(&barr->done, &target->lockdep_map);
+
 	barr->task = current;
 
 	/*
@@ -2610,16 +2603,13 @@ void flush_workqueue(struct workqueue_struct *wq)
 	struct wq_flusher this_flusher = {
 		.list = LIST_HEAD_INIT(this_flusher.list),
 		.flush_color = -1,
-		.done = COMPLETION_INITIALIZER_ONSTACK(this_flusher.done),
+		.done = COMPLETION_INITIALIZER_ONSTACK_MAP(this_flusher.done, wq->lockdep_map),
 	};
 	int next_color;
 
 	if (WARN_ON(!wq_online))
 		return;
 
-	lock_map_acquire(&wq->lockdep_map);
-	lock_map_release(&wq->lockdep_map);
-
 	mutex_lock(&wq->mutex);
 
 	/*
@@ -2882,9 +2872,6 @@ bool flush_work(struct work_struct *work)
 	if (WARN_ON(!wq_online))
 		return false;
 
-	lock_map_acquire(&work->lockdep_map);
-	lock_map_release(&work->lockdep_map);
-
 	if (start_flush_work(work, &barr)) {
 		wait_for_completion(&barr.done);
 		destroy_work_on_stack(&barr.work);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 6/8] lockdep: Remove unnecessary acquisitions wrt workqueue flush
@ 2017-10-24  9:38   ` Byungchul Park
  0 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team

The workqueue added manual acquisitions to catch deadlock cases.
Now crossrelease was introduced, some of those are redundant, since
wait_for_completion() already includes the acquisition for itself.
Removed it.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 include/linux/workqueue.h |  4 ++--
 kernel/workqueue.c        | 19 +++----------------
 2 files changed, 5 insertions(+), 18 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index f3c47a0..1455b5e 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -218,7 +218,7 @@ static inline void destroy_delayed_work_on_stack(struct delayed_work *work) { }
 									\
 		__init_work((_work), _onstack);				\
 		(_work)->data = (atomic_long_t) WORK_DATA_INIT();	\
-		lockdep_init_map(&(_work)->lockdep_map, #_work, &__key, 0); \
+		lockdep_init_map(&(_work)->lockdep_map, "(complete)"#_work, &__key, 0); \
 		INIT_LIST_HEAD(&(_work)->entry);			\
 		(_work)->func = (_func);				\
 	} while (0)
@@ -399,7 +399,7 @@ enum {
 	static struct lock_class_key __key;				\
 	const char *__lock_name;					\
 									\
-	__lock_name = #fmt#args;					\
+	__lock_name = "(complete)"#fmt#args;				\
 									\
 	__alloc_workqueue_key((fmt), (flags), (max_active),		\
 			      &__key, __lock_name, ##args);		\
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index c77fdf6..ee05d19 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2496,15 +2496,8 @@ static void insert_wq_barrier(struct pool_workqueue *pwq,
 	INIT_WORK_ONSTACK(&barr->work, wq_barrier_func);
 	__set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(&barr->work));
 
-	/*
-	 * Explicitly init the crosslock for wq_barrier::done, make its lock
-	 * key a subkey of the corresponding work. As a result we won't
-	 * build a dependency between wq_barrier::done and unrelated work.
-	 */
-	lockdep_init_map_crosslock((struct lockdep_map *)&barr->done.map,
-				   "(complete)wq_barr::done",
-				   target->lockdep_map.key, 1);
-	__init_completion(&barr->done);
+	init_completion_map(&barr->done, &target->lockdep_map);
+
 	barr->task = current;
 
 	/*
@@ -2610,16 +2603,13 @@ void flush_workqueue(struct workqueue_struct *wq)
 	struct wq_flusher this_flusher = {
 		.list = LIST_HEAD_INIT(this_flusher.list),
 		.flush_color = -1,
-		.done = COMPLETION_INITIALIZER_ONSTACK(this_flusher.done),
+		.done = COMPLETION_INITIALIZER_ONSTACK_MAP(this_flusher.done, wq->lockdep_map),
 	};
 	int next_color;
 
 	if (WARN_ON(!wq_online))
 		return;
 
-	lock_map_acquire(&wq->lockdep_map);
-	lock_map_release(&wq->lockdep_map);
-
 	mutex_lock(&wq->mutex);
 
 	/*
@@ -2882,9 +2872,6 @@ bool flush_work(struct work_struct *work)
 	if (WARN_ON(!wq_online))
 		return false;
 
-	lock_map_acquire(&work->lockdep_map);
-	lock_map_release(&work->lockdep_map);
-
 	if (start_flush_work(work, &barr)) {
 		wait_for_completion(&barr.done);
 		destroy_work_on_stack(&barr.work);
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 7/8] genhd.h: Remove trailing white space
  2017-10-24  9:38 ` Byungchul Park
@ 2017-10-24  9:38   ` Byungchul Park
  -1 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team

Trailing white space is not accepted in kernel coding style. Remove
them.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 include/linux/genhd.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index ea652bf..6d85a75 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -3,7 +3,7 @@
 
 /*
  * 	genhd.h Copyright (C) 1992 Drew Eckhardt
- *	Generic hard disk header file by  
+ *	Generic hard disk header file by
  * 		Drew Eckhardt
  *
  *		<drew@colorado.edu>
@@ -471,7 +471,7 @@ struct bsd_disklabel {
 	__s16	d_type;			/* drive type */
 	__s16	d_subtype;		/* controller/d_type specific */
 	char	d_typename[16];		/* type name, e.g. "eagle" */
-	char	d_packname[16];			/* pack identifier */ 
+	char	d_packname[16];			/* pack identifier */
 	__u32	d_secsize;		/* # of bytes per sector */
 	__u32	d_nsectors;		/* # of data sectors per track */
 	__u32	d_ntracks;		/* # of tracks per cylinder */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 7/8] genhd.h: Remove trailing white space
@ 2017-10-24  9:38   ` Byungchul Park
  0 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team

Trailing white space is not accepted in kernel coding style. Remove
them.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 include/linux/genhd.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index ea652bf..6d85a75 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -3,7 +3,7 @@
 
 /*
  * 	genhd.h Copyright (C) 1992 Drew Eckhardt
- *	Generic hard disk header file by  
+ *	Generic hard disk header file by
  * 		Drew Eckhardt
  *
  *		<drew@colorado.edu>
@@ -471,7 +471,7 @@ struct bsd_disklabel {
 	__s16	d_type;			/* drive type */
 	__s16	d_subtype;		/* controller/d_type specific */
 	char	d_typename[16];		/* type name, e.g. "eagle" */
-	char	d_packname[16];			/* pack identifier */ 
+	char	d_packname[16];			/* pack identifier */
 	__u32	d_secsize;		/* # of bytes per sector */
 	__u32	d_nsectors;		/* # of data sectors per track */
 	__u32	d_ntracks;		/* # of tracks per cylinder */
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 8/8] block: Assign a lock_class per gendisk used for wait_for_completion()
  2017-10-24  9:38 ` Byungchul Park
@ 2017-10-24  9:38   ` Byungchul Park
  -1 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team

Darrick and Dave Chinner posted the following warning:

> ======================================================
> WARNING: possible circular locking dependency detected
> 4.14.0-rc1-fixes #1 Tainted: G        W
> ------------------------------------------------------
> loop0/31693 is trying to acquire lock:
>  (&(&ip->i_mmaplock)->mr_lock){++++}, at: [<ffffffffa00f1b0c>] xfs_ilock+0x23c/0x330 [xfs]
>
> but now in release context of a crosslock acquired at the following:
>  ((complete)&ret.event){+.+.}, at: [<ffffffff81326c1f>] submit_bio_wait+0x7f/0xb0
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 ((complete)&ret.event){+.+.}:
>        lock_acquire+0xab/0x200
>        wait_for_completion_io+0x4e/0x1a0
>        submit_bio_wait+0x7f/0xb0
>        blkdev_issue_zeroout+0x71/0xa0
>        xfs_bmapi_convert_unwritten+0x11f/0x1d0 [xfs]
>        xfs_bmapi_write+0x374/0x11f0 [xfs]
>        xfs_iomap_write_direct+0x2ac/0x430 [xfs]
>        xfs_file_iomap_begin+0x20d/0xd50 [xfs]
>        iomap_apply+0x43/0xe0
>        dax_iomap_rw+0x89/0xf0
>        xfs_file_dax_write+0xcc/0x220 [xfs]
>        xfs_file_write_iter+0xf0/0x130 [xfs]
>        __vfs_write+0xd9/0x150
>        vfs_write+0xc8/0x1c0
>        SyS_write+0x45/0xa0
>        entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> -> #1 (&xfs_nondir_ilock_class){++++}:
>        lock_acquire+0xab/0x200
>        down_write_nested+0x4a/0xb0
>        xfs_ilock+0x263/0x330 [xfs]
>        xfs_setattr_size+0x152/0x370 [xfs]
>        xfs_vn_setattr+0x6b/0x90 [xfs]
>        notify_change+0x27d/0x3f0
>        do_truncate+0x5b/0x90
>        path_openat+0x237/0xa90
>        do_filp_open+0x8a/0xf0
>        do_sys_open+0x11c/0x1f0
>        entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> -> #0 (&(&ip->i_mmaplock)->mr_lock){++++}:
>        up_write+0x1c/0x40
>        xfs_iunlock+0x1d0/0x310 [xfs]
>        xfs_file_fallocate+0x8a/0x310 [xfs]
>        loop_queue_work+0xb7/0x8d0
>        kthread_worker_fn+0xb9/0x1f0
>
> Chain exists of:
>   &(&ip->i_mmaplock)->mr_lock --> &xfs_nondir_ilock_class --> (complete)&ret.event
>
>  Possible unsafe locking scenario by crosslock:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&xfs_nondir_ilock_class);
>   lock((complete)&ret.event);
>                                lock(&(&ip->i_mmaplock)->mr_lock);
>                                unlock((complete)&ret.event);
>
>                *** DEADLOCK ***

The warning is a false positive, caused by the fact that all
wait_for_completion()s in submit_bio_wait() are waiting with the same
lock class.

However, some bios have nothing to do with others, for example, the case
might happen while using loop devices, between bios of an upper device
and a lower device(=loop device).

The safest way to assign different lock classes to different devices is
to do it for each gendisk. In other words, this patch assigns a
lockdep_map per gendisk and uses it when initializing completion in
submit_bio_wait().

Of course, it might be too conservative. But, making it safest for now
and extended by block layer experts later is good, atm.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 block/bio.c           |  2 +-
 block/genhd.c         | 13 +++++--------
 include/linux/genhd.h | 22 ++++++++++++++++++++--
 3 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 5e901bf..cc60213 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -935,7 +935,7 @@ static void submit_bio_wait_endio(struct bio *bio)
  */
 int submit_bio_wait(struct bio *bio)
 {
-	DECLARE_COMPLETION_ONSTACK(done);
+	DECLARE_COMPLETION_ONSTACK_MAP(done, bio->bi_disk->lockdep_map);
 
 	bio->bi_private = &done;
 	bio->bi_end_io = submit_bio_wait_endio;
diff --git a/block/genhd.c b/block/genhd.c
index dd305c6..f195d22 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1354,13 +1354,7 @@ dev_t blk_lookup_devt(const char *name, int partno)
 }
 EXPORT_SYMBOL(blk_lookup_devt);
 
-struct gendisk *alloc_disk(int minors)
-{
-	return alloc_disk_node(minors, NUMA_NO_NODE);
-}
-EXPORT_SYMBOL(alloc_disk);
-
-struct gendisk *alloc_disk_node(int minors, int node_id)
+struct gendisk *__alloc_disk_node(int minors, int node_id, struct lock_class_key *key, const char *lock_name)
 {
 	struct gendisk *disk;
 	struct disk_part_tbl *ptbl;
@@ -1409,9 +1403,12 @@ struct gendisk *alloc_disk_node(int minors, int node_id)
 		disk_to_dev(disk)->type = &disk_type;
 		device_initialize(disk_to_dev(disk));
 	}
+
+	lockdep_init_map(&disk->lockdep_map, lock_name, key, 0);
+
 	return disk;
 }
-EXPORT_SYMBOL(alloc_disk_node);
+EXPORT_SYMBOL(__alloc_disk_node);
 
 struct kobject *get_disk(struct gendisk *disk)
 {
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 6d85a75..9832e3c 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -206,6 +206,9 @@ struct gendisk {
 #endif	/* CONFIG_BLK_DEV_INTEGRITY */
 	int node_id;
 	struct badblocks *bb;
+#ifdef CONFIG_LOCKDEP_COMPLETIONS
+	struct lockdep_map lockdep_map;
+#endif
 };
 
 static inline struct gendisk *part_to_disk(struct hd_struct *part)
@@ -590,8 +593,7 @@ extern struct hd_struct * __must_check add_partition(struct gendisk *disk,
 extern void delete_partition(struct gendisk *, int);
 extern void printk_all_partitions(void);
 
-extern struct gendisk *alloc_disk_node(int minors, int node_id);
-extern struct gendisk *alloc_disk(int minors);
+extern struct gendisk *__alloc_disk_node(int minors, int node_id, struct lock_class_key *key, const char *lock_name);
 extern struct kobject *get_disk(struct gendisk *disk);
 extern void put_disk(struct gendisk *disk);
 extern void blk_register_region(dev_t devt, unsigned long range,
@@ -615,6 +617,22 @@ extern ssize_t part_fail_store(struct device *dev,
 			       const char *buf, size_t count);
 #endif /* CONFIG_FAIL_MAKE_REQUEST */
 
+#ifdef CONFIG_LOCKDEP_COMPLETIONS
+#define alloc_disk_node(m, id) \
+({									\
+	static struct lock_class_key __key;				\
+	const char *__lock_name;					\
+									\
+	__lock_name = "(complete)"#m"("#id")";				\
+									\
+	__alloc_disk_node(m, id, &__key, __lock_name);			\
+})
+#else
+#define alloc_disk_node(m, id)	__alloc_disk_node(m, id, NULL, NULL)
+#endif
+
+#define alloc_disk(m)		alloc_disk_node(m, NUMA_NO_NODE)
+
 static inline int hd_ref_init(struct hd_struct *part)
 {
 	if (percpu_ref_init(&part->ref, __delete_partition, 0,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 8/8] block: Assign a lock_class per gendisk used for wait_for_completion()
@ 2017-10-24  9:38   ` Byungchul Park
  0 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24  9:38 UTC (permalink / raw)
  To: peterz, mingo, axboe
  Cc: tglx, linux-kernel, linux-mm, tj, johannes.berg, oleg, amir73il,
	david, darrick.wong, linux-xfs, linux-fsdevel, linux-block, hch,
	idryomov, kernel-team

Darrick and Dave Chinner posted the following warning:

> ======================================================
> WARNING: possible circular locking dependency detected
> 4.14.0-rc1-fixes #1 Tainted: G        W
> ------------------------------------------------------
> loop0/31693 is trying to acquire lock:
>  (&(&ip->i_mmaplock)->mr_lock){++++}, at: [<ffffffffa00f1b0c>] xfs_ilock+0x23c/0x330 [xfs]
>
> but now in release context of a crosslock acquired at the following:
>  ((complete)&ret.event){+.+.}, at: [<ffffffff81326c1f>] submit_bio_wait+0x7f/0xb0
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 ((complete)&ret.event){+.+.}:
>        lock_acquire+0xab/0x200
>        wait_for_completion_io+0x4e/0x1a0
>        submit_bio_wait+0x7f/0xb0
>        blkdev_issue_zeroout+0x71/0xa0
>        xfs_bmapi_convert_unwritten+0x11f/0x1d0 [xfs]
>        xfs_bmapi_write+0x374/0x11f0 [xfs]
>        xfs_iomap_write_direct+0x2ac/0x430 [xfs]
>        xfs_file_iomap_begin+0x20d/0xd50 [xfs]
>        iomap_apply+0x43/0xe0
>        dax_iomap_rw+0x89/0xf0
>        xfs_file_dax_write+0xcc/0x220 [xfs]
>        xfs_file_write_iter+0xf0/0x130 [xfs]
>        __vfs_write+0xd9/0x150
>        vfs_write+0xc8/0x1c0
>        SyS_write+0x45/0xa0
>        entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> -> #1 (&xfs_nondir_ilock_class){++++}:
>        lock_acquire+0xab/0x200
>        down_write_nested+0x4a/0xb0
>        xfs_ilock+0x263/0x330 [xfs]
>        xfs_setattr_size+0x152/0x370 [xfs]
>        xfs_vn_setattr+0x6b/0x90 [xfs]
>        notify_change+0x27d/0x3f0
>        do_truncate+0x5b/0x90
>        path_openat+0x237/0xa90
>        do_filp_open+0x8a/0xf0
>        do_sys_open+0x11c/0x1f0
>        entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> -> #0 (&(&ip->i_mmaplock)->mr_lock){++++}:
>        up_write+0x1c/0x40
>        xfs_iunlock+0x1d0/0x310 [xfs]
>        xfs_file_fallocate+0x8a/0x310 [xfs]
>        loop_queue_work+0xb7/0x8d0
>        kthread_worker_fn+0xb9/0x1f0
>
> Chain exists of:
>   &(&ip->i_mmaplock)->mr_lock --> &xfs_nondir_ilock_class --> (complete)&ret.event
>
>  Possible unsafe locking scenario by crosslock:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&xfs_nondir_ilock_class);
>   lock((complete)&ret.event);
>                                lock(&(&ip->i_mmaplock)->mr_lock);
>                                unlock((complete)&ret.event);
>
>                *** DEADLOCK ***

The warning is a false positive, caused by the fact that all
wait_for_completion()s in submit_bio_wait() are waiting with the same
lock class.

However, some bios have nothing to do with others, for example, the case
might happen while using loop devices, between bios of an upper device
and a lower device(=loop device).

The safest way to assign different lock classes to different devices is
to do it for each gendisk. In other words, this patch assigns a
lockdep_map per gendisk and uses it when initializing completion in
submit_bio_wait().

Of course, it might be too conservative. But, making it safest for now
and extended by block layer experts later is good, atm.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 block/bio.c           |  2 +-
 block/genhd.c         | 13 +++++--------
 include/linux/genhd.h | 22 ++++++++++++++++++++--
 3 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 5e901bf..cc60213 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -935,7 +935,7 @@ static void submit_bio_wait_endio(struct bio *bio)
  */
 int submit_bio_wait(struct bio *bio)
 {
-	DECLARE_COMPLETION_ONSTACK(done);
+	DECLARE_COMPLETION_ONSTACK_MAP(done, bio->bi_disk->lockdep_map);
 
 	bio->bi_private = &done;
 	bio->bi_end_io = submit_bio_wait_endio;
diff --git a/block/genhd.c b/block/genhd.c
index dd305c6..f195d22 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1354,13 +1354,7 @@ dev_t blk_lookup_devt(const char *name, int partno)
 }
 EXPORT_SYMBOL(blk_lookup_devt);
 
-struct gendisk *alloc_disk(int minors)
-{
-	return alloc_disk_node(minors, NUMA_NO_NODE);
-}
-EXPORT_SYMBOL(alloc_disk);
-
-struct gendisk *alloc_disk_node(int minors, int node_id)
+struct gendisk *__alloc_disk_node(int minors, int node_id, struct lock_class_key *key, const char *lock_name)
 {
 	struct gendisk *disk;
 	struct disk_part_tbl *ptbl;
@@ -1409,9 +1403,12 @@ struct gendisk *alloc_disk_node(int minors, int node_id)
 		disk_to_dev(disk)->type = &disk_type;
 		device_initialize(disk_to_dev(disk));
 	}
+
+	lockdep_init_map(&disk->lockdep_map, lock_name, key, 0);
+
 	return disk;
 }
-EXPORT_SYMBOL(alloc_disk_node);
+EXPORT_SYMBOL(__alloc_disk_node);
 
 struct kobject *get_disk(struct gendisk *disk)
 {
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 6d85a75..9832e3c 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -206,6 +206,9 @@ struct gendisk {
 #endif	/* CONFIG_BLK_DEV_INTEGRITY */
 	int node_id;
 	struct badblocks *bb;
+#ifdef CONFIG_LOCKDEP_COMPLETIONS
+	struct lockdep_map lockdep_map;
+#endif
 };
 
 static inline struct gendisk *part_to_disk(struct hd_struct *part)
@@ -590,8 +593,7 @@ extern struct hd_struct * __must_check add_partition(struct gendisk *disk,
 extern void delete_partition(struct gendisk *, int);
 extern void printk_all_partitions(void);
 
-extern struct gendisk *alloc_disk_node(int minors, int node_id);
-extern struct gendisk *alloc_disk(int minors);
+extern struct gendisk *__alloc_disk_node(int minors, int node_id, struct lock_class_key *key, const char *lock_name);
 extern struct kobject *get_disk(struct gendisk *disk);
 extern void put_disk(struct gendisk *disk);
 extern void blk_register_region(dev_t devt, unsigned long range,
@@ -615,6 +617,22 @@ extern ssize_t part_fail_store(struct device *dev,
 			       const char *buf, size_t count);
 #endif /* CONFIG_FAIL_MAKE_REQUEST */
 
+#ifdef CONFIG_LOCKDEP_COMPLETIONS
+#define alloc_disk_node(m, id) \
+({									\
+	static struct lock_class_key __key;				\
+	const char *__lock_name;					\
+									\
+	__lock_name = "(complete)"#m"("#id")";				\
+									\
+	__alloc_disk_node(m, id, &__key, __lock_name);			\
+})
+#else
+#define alloc_disk_node(m, id)	__alloc_disk_node(m, id, NULL, NULL)
+#endif
+
+#define alloc_disk(m)		alloc_disk_node(m, NUMA_NO_NODE)
+
 static inline int hd_ref_init(struct hd_struct *part)
 {
 	if (percpu_ref_init(&part->ref, __delete_partition, 0,
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/8] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default
  2017-10-24  9:38   ` Byungchul Park
@ 2017-10-24 10:05     ` Ingo Molnar
  -1 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2017-10-24 10:05 UTC (permalink / raw)
  To: Byungchul Park
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team


* Byungchul Park <byungchul.park@lge.com> wrote:

> Johan Hovold reported a performance regression by crossrelease like:

Pplease add Reported-by and Analyzed-by tags - you didn't even Cc: Johan!

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/8] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default
@ 2017-10-24 10:05     ` Ingo Molnar
  0 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2017-10-24 10:05 UTC (permalink / raw)
  To: Byungchul Park
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team


* Byungchul Park <byungchul.park@lge.com> wrote:

> Johan Hovold reported a performance regression by crossrelease like:

Pplease add Reported-by and Analyzed-by tags - you didn't even Cc: Johan!

Thanks,

	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/8] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default
  2017-10-24  9:38   ` Byungchul Park
@ 2017-10-24 10:06     ` Ingo Molnar
  -1 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2017-10-24 10:06 UTC (permalink / raw)
  To: Byungchul Park
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team


Cannot pick up this series yet, but I have enhanced the changelog to:

=============>
Subject: locking/lockdep: Introduce CONFIG_CROSSRELEASE_STACK_TRACE and make it not unwind by default
From: Byungchul Park <byungchul.park@lge.com>
Date: Tue, 24 Oct 2017 18:38:03 +0900

Johan Hovold reported a heavy performance regression caused by
lockdep cross-release:

 > Boot time (from "Linux version" to login prompt) had in fact doubled
 > since 4.13 where it took 17 seconds (with my current config) compared to
 > the 35 seconds I now see with 4.14-rc4.
 >
 > I quick bisect pointed to lockdep and specifically the following commit:
 >
 >	28a903f63ec0 ("locking/lockdep: Handle non(or multi)-acquisition
 >	               of a crosslock")
 >
 > which I've verified is the commit which doubled the boot time (compared
 > to 28a903f63ec0^) (added by lockdep crossrelease series [1]).

Currently crossrelease performs unwind on every acquisition, but that is
very expensive.

This patch makes unwind optional and disables it by default and only records
acquire_ip.

Full stack traces are sometimes required for full analysis, in which
case CROSSRELEASE_STACK_TRACE can be enabled.

On my qemu Ubuntu machine (x86_64, 4 cores, 512M), the regression was
fixed. We measure boot times with 'perf stat --null --repeat 10 $QEMU',
where $QEMU launches a kernel with init=/bin/true:

1. No lockdep enabled:

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.756558155 seconds time elapsed                    ( +-  0.09% )

2. Lockdep enabled:

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.968710420 seconds time elapsed                    ( +-  0.12% )

3. Lockdep enabled + crossrelease enabled:

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       3.153839636 seconds time elapsed                    ( +-  0.31% )

4. Lockdep enabled + crossrelease enabled + this patch applied:

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.963669551 seconds time elapsed                    ( +-  0.11% )

I.e. lockdep-crossrelease performance is now indistinguishable
from vanilla lockdep.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/8] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default
@ 2017-10-24 10:06     ` Ingo Molnar
  0 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2017-10-24 10:06 UTC (permalink / raw)
  To: Byungchul Park
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team


Cannot pick up this series yet, but I have enhanced the changelog to:

=============>
Subject: locking/lockdep: Introduce CONFIG_CROSSRELEASE_STACK_TRACE and make it not unwind by default
From: Byungchul Park <byungchul.park@lge.com>
Date: Tue, 24 Oct 2017 18:38:03 +0900

Johan Hovold reported a heavy performance regression caused by
lockdep cross-release:

 > Boot time (from "Linux version" to login prompt) had in fact doubled
 > since 4.13 where it took 17 seconds (with my current config) compared to
 > the 35 seconds I now see with 4.14-rc4.
 >
 > I quick bisect pointed to lockdep and specifically the following commit:
 >
 >	28a903f63ec0 ("locking/lockdep: Handle non(or multi)-acquisition
 >	               of a crosslock")
 >
 > which I've verified is the commit which doubled the boot time (compared
 > to 28a903f63ec0^) (added by lockdep crossrelease series [1]).

Currently crossrelease performs unwind on every acquisition, but that is
very expensive.

This patch makes unwind optional and disables it by default and only records
acquire_ip.

Full stack traces are sometimes required for full analysis, in which
case CROSSRELEASE_STACK_TRACE can be enabled.

On my qemu Ubuntu machine (x86_64, 4 cores, 512M), the regression was
fixed. We measure boot times with 'perf stat --null --repeat 10 $QEMU',
where $QEMU launches a kernel with init=/bin/true:

1. No lockdep enabled:

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.756558155 seconds time elapsed                    ( +-  0.09% )

2. Lockdep enabled:

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.968710420 seconds time elapsed                    ( +-  0.12% )

3. Lockdep enabled + crossrelease enabled:

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       3.153839636 seconds time elapsed                    ( +-  0.31% )

4. Lockdep enabled + crossrelease enabled + this patch applied:

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.963669551 seconds time elapsed                    ( +-  0.11% )

I.e. lockdep-crossrelease performance is now indistinguishable
from vanilla lockdep.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 3/8] lockdep: Remove BROKEN flag of LOCKDEP_CROSSRELEASE
  2017-10-24  9:38   ` Byungchul Park
@ 2017-10-24 10:06     ` Ingo Molnar
  -1 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2017-10-24 10:06 UTC (permalink / raw)
  To: Byungchul Park
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team


better changelog:

================>
Subject: locking/lockdep: Remove the BROKEN flag from CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS
From: Byungchul Park <byungchul.park@lge.com>
Date: Tue, 24 Oct 2017 18:38:04 +0900

Now that the performance regression is fixed, re-enable
CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 3/8] lockdep: Remove BROKEN flag of LOCKDEP_CROSSRELEASE
@ 2017-10-24 10:06     ` Ingo Molnar
  0 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2017-10-24 10:06 UTC (permalink / raw)
  To: Byungchul Park
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team


better changelog:

================>
Subject: locking/lockdep: Remove the BROKEN flag from CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS
From: Byungchul Park <byungchul.park@lge.com>
Date: Tue, 24 Oct 2017 18:38:04 +0900

Now that the performance regression is fixed, re-enable
CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 4/8] lockdep: Add a kernel parameter, crossrelease_fullstack
  2017-10-24  9:38   ` Byungchul Park
@ 2017-10-24 10:08     ` Ingo Molnar
  -1 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2017-10-24 10:08 UTC (permalink / raw)
  To: Byungchul Park
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team


* Byungchul Park <byungchul.park@lge.com> wrote:

> Make whether to allow recording full stack, in cross-release feature,
> switchable at boot time via a kernel parameter, 'crossrelease_fullstack'.
> In case of a splat with no stack trace, one could just reboot and set
> the kernel parameter to get the full data without having to recompile
> the kernel.
> 
> Change CONFIG_CROSSRELEASE_STACK_TRACE default from N to Y, and
> introduce the new kernel parameter.
> 
> Signed-off-by: Byungchul Park <byungchul.park@lge.com>
> ---
>  Documentation/admin-guide/kernel-parameters.txt |  3 +++
>  kernel/locking/lockdep.c                        | 18 ++++++++++++++++--
>  lib/Kconfig.debug                               |  5 +++--
>  3 files changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index ead7f40..4107b01 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -709,6 +709,9 @@
>  			It will be ignored when crashkernel=X,high is not used
>  			or memory reserved is below 4G.
>  
> +	crossrelease_fullstack
> +			[KNL] Allow to record full stack trace in cross-release
> +
>  	cryptomgr.notests
>                          [KNL] Disable crypto self-tests
>  
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 5c2ddf2..feba887 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -76,6 +76,15 @@
>  #define lock_stat 0
>  #endif
>  
> +static int crossrelease_fullstack;
> +static int __init allow_crossrelease_fullstack(char *str)
> +{
> +	crossrelease_fullstack = 1;
> +	return 0;
> +}
> +
> +early_param("crossrelease_fullstack", allow_crossrelease_fullstack);
> +
>  /*
>   * lockdep_lock: protects the lockdep graph, the hashes and the
>   *               class/list/hash allocators.
> @@ -4864,8 +4873,13 @@ static void add_xhlock(struct held_lock *hlock)
>  	xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
>  	xhlock->trace.entries = xhlock->trace_entries;
>  #ifdef CONFIG_CROSSRELEASE_STACK_TRACE
> -	xhlock->trace.skip = 3;
> -	save_stack_trace(&xhlock->trace);
> +	if (crossrelease_fullstack) {
> +		xhlock->trace.skip = 3;
> +		save_stack_trace(&xhlock->trace);
> +	} else {
> +		xhlock->trace.nr_entries = 1;
> +		xhlock->trace.entries[0] = hlock->acquire_ip;
> +	}
>  #else
>  	xhlock->trace.nr_entries = 1;
>  	xhlock->trace.entries[0] = hlock->acquire_ip;
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index fe8fceb..132536d 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1228,7 +1228,7 @@ config LOCKDEP_COMPLETIONS
>  config CROSSRELEASE_STACK_TRACE
>  	bool "Record more than one entity of stack trace in crossrelease"
>  	depends on LOCKDEP_CROSSRELEASE
> -	default n
> +	default y
>  	help
>  	 The lockdep "cross-release" feature needs to record stack traces
>  	 (of calling functions) for all acquisitions, for eventual later
> @@ -1238,7 +1238,8 @@ config CROSSRELEASE_STACK_TRACE
>  	 full analysis. This option turns on the saving of the full stack
>  	 trace entries.
>  
> -	 If unsure, say N.
> +	 To make the feature actually on, set "crossrelease_fullstack"
> +	 kernel parameter, too.
>  
>  config DEBUG_LOCKDEP
>  	bool "Lock dependency engine debugging"

This is really unnecessarily complex.

The proper logic is to introduce the crossrelease_fullstack boot parameter, and to 
also have a Kconfig option that enables it: 

	CONFIG_BOOTPARAM_LOCKDEP_CROSSRELEASE_FULLSTACK=y

No #ifdefs please - just an "if ()" branch dependent on the current value of 
crossrelease_fullstack.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 4/8] lockdep: Add a kernel parameter, crossrelease_fullstack
@ 2017-10-24 10:08     ` Ingo Molnar
  0 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2017-10-24 10:08 UTC (permalink / raw)
  To: Byungchul Park
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team


* Byungchul Park <byungchul.park@lge.com> wrote:

> Make whether to allow recording full stack, in cross-release feature,
> switchable at boot time via a kernel parameter, 'crossrelease_fullstack'.
> In case of a splat with no stack trace, one could just reboot and set
> the kernel parameter to get the full data without having to recompile
> the kernel.
> 
> Change CONFIG_CROSSRELEASE_STACK_TRACE default from N to Y, and
> introduce the new kernel parameter.
> 
> Signed-off-by: Byungchul Park <byungchul.park@lge.com>
> ---
>  Documentation/admin-guide/kernel-parameters.txt |  3 +++
>  kernel/locking/lockdep.c                        | 18 ++++++++++++++++--
>  lib/Kconfig.debug                               |  5 +++--
>  3 files changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index ead7f40..4107b01 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -709,6 +709,9 @@
>  			It will be ignored when crashkernel=X,high is not used
>  			or memory reserved is below 4G.
>  
> +	crossrelease_fullstack
> +			[KNL] Allow to record full stack trace in cross-release
> +
>  	cryptomgr.notests
>                          [KNL] Disable crypto self-tests
>  
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 5c2ddf2..feba887 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -76,6 +76,15 @@
>  #define lock_stat 0
>  #endif
>  
> +static int crossrelease_fullstack;
> +static int __init allow_crossrelease_fullstack(char *str)
> +{
> +	crossrelease_fullstack = 1;
> +	return 0;
> +}
> +
> +early_param("crossrelease_fullstack", allow_crossrelease_fullstack);
> +
>  /*
>   * lockdep_lock: protects the lockdep graph, the hashes and the
>   *               class/list/hash allocators.
> @@ -4864,8 +4873,13 @@ static void add_xhlock(struct held_lock *hlock)
>  	xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
>  	xhlock->trace.entries = xhlock->trace_entries;
>  #ifdef CONFIG_CROSSRELEASE_STACK_TRACE
> -	xhlock->trace.skip = 3;
> -	save_stack_trace(&xhlock->trace);
> +	if (crossrelease_fullstack) {
> +		xhlock->trace.skip = 3;
> +		save_stack_trace(&xhlock->trace);
> +	} else {
> +		xhlock->trace.nr_entries = 1;
> +		xhlock->trace.entries[0] = hlock->acquire_ip;
> +	}
>  #else
>  	xhlock->trace.nr_entries = 1;
>  	xhlock->trace.entries[0] = hlock->acquire_ip;
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index fe8fceb..132536d 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1228,7 +1228,7 @@ config LOCKDEP_COMPLETIONS
>  config CROSSRELEASE_STACK_TRACE
>  	bool "Record more than one entity of stack trace in crossrelease"
>  	depends on LOCKDEP_CROSSRELEASE
> -	default n
> +	default y
>  	help
>  	 The lockdep "cross-release" feature needs to record stack traces
>  	 (of calling functions) for all acquisitions, for eventual later
> @@ -1238,7 +1238,8 @@ config CROSSRELEASE_STACK_TRACE
>  	 full analysis. This option turns on the saving of the full stack
>  	 trace entries.
>  
> -	 If unsure, say N.
> +	 To make the feature actually on, set "crossrelease_fullstack"
> +	 kernel parameter, too.
>  
>  config DEBUG_LOCKDEP
>  	bool "Lock dependency engine debugging"

This is really unnecessarily complex.

The proper logic is to introduce the crossrelease_fullstack boot parameter, and to 
also have a Kconfig option that enables it: 

	CONFIG_BOOTPARAM_LOCKDEP_CROSSRELEASE_FULLSTACK=y

No #ifdefs please - just an "if ()" branch dependent on the current value of 
crossrelease_fullstack.

Thanks,

	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 7/8] genhd.h: Remove trailing white space
  2017-10-24  9:38   ` Byungchul Park
@ 2017-10-24 10:10     ` Ingo Molnar
  -1 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2017-10-24 10:10 UTC (permalink / raw)
  To: Byungchul Park
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team


* Byungchul Park <byungchul.park@lge.com> wrote:

> Trailing white space is not accepted in kernel coding style. Remove
> them.
> 
> Signed-off-by: Byungchul Park <byungchul.park@lge.com>
> ---
>  include/linux/genhd.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> index ea652bf..6d85a75 100644
> --- a/include/linux/genhd.h
> +++ b/include/linux/genhd.h
> @@ -3,7 +3,7 @@
>  
>  /*
>   * 	genhd.h Copyright (C) 1992 Drew Eckhardt
> - *	Generic hard disk header file by  
> + *	Generic hard disk header file by
>   * 		Drew Eckhardt
>   *
>   *		<drew@colorado.edu>
> @@ -471,7 +471,7 @@ struct bsd_disklabel {
>  	__s16	d_type;			/* drive type */
>  	__s16	d_subtype;		/* controller/d_type specific */
>  	char	d_typename[16];		/* type name, e.g. "eagle" */
> -	char	d_packname[16];			/* pack identifier */ 
> +	char	d_packname[16];			/* pack identifier */
>  	__u32	d_secsize;		/* # of bytes per sector */
>  	__u32	d_nsectors;		/* # of data sectors per track */
>  	__u32	d_ntracks;		/* # of tracks per cylinder */

This patch should not be part of this lockdep series - please send it to Jens 
separately - who might or might not apply it, depending on the subsystem's policy 
regarding whitespace-only patches.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 7/8] genhd.h: Remove trailing white space
@ 2017-10-24 10:10     ` Ingo Molnar
  0 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2017-10-24 10:10 UTC (permalink / raw)
  To: Byungchul Park
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team


* Byungchul Park <byungchul.park@lge.com> wrote:

> Trailing white space is not accepted in kernel coding style. Remove
> them.
> 
> Signed-off-by: Byungchul Park <byungchul.park@lge.com>
> ---
>  include/linux/genhd.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> index ea652bf..6d85a75 100644
> --- a/include/linux/genhd.h
> +++ b/include/linux/genhd.h
> @@ -3,7 +3,7 @@
>  
>  /*
>   * 	genhd.h Copyright (C) 1992 Drew Eckhardt
> - *	Generic hard disk header file by  
> + *	Generic hard disk header file by
>   * 		Drew Eckhardt
>   *
>   *		<drew@colorado.edu>
> @@ -471,7 +471,7 @@ struct bsd_disklabel {
>  	__s16	d_type;			/* drive type */
>  	__s16	d_subtype;		/* controller/d_type specific */
>  	char	d_typename[16];		/* type name, e.g. "eagle" */
> -	char	d_packname[16];			/* pack identifier */ 
> +	char	d_packname[16];			/* pack identifier */
>  	__u32	d_secsize;		/* # of bytes per sector */
>  	__u32	d_nsectors;		/* # of data sectors per track */
>  	__u32	d_ntracks;		/* # of tracks per cylinder */

This patch should not be part of this lockdep series - please send it to Jens 
separately - who might or might not apply it, depending on the subsystem's policy 
regarding whitespace-only patches.

Thanks,

	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 8/8] block: Assign a lock_class per gendisk used for wait_for_completion()
  2017-10-24  9:38   ` Byungchul Park
@ 2017-10-24 10:15     ` Ingo Molnar
  -1 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2017-10-24 10:15 UTC (permalink / raw)
  To: Byungchul Park
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team


* Byungchul Park <byungchul.park@lge.com> wrote:

> Darrick and Dave Chinner posted the following warning:
> 
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 4.14.0-rc1-fixes #1 Tainted: G        W
> > ------------------------------------------------------
> > loop0/31693 is trying to acquire lock:
> >  (&(&ip->i_mmaplock)->mr_lock){++++}, at: [<ffffffffa00f1b0c>] xfs_ilock+0x23c/0x330 [xfs]
> >
> > but now in release context of a crosslock acquired at the following:
> >  ((complete)&ret.event){+.+.}, at: [<ffffffff81326c1f>] submit_bio_wait+0x7f/0xb0
> >
> > which lock already depends on the new lock.
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #2 ((complete)&ret.event){+.+.}:
> >        lock_acquire+0xab/0x200
> >        wait_for_completion_io+0x4e/0x1a0
> >        submit_bio_wait+0x7f/0xb0
> >        blkdev_issue_zeroout+0x71/0xa0
> >        xfs_bmapi_convert_unwritten+0x11f/0x1d0 [xfs]
> >        xfs_bmapi_write+0x374/0x11f0 [xfs]
> >        xfs_iomap_write_direct+0x2ac/0x430 [xfs]
> >        xfs_file_iomap_begin+0x20d/0xd50 [xfs]
> >        iomap_apply+0x43/0xe0
> >        dax_iomap_rw+0x89/0xf0
> >        xfs_file_dax_write+0xcc/0x220 [xfs]
> >        xfs_file_write_iter+0xf0/0x130 [xfs]
> >        __vfs_write+0xd9/0x150
> >        vfs_write+0xc8/0x1c0
> >        SyS_write+0x45/0xa0
> >        entry_SYSCALL_64_fastpath+0x1f/0xbe
> >
> > -> #1 (&xfs_nondir_ilock_class){++++}:
> >        lock_acquire+0xab/0x200
> >        down_write_nested+0x4a/0xb0
> >        xfs_ilock+0x263/0x330 [xfs]
> >        xfs_setattr_size+0x152/0x370 [xfs]
> >        xfs_vn_setattr+0x6b/0x90 [xfs]
> >        notify_change+0x27d/0x3f0
> >        do_truncate+0x5b/0x90
> >        path_openat+0x237/0xa90
> >        do_filp_open+0x8a/0xf0
> >        do_sys_open+0x11c/0x1f0
> >        entry_SYSCALL_64_fastpath+0x1f/0xbe
> >
> > -> #0 (&(&ip->i_mmaplock)->mr_lock){++++}:
> >        up_write+0x1c/0x40
> >        xfs_iunlock+0x1d0/0x310 [xfs]
> >        xfs_file_fallocate+0x8a/0x310 [xfs]
> >        loop_queue_work+0xb7/0x8d0
> >        kthread_worker_fn+0xb9/0x1f0
> >
> > Chain exists of:
> >   &(&ip->i_mmaplock)->mr_lock --> &xfs_nondir_ilock_class --> (complete)&ret.event
> >
> >  Possible unsafe locking scenario by crosslock:
> >
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(&xfs_nondir_ilock_class);
> >   lock((complete)&ret.event);
> >                                lock(&(&ip->i_mmaplock)->mr_lock);
> >                                unlock((complete)&ret.event);
> >
> >                *** DEADLOCK ***
> 
> The warning is a false positive, caused by the fact that all
> wait_for_completion()s in submit_bio_wait() are waiting with the same
> lock class.
> 
> However, some bios have nothing to do with others, for example, the case
> might happen while using loop devices, between bios of an upper device
> and a lower device(=loop device).
> 
> The safest way to assign different lock classes to different devices is
> to do it for each gendisk. In other words, this patch assigns a
> lockdep_map per gendisk and uses it when initializing completion in
> submit_bio_wait().
> 
> Of course, it might be too conservative. But, making it safest for now
> and extended by block layer experts later is good, atm.
> 
> Signed-off-by: Byungchul Park <byungchul.park@lge.com>
> ---
>  block/bio.c           |  2 +-
>  block/genhd.c         | 13 +++++--------
>  include/linux/genhd.h | 22 ++++++++++++++++++++--
>  3 files changed, 26 insertions(+), 11 deletions(-)
> 
> diff --git a/block/bio.c b/block/bio.c
> index 5e901bf..cc60213 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -935,7 +935,7 @@ static void submit_bio_wait_endio(struct bio *bio)
>   */
>  int submit_bio_wait(struct bio *bio)
>  {
> -	DECLARE_COMPLETION_ONSTACK(done);
> +	DECLARE_COMPLETION_ONSTACK_MAP(done, bio->bi_disk->lockdep_map);
>  
>  	bio->bi_private = &done;
>  	bio->bi_end_io = submit_bio_wait_endio;
> diff --git a/block/genhd.c b/block/genhd.c
> index dd305c6..f195d22 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -1354,13 +1354,7 @@ dev_t blk_lookup_devt(const char *name, int partno)
>  }
>  EXPORT_SYMBOL(blk_lookup_devt);
>  
> -struct gendisk *alloc_disk(int minors)
> -{
> -	return alloc_disk_node(minors, NUMA_NO_NODE);
> -}
> -EXPORT_SYMBOL(alloc_disk);
> -
> -struct gendisk *alloc_disk_node(int minors, int node_id)
> +struct gendisk *__alloc_disk_node(int minors, int node_id, struct lock_class_key *key, const char *lock_name)
>  {
>  	struct gendisk *disk;
>  	struct disk_part_tbl *ptbl;
> @@ -1409,9 +1403,12 @@ struct gendisk *alloc_disk_node(int minors, int node_id)
>  		disk_to_dev(disk)->type = &disk_type;
>  		device_initialize(disk_to_dev(disk));
>  	}
> +
> +	lockdep_init_map(&disk->lockdep_map, lock_name, key, 0);

lockdep_init_map() depends on CONFIG_DEBUG_LOCK_ALLOC IIRC, but the data structure 
change you made depends on CONFIG_LOCKDEP_COMPLETIONS:

>  	return disk;
>  }
> -EXPORT_SYMBOL(alloc_disk_node);
> +EXPORT_SYMBOL(__alloc_disk_node);
>  
>  struct kobject *get_disk(struct gendisk *disk)
>  {
> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> index 6d85a75..9832e3c 100644
> --- a/include/linux/genhd.h
> +++ b/include/linux/genhd.h
> @@ -206,6 +206,9 @@ struct gendisk {
>  #endif	/* CONFIG_BLK_DEV_INTEGRITY */
>  	int node_id;
>  	struct badblocks *bb;
> +#ifdef CONFIG_LOCKDEP_COMPLETIONS
> +	struct lockdep_map lockdep_map;
> +#endif
>  };

Which is risking a future build failure at minimum.

Isn't lockdep_map a zero size structure that is always defined? If yes then 
there's no need for an #ifdef.

Also:

>  
>  static inline struct gendisk *part_to_disk(struct hd_struct *part)
> @@ -590,8 +593,7 @@ extern struct hd_struct * __must_check add_partition(struct gendisk *disk,
>  extern void delete_partition(struct gendisk *, int);
>  extern void printk_all_partitions(void);
>  
> -extern struct gendisk *alloc_disk_node(int minors, int node_id);
> -extern struct gendisk *alloc_disk(int minors);
> +extern struct gendisk *__alloc_disk_node(int minors, int node_id, struct lock_class_key *key, const char *lock_name);
>  extern struct kobject *get_disk(struct gendisk *disk);
>  extern void put_disk(struct gendisk *disk);
>  extern void blk_register_region(dev_t devt, unsigned long range,
> @@ -615,6 +617,22 @@ extern ssize_t part_fail_store(struct device *dev,
>  			       const char *buf, size_t count);
>  #endif /* CONFIG_FAIL_MAKE_REQUEST */
>  
> +#ifdef CONFIG_LOCKDEP_COMPLETIONS
> +#define alloc_disk_node(m, id) \
> +({									\
> +	static struct lock_class_key __key;				\
> +	const char *__lock_name;					\
> +									\
> +	__lock_name = "(complete)"#m"("#id")";				\
> +									\
> +	__alloc_disk_node(m, id, &__key, __lock_name);			\
> +})
> +#else
> +#define alloc_disk_node(m, id)	__alloc_disk_node(m, id, NULL, NULL)
> +#endif
> +
> +#define alloc_disk(m)		alloc_disk_node(m, NUMA_NO_NODE)
> +
>  static inline int hd_ref_init(struct hd_struct *part)
>  {
>  	if (percpu_ref_init(&part->ref, __delete_partition, 0,

Why is the lockdep_map passed in to the init function? Since it's wrapped in an 
ugly fashion anyway, why not introduce a clean inline function that calls 
lockdep_init_map() on the returned structure.

No #ifdefs required, and no uglification of the alloc_disk_node() interface.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 8/8] block: Assign a lock_class per gendisk used for wait_for_completion()
@ 2017-10-24 10:15     ` Ingo Molnar
  0 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2017-10-24 10:15 UTC (permalink / raw)
  To: Byungchul Park
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team


* Byungchul Park <byungchul.park@lge.com> wrote:

> Darrick and Dave Chinner posted the following warning:
> 
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 4.14.0-rc1-fixes #1 Tainted: G        W
> > ------------------------------------------------------
> > loop0/31693 is trying to acquire lock:
> >  (&(&ip->i_mmaplock)->mr_lock){++++}, at: [<ffffffffa00f1b0c>] xfs_ilock+0x23c/0x330 [xfs]
> >
> > but now in release context of a crosslock acquired at the following:
> >  ((complete)&ret.event){+.+.}, at: [<ffffffff81326c1f>] submit_bio_wait+0x7f/0xb0
> >
> > which lock already depends on the new lock.
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #2 ((complete)&ret.event){+.+.}:
> >        lock_acquire+0xab/0x200
> >        wait_for_completion_io+0x4e/0x1a0
> >        submit_bio_wait+0x7f/0xb0
> >        blkdev_issue_zeroout+0x71/0xa0
> >        xfs_bmapi_convert_unwritten+0x11f/0x1d0 [xfs]
> >        xfs_bmapi_write+0x374/0x11f0 [xfs]
> >        xfs_iomap_write_direct+0x2ac/0x430 [xfs]
> >        xfs_file_iomap_begin+0x20d/0xd50 [xfs]
> >        iomap_apply+0x43/0xe0
> >        dax_iomap_rw+0x89/0xf0
> >        xfs_file_dax_write+0xcc/0x220 [xfs]
> >        xfs_file_write_iter+0xf0/0x130 [xfs]
> >        __vfs_write+0xd9/0x150
> >        vfs_write+0xc8/0x1c0
> >        SyS_write+0x45/0xa0
> >        entry_SYSCALL_64_fastpath+0x1f/0xbe
> >
> > -> #1 (&xfs_nondir_ilock_class){++++}:
> >        lock_acquire+0xab/0x200
> >        down_write_nested+0x4a/0xb0
> >        xfs_ilock+0x263/0x330 [xfs]
> >        xfs_setattr_size+0x152/0x370 [xfs]
> >        xfs_vn_setattr+0x6b/0x90 [xfs]
> >        notify_change+0x27d/0x3f0
> >        do_truncate+0x5b/0x90
> >        path_openat+0x237/0xa90
> >        do_filp_open+0x8a/0xf0
> >        do_sys_open+0x11c/0x1f0
> >        entry_SYSCALL_64_fastpath+0x1f/0xbe
> >
> > -> #0 (&(&ip->i_mmaplock)->mr_lock){++++}:
> >        up_write+0x1c/0x40
> >        xfs_iunlock+0x1d0/0x310 [xfs]
> >        xfs_file_fallocate+0x8a/0x310 [xfs]
> >        loop_queue_work+0xb7/0x8d0
> >        kthread_worker_fn+0xb9/0x1f0
> >
> > Chain exists of:
> >   &(&ip->i_mmaplock)->mr_lock --> &xfs_nondir_ilock_class --> (complete)&ret.event
> >
> >  Possible unsafe locking scenario by crosslock:
> >
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(&xfs_nondir_ilock_class);
> >   lock((complete)&ret.event);
> >                                lock(&(&ip->i_mmaplock)->mr_lock);
> >                                unlock((complete)&ret.event);
> >
> >                *** DEADLOCK ***
> 
> The warning is a false positive, caused by the fact that all
> wait_for_completion()s in submit_bio_wait() are waiting with the same
> lock class.
> 
> However, some bios have nothing to do with others, for example, the case
> might happen while using loop devices, between bios of an upper device
> and a lower device(=loop device).
> 
> The safest way to assign different lock classes to different devices is
> to do it for each gendisk. In other words, this patch assigns a
> lockdep_map per gendisk and uses it when initializing completion in
> submit_bio_wait().
> 
> Of course, it might be too conservative. But, making it safest for now
> and extended by block layer experts later is good, atm.
> 
> Signed-off-by: Byungchul Park <byungchul.park@lge.com>
> ---
>  block/bio.c           |  2 +-
>  block/genhd.c         | 13 +++++--------
>  include/linux/genhd.h | 22 ++++++++++++++++++++--
>  3 files changed, 26 insertions(+), 11 deletions(-)
> 
> diff --git a/block/bio.c b/block/bio.c
> index 5e901bf..cc60213 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -935,7 +935,7 @@ static void submit_bio_wait_endio(struct bio *bio)
>   */
>  int submit_bio_wait(struct bio *bio)
>  {
> -	DECLARE_COMPLETION_ONSTACK(done);
> +	DECLARE_COMPLETION_ONSTACK_MAP(done, bio->bi_disk->lockdep_map);
>  
>  	bio->bi_private = &done;
>  	bio->bi_end_io = submit_bio_wait_endio;
> diff --git a/block/genhd.c b/block/genhd.c
> index dd305c6..f195d22 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -1354,13 +1354,7 @@ dev_t blk_lookup_devt(const char *name, int partno)
>  }
>  EXPORT_SYMBOL(blk_lookup_devt);
>  
> -struct gendisk *alloc_disk(int minors)
> -{
> -	return alloc_disk_node(minors, NUMA_NO_NODE);
> -}
> -EXPORT_SYMBOL(alloc_disk);
> -
> -struct gendisk *alloc_disk_node(int minors, int node_id)
> +struct gendisk *__alloc_disk_node(int minors, int node_id, struct lock_class_key *key, const char *lock_name)
>  {
>  	struct gendisk *disk;
>  	struct disk_part_tbl *ptbl;
> @@ -1409,9 +1403,12 @@ struct gendisk *alloc_disk_node(int minors, int node_id)
>  		disk_to_dev(disk)->type = &disk_type;
>  		device_initialize(disk_to_dev(disk));
>  	}
> +
> +	lockdep_init_map(&disk->lockdep_map, lock_name, key, 0);

lockdep_init_map() depends on CONFIG_DEBUG_LOCK_ALLOC IIRC, but the data structure 
change you made depends on CONFIG_LOCKDEP_COMPLETIONS:

>  	return disk;
>  }
> -EXPORT_SYMBOL(alloc_disk_node);
> +EXPORT_SYMBOL(__alloc_disk_node);
>  
>  struct kobject *get_disk(struct gendisk *disk)
>  {
> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> index 6d85a75..9832e3c 100644
> --- a/include/linux/genhd.h
> +++ b/include/linux/genhd.h
> @@ -206,6 +206,9 @@ struct gendisk {
>  #endif	/* CONFIG_BLK_DEV_INTEGRITY */
>  	int node_id;
>  	struct badblocks *bb;
> +#ifdef CONFIG_LOCKDEP_COMPLETIONS
> +	struct lockdep_map lockdep_map;
> +#endif
>  };

Which is risking a future build failure at minimum.

Isn't lockdep_map a zero size structure that is always defined? If yes then 
there's no need for an #ifdef.

Also:

>  
>  static inline struct gendisk *part_to_disk(struct hd_struct *part)
> @@ -590,8 +593,7 @@ extern struct hd_struct * __must_check add_partition(struct gendisk *disk,
>  extern void delete_partition(struct gendisk *, int);
>  extern void printk_all_partitions(void);
>  
> -extern struct gendisk *alloc_disk_node(int minors, int node_id);
> -extern struct gendisk *alloc_disk(int minors);
> +extern struct gendisk *__alloc_disk_node(int minors, int node_id, struct lock_class_key *key, const char *lock_name);
>  extern struct kobject *get_disk(struct gendisk *disk);
>  extern void put_disk(struct gendisk *disk);
>  extern void blk_register_region(dev_t devt, unsigned long range,
> @@ -615,6 +617,22 @@ extern ssize_t part_fail_store(struct device *dev,
>  			       const char *buf, size_t count);
>  #endif /* CONFIG_FAIL_MAKE_REQUEST */
>  
> +#ifdef CONFIG_LOCKDEP_COMPLETIONS
> +#define alloc_disk_node(m, id) \
> +({									\
> +	static struct lock_class_key __key;				\
> +	const char *__lock_name;					\
> +									\
> +	__lock_name = "(complete)"#m"("#id")";				\
> +									\
> +	__alloc_disk_node(m, id, &__key, __lock_name);			\
> +})
> +#else
> +#define alloc_disk_node(m, id)	__alloc_disk_node(m, id, NULL, NULL)
> +#endif
> +
> +#define alloc_disk(m)		alloc_disk_node(m, NUMA_NO_NODE)
> +
>  static inline int hd_ref_init(struct hd_struct *part)
>  {
>  	if (percpu_ref_init(&part->ref, __delete_partition, 0,

Why is the lockdep_map passed in to the init function? Since it's wrapped in an 
ugly fashion anyway, why not introduce a clean inline function that calls 
lockdep_init_map() on the returned structure.

No #ifdefs required, and no uglification of the alloc_disk_node() interface.

Thanks,

	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 4/8] lockdep: Add a kernel parameter, crossrelease_fullstack
  2017-10-24 10:08     ` Ingo Molnar
@ 2017-10-24 23:45       ` Byungchul Park
  -1 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24 23:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team

On Tue, Oct 24, 2017 at 12:08:58PM +0200, Ingo Molnar wrote:
> This is really unnecessarily complex.

I mis-understood your suggestion. I will change it.

> The proper logic is to introduce the crossrelease_fullstack boot parameter, and to 
> also have a Kconfig option that enables it: 
> 
> 	CONFIG_BOOTPARAM_LOCKDEP_CROSSRELEASE_FULLSTACK=y
> 
> No #ifdefs please - just an "if ()" branch dependent on the current value of 
> crossrelease_fullstack.

Ok. I will.

Thanks,
Byungchul

> Thanks,
> 
> 	Ingo

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 4/8] lockdep: Add a kernel parameter, crossrelease_fullstack
@ 2017-10-24 23:45       ` Byungchul Park
  0 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-24 23:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team

On Tue, Oct 24, 2017 at 12:08:58PM +0200, Ingo Molnar wrote:
> This is really unnecessarily complex.

I mis-understood your suggestion. I will change it.

> The proper logic is to introduce the crossrelease_fullstack boot parameter, and to 
> also have a Kconfig option that enables it: 
> 
> 	CONFIG_BOOTPARAM_LOCKDEP_CROSSRELEASE_FULLSTACK=y
> 
> No #ifdefs please - just an "if ()" branch dependent on the current value of 
> crossrelease_fullstack.

Ok. I will.

Thanks,
Byungchul

> Thanks,
> 
> 	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 8/8] block: Assign a lock_class per gendisk used for wait_for_completion()
  2017-10-24 10:15     ` Ingo Molnar
@ 2017-10-25  0:26       ` Byungchul Park
  -1 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-25  0:26 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team

On Tue, Oct 24, 2017 at 12:15:51PM +0200, Ingo Molnar wrote:
> > @@ -1409,9 +1403,12 @@ struct gendisk *alloc_disk_node(int minors, int node_id)
> >  		disk_to_dev(disk)->type = &disk_type;
> >  		device_initialize(disk_to_dev(disk));
> >  	}
> > +
> > +	lockdep_init_map(&disk->lockdep_map, lock_name, key, 0);
> 
> lockdep_init_map() depends on CONFIG_DEBUG_LOCK_ALLOC IIRC, but the data structure 
> change you made depends on CONFIG_LOCKDEP_COMPLETIONS:

OMG, my mistake! I am very sorry. I will fix it.

BTW, lockdep_init_map() seems to decide whether using lockdep_map or
ignoring it, depending on CONFIG_LOCKDEP than CONFIG_DEBUG_LOCK_ALLOC.

> >  	return disk;
> >  }
> > -EXPORT_SYMBOL(alloc_disk_node);
> > +EXPORT_SYMBOL(__alloc_disk_node);
> >  
> >  struct kobject *get_disk(struct gendisk *disk)
> >  {
> > diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> > index 6d85a75..9832e3c 100644
> > --- a/include/linux/genhd.h
> > +++ b/include/linux/genhd.h
> > @@ -206,6 +206,9 @@ struct gendisk {
> >  #endif	/* CONFIG_BLK_DEV_INTEGRITY */
> >  	int node_id;
> >  	struct badblocks *bb;
> > +#ifdef CONFIG_LOCKDEP_COMPLETIONS
> > +	struct lockdep_map lockdep_map;
> > +#endif
> >  };
> 
> Which is risking a future build failure at minimum.
> 
> Isn't lockdep_map a zero size structure that is always defined? If yes then 
> there's no need for an #ifdef.

No, a zero size structure for lockdep_map is not provided yet.
There are two options I can do:

1. Add a zero size structure for lockdep_map and remove #ifdef
2. Replace CONFIG_LOCKDEP_COMPLETIONS with CONFIG_LOCKDEP here.

Or something else?

Which one do you prefer?

> Also:
> 
> >  
> >  static inline struct gendisk *part_to_disk(struct hd_struct *part)
> > @@ -590,8 +593,7 @@ extern struct hd_struct * __must_check add_partition(struct gendisk *disk,
> >  extern void delete_partition(struct gendisk *, int);
> >  extern void printk_all_partitions(void);
> >  
> > -extern struct gendisk *alloc_disk_node(int minors, int node_id);
> > -extern struct gendisk *alloc_disk(int minors);
> > +extern struct gendisk *__alloc_disk_node(int minors, int node_id, struct lock_class_key *key, const char *lock_name);
> >  extern struct kobject *get_disk(struct gendisk *disk);
> >  extern void put_disk(struct gendisk *disk);
> >  extern void blk_register_region(dev_t devt, unsigned long range,
> > @@ -615,6 +617,22 @@ extern ssize_t part_fail_store(struct device *dev,
> >  			       const char *buf, size_t count);
> >  #endif /* CONFIG_FAIL_MAKE_REQUEST */
> >  
> > +#ifdef CONFIG_LOCKDEP_COMPLETIONS
> > +#define alloc_disk_node(m, id) \
> > +({									\
> > +	static struct lock_class_key __key;				\
> > +	const char *__lock_name;					\
> > +									\
> > +	__lock_name = "(complete)"#m"("#id")";				\
> > +									\
> > +	__alloc_disk_node(m, id, &__key, __lock_name);			\
> > +})
> > +#else
> > +#define alloc_disk_node(m, id)	__alloc_disk_node(m, id, NULL, NULL)
> > +#endif
> > +
> > +#define alloc_disk(m)		alloc_disk_node(m, NUMA_NO_NODE)
> > +
> >  static inline int hd_ref_init(struct hd_struct *part)
> >  {
> >  	if (percpu_ref_init(&part->ref, __delete_partition, 0,
> 
> Why is the lockdep_map passed in to the init function? Since it's wrapped in an 
> ugly fashion anyway, why not introduce a clean inline function that calls 

This is the way workqueue adopted for that purpose. BTW, can I make
a lock_class_key distinguishable from another of a different gendisk,
using inline function?

> lockdep_init_map() on the returned structure.

Ok. I will make it work on the returned structre instead of passing it.

> No #ifdefs required, and no uglification of the alloc_disk_node() interface.

Ok. I will remove this #ifdef.

Thank you very much.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 8/8] block: Assign a lock_class per gendisk used for wait_for_completion()
@ 2017-10-25  0:26       ` Byungchul Park
  0 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-25  0:26 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team

On Tue, Oct 24, 2017 at 12:15:51PM +0200, Ingo Molnar wrote:
> > @@ -1409,9 +1403,12 @@ struct gendisk *alloc_disk_node(int minors, int node_id)
> >  		disk_to_dev(disk)->type = &disk_type;
> >  		device_initialize(disk_to_dev(disk));
> >  	}
> > +
> > +	lockdep_init_map(&disk->lockdep_map, lock_name, key, 0);
> 
> lockdep_init_map() depends on CONFIG_DEBUG_LOCK_ALLOC IIRC, but the data structure 
> change you made depends on CONFIG_LOCKDEP_COMPLETIONS:

OMG, my mistake! I am very sorry. I will fix it.

BTW, lockdep_init_map() seems to decide whether using lockdep_map or
ignoring it, depending on CONFIG_LOCKDEP than CONFIG_DEBUG_LOCK_ALLOC.

> >  	return disk;
> >  }
> > -EXPORT_SYMBOL(alloc_disk_node);
> > +EXPORT_SYMBOL(__alloc_disk_node);
> >  
> >  struct kobject *get_disk(struct gendisk *disk)
> >  {
> > diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> > index 6d85a75..9832e3c 100644
> > --- a/include/linux/genhd.h
> > +++ b/include/linux/genhd.h
> > @@ -206,6 +206,9 @@ struct gendisk {
> >  #endif	/* CONFIG_BLK_DEV_INTEGRITY */
> >  	int node_id;
> >  	struct badblocks *bb;
> > +#ifdef CONFIG_LOCKDEP_COMPLETIONS
> > +	struct lockdep_map lockdep_map;
> > +#endif
> >  };
> 
> Which is risking a future build failure at minimum.
> 
> Isn't lockdep_map a zero size structure that is always defined? If yes then 
> there's no need for an #ifdef.

No, a zero size structure for lockdep_map is not provided yet.
There are two options I can do:

1. Add a zero size structure for lockdep_map and remove #ifdef
2. Replace CONFIG_LOCKDEP_COMPLETIONS with CONFIG_LOCKDEP here.

Or something else?

Which one do you prefer?

> Also:
> 
> >  
> >  static inline struct gendisk *part_to_disk(struct hd_struct *part)
> > @@ -590,8 +593,7 @@ extern struct hd_struct * __must_check add_partition(struct gendisk *disk,
> >  extern void delete_partition(struct gendisk *, int);
> >  extern void printk_all_partitions(void);
> >  
> > -extern struct gendisk *alloc_disk_node(int minors, int node_id);
> > -extern struct gendisk *alloc_disk(int minors);
> > +extern struct gendisk *__alloc_disk_node(int minors, int node_id, struct lock_class_key *key, const char *lock_name);
> >  extern struct kobject *get_disk(struct gendisk *disk);
> >  extern void put_disk(struct gendisk *disk);
> >  extern void blk_register_region(dev_t devt, unsigned long range,
> > @@ -615,6 +617,22 @@ extern ssize_t part_fail_store(struct device *dev,
> >  			       const char *buf, size_t count);
> >  #endif /* CONFIG_FAIL_MAKE_REQUEST */
> >  
> > +#ifdef CONFIG_LOCKDEP_COMPLETIONS
> > +#define alloc_disk_node(m, id) \
> > +({									\
> > +	static struct lock_class_key __key;				\
> > +	const char *__lock_name;					\
> > +									\
> > +	__lock_name = "(complete)"#m"("#id")";				\
> > +									\
> > +	__alloc_disk_node(m, id, &__key, __lock_name);			\
> > +})
> > +#else
> > +#define alloc_disk_node(m, id)	__alloc_disk_node(m, id, NULL, NULL)
> > +#endif
> > +
> > +#define alloc_disk(m)		alloc_disk_node(m, NUMA_NO_NODE)
> > +
> >  static inline int hd_ref_init(struct hd_struct *part)
> >  {
> >  	if (percpu_ref_init(&part->ref, __delete_partition, 0,
> 
> Why is the lockdep_map passed in to the init function? Since it's wrapped in an 
> ugly fashion anyway, why not introduce a clean inline function that calls 

This is the way workqueue adopted for that purpose. BTW, can I make
a lock_class_key distinguishable from another of a different gendisk,
using inline function?

> lockdep_init_map() on the returned structure.

Ok. I will make it work on the returned structre instead of passing it.

> No #ifdefs required, and no uglification of the alloc_disk_node() interface.

Ok. I will remove this #ifdef.

Thank you very much.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/8] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default
  2017-10-24 10:05     ` Ingo Molnar
@ 2017-10-25  1:01       ` Byungchul Park
  -1 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-25  1:01 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team

On Tue, Oct 24, 2017 at 12:05:16PM +0200, Ingo Molnar wrote:
> 
> * Byungchul Park <byungchul.park@lge.com> wrote:
> 
> > Johan Hovold reported a performance regression by crossrelease like:
> 
> Pplease add Reported-by and Analyzed-by tags - you didn't even Cc: Johan!

Excuse me but, I am sure, whom is the issue analyzed by? Me?

> Thanks,
> 
> 	Ingo

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/8] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default
@ 2017-10-25  1:01       ` Byungchul Park
  0 siblings, 0 replies; 40+ messages in thread
From: Byungchul Park @ 2017-10-25  1:01 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team

On Tue, Oct 24, 2017 at 12:05:16PM +0200, Ingo Molnar wrote:
> 
> * Byungchul Park <byungchul.park@lge.com> wrote:
> 
> > Johan Hovold reported a performance regression by crossrelease like:
> 
> Pplease add Reported-by and Analyzed-by tags - you didn't even Cc: Johan!

Excuse me but, I am sure, whom is the issue analyzed by? Me?

> Thanks,
> 
> 	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/8] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default
  2017-10-25  1:01       ` Byungchul Park
@ 2017-10-25  5:53         ` Ingo Molnar
  -1 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2017-10-25  5:53 UTC (permalink / raw)
  To: Byungchul Park
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team


* Byungchul Park <byungchul.park@lge.com> wrote:

> On Tue, Oct 24, 2017 at 12:05:16PM +0200, Ingo Molnar wrote:
> > 
> > * Byungchul Park <byungchul.park@lge.com> wrote:
> > 
> > > Johan Hovold reported a performance regression by crossrelease like:
> > 
> > Pplease add Reported-by and Analyzed-by tags - you didn't even Cc: Johan!
> 
> Excuse me but, I am sure, whom is the issue analyzed by? Me?

Well, Johan tracked it all down for us, Thomas gave the right suggestion to fix 
the performance regression, so I meant something like:

  Reported-by: Johan Hovold <johan@kernel.org>
  Bisected-by: Johan Hovold <johan@kernel.org>
  Analyzed-by: Thomas Gleixner <tglx@linutronix.de>

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/8] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default
@ 2017-10-25  5:53         ` Ingo Molnar
  0 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2017-10-25  5:53 UTC (permalink / raw)
  To: Byungchul Park
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team


* Byungchul Park <byungchul.park@lge.com> wrote:

> On Tue, Oct 24, 2017 at 12:05:16PM +0200, Ingo Molnar wrote:
> > 
> > * Byungchul Park <byungchul.park@lge.com> wrote:
> > 
> > > Johan Hovold reported a performance regression by crossrelease like:
> > 
> > Pplease add Reported-by and Analyzed-by tags - you didn't even Cc: Johan!
> 
> Excuse me but, I am sure, whom is the issue analyzed by? Me?

Well, Johan tracked it all down for us, Thomas gave the right suggestion to fix 
the performance regression, so I meant something like:

  Reported-by: Johan Hovold <johan@kernel.org>
  Bisected-by: Johan Hovold <johan@kernel.org>
  Analyzed-by: Thomas Gleixner <tglx@linutronix.de>

Thanks,

	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 8/8] block: Assign a lock_class per gendisk used for wait_for_completion()
  2017-10-25  0:26       ` Byungchul Park
@ 2017-10-25  5:55         ` Ingo Molnar
  -1 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2017-10-25  5:55 UTC (permalink / raw)
  To: Byungchul Park
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team


* Byungchul Park <byungchul.park@lge.com> wrote:

> > Isn't lockdep_map a zero size structure that is always defined? If yes then 
> > there's no need for an #ifdef.
> 
> No, a zero size structure for lockdep_map is not provided yet.
> There are two options I can do:
> 
> 1. Add a zero size structure for lockdep_map and remove #ifdef
> 2. Replace CONFIG_LOCKDEP_COMPLETIONS with CONFIG_LOCKDEP here.
> 
> Or something else?
> 
> Which one do you prefer?

Ok, could we try #1 in a new patch and re-spin the simplified block layer patch on 
top of that?

The less ugly a debug facility's impact on unrelated kernel is, the better - 
especially when it comes to annotating false positives.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 8/8] block: Assign a lock_class per gendisk used for wait_for_completion()
@ 2017-10-25  5:55         ` Ingo Molnar
  0 siblings, 0 replies; 40+ messages in thread
From: Ingo Molnar @ 2017-10-25  5:55 UTC (permalink / raw)
  To: Byungchul Park
  Cc: peterz, axboe, tglx, linux-kernel, linux-mm, tj, johannes.berg,
	oleg, amir73il, david, darrick.wong, linux-xfs, linux-fsdevel,
	linux-block, hch, idryomov, kernel-team


* Byungchul Park <byungchul.park@lge.com> wrote:

> > Isn't lockdep_map a zero size structure that is always defined? If yes then 
> > there's no need for an #ifdef.
> 
> No, a zero size structure for lockdep_map is not provided yet.
> There are two options I can do:
> 
> 1. Add a zero size structure for lockdep_map and remove #ifdef
> 2. Replace CONFIG_LOCKDEP_COMPLETIONS with CONFIG_LOCKDEP here.
> 
> Or something else?
> 
> Which one do you prefer?

Ok, could we try #1 in a new patch and re-spin the simplified block layer patch on 
top of that?

The less ugly a debug facility's impact on unrelated kernel is, the better - 
especially when it comes to annotating false positives.

Thanks,

	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2017-10-25  5:55 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-24  9:38 [PATCH v3 0/8] cross-release: enhence performance and fix false positives Byungchul Park
2017-10-24  9:38 ` Byungchul Park
2017-10-24  9:38 ` [PATCH v3 1/8] block: use DECLARE_COMPLETION_ONSTACK in submit_bio_wait Byungchul Park
2017-10-24  9:38   ` Byungchul Park
2017-10-24  9:38 ` [PATCH v3 2/8] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default Byungchul Park
2017-10-24  9:38   ` Byungchul Park
2017-10-24 10:05   ` Ingo Molnar
2017-10-24 10:05     ` Ingo Molnar
2017-10-25  1:01     ` Byungchul Park
2017-10-25  1:01       ` Byungchul Park
2017-10-25  5:53       ` Ingo Molnar
2017-10-25  5:53         ` Ingo Molnar
2017-10-24 10:06   ` Ingo Molnar
2017-10-24 10:06     ` Ingo Molnar
2017-10-24  9:38 ` [PATCH v3 3/8] lockdep: Remove BROKEN flag of LOCKDEP_CROSSRELEASE Byungchul Park
2017-10-24  9:38   ` Byungchul Park
2017-10-24 10:06   ` Ingo Molnar
2017-10-24 10:06     ` Ingo Molnar
2017-10-24  9:38 ` [PATCH v3 4/8] lockdep: Add a kernel parameter, crossrelease_fullstack Byungchul Park
2017-10-24  9:38   ` Byungchul Park
2017-10-24 10:08   ` Ingo Molnar
2017-10-24 10:08     ` Ingo Molnar
2017-10-24 23:45     ` Byungchul Park
2017-10-24 23:45       ` Byungchul Park
2017-10-24  9:38 ` [PATCH v3 5/8] completion: Add support for initializing completion with lockdep_map Byungchul Park
2017-10-24  9:38   ` Byungchul Park
2017-10-24  9:38 ` [PATCH v3 6/8] lockdep: Remove unnecessary acquisitions wrt workqueue flush Byungchul Park
2017-10-24  9:38   ` Byungchul Park
2017-10-24  9:38 ` [PATCH v3 7/8] genhd.h: Remove trailing white space Byungchul Park
2017-10-24  9:38   ` Byungchul Park
2017-10-24 10:10   ` Ingo Molnar
2017-10-24 10:10     ` Ingo Molnar
2017-10-24  9:38 ` [PATCH v3 8/8] block: Assign a lock_class per gendisk used for wait_for_completion() Byungchul Park
2017-10-24  9:38   ` Byungchul Park
2017-10-24 10:15   ` Ingo Molnar
2017-10-24 10:15     ` Ingo Molnar
2017-10-25  0:26     ` Byungchul Park
2017-10-25  0:26       ` Byungchul Park
2017-10-25  5:55       ` Ingo Molnar
2017-10-25  5:55         ` Ingo Molnar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.