linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/5] mmu notifier debug annotations
@ 2019-08-26 20:14 Daniel Vetter
  2019-08-26 20:14 ` [PATCH 1/5] mm, notifier: Add a lockdep map for invalidate_range_start/end Daniel Vetter
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Daniel Vetter @ 2019-08-26 20:14 UTC (permalink / raw)
  To: LKML; +Cc: Linux MM, DRI Development, Daniel Vetter

Hi all,

Next round. Changes:

- I kept the two lockdep annotation patches since linux-next didn't yet
  have them when I rebased this before retesting. Otherwise unchanged
  except for a trivial conflict.

- Ack from Peter Z. on the kernel.h patch.

- Added annotations for non_block to invalidate_range_end. I can't test
  that readily since i915 doesn't use it.

- Added might_sleep annotations to make sure the mm side also keeps up
  its side of the contract around what's allowed and what's not.

Comments, feedback, review as usual very much appreciated.

Cheers, Daniel

Daniel Vetter (5):
  mm, notifier: Add a lockdep map for invalidate_range_start/end
  mm, notifier: Prime lockdep
  kernel.h: Add non_block_start/end()
  mm, notifier: Catch sleeping/blocking for !blockable
  mm, notifier: annotate with might_sleep()

 include/linux/kernel.h       | 25 ++++++++++++++++++++++++-
 include/linux/mmu_notifier.h | 13 +++++++++++++
 include/linux/sched.h        |  4 ++++
 kernel/sched/core.c          | 19 ++++++++++++++-----
 mm/mmu_notifier.c            | 31 +++++++++++++++++++++++++++++--
 5 files changed, 84 insertions(+), 8 deletions(-)

-- 
2.23.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/5] mm, notifier: Add a lockdep map for invalidate_range_start/end
  2019-08-26 20:14 [PATCH 0/5] mmu notifier debug annotations Daniel Vetter
@ 2019-08-26 20:14 ` Daniel Vetter
  2019-08-26 20:14 ` [PATCH 2/5] mm, notifier: Prime lockdep Daniel Vetter
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Daniel Vetter @ 2019-08-26 20:14 UTC (permalink / raw)
  To: LKML
  Cc: Linux MM, DRI Development, Daniel Vetter, Jason Gunthorpe,
	Chris Wilson, Andrew Morton, David Rientjes,
	Jérôme Glisse, Michal Hocko, Christian König,
	Greg Kroah-Hartman, Mike Rapoport, Jason Gunthorpe,
	Daniel Vetter

This is a similar idea to the fs_reclaim fake lockdep lock. It's
fairly easy to provoke a specific notifier to be run on a specific
range: Just prep it, and then munmap() it.

A bit harder, but still doable, is to provoke the mmu notifiers for
all the various callchains that might lead to them. But hitting both
at the same time is really hard to do reliably, especially when you
want to exercise paths like direct reclaim or compaction, where it's
not easy to control what exactly will be unmapped.

By introducing a lockdep map to tie them all together we allow lockdep
to see a lot more dependencies, without having to actually hit them
in a single callchain while testing.

On Jason's suggestion this is rolled out for both
invalidate_range_start and invalidate_range_end. They both have the
same calling context, hence we can share the same lockdep map. Note
that the annotation for invalidate_range_start is outside of the
mm_has_notifiers(), to make sure lockdep is informed about all paths
leading to this context irrespective of whether mmu notifiers are
present for a given context. We don't do that on the
invalidate_range_end side to avoid paying the overhead twice, there
the lockdep annotation is pushed down behind the mm_has_notifiers()
check.
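The mechanism can be sketched in plain userspace C. This is purely
illustrative (it is not the kernel's lockdep; the tiny checker, its
names and the two "locks" below are made up for the example): because
every notifier path acquires the same fake map, an ordering learned
through one callchain is flagged when any other chain inverts it.

```c
/* Toy dependency checker: a shared fake "lockdep map" lets an ordering
 * observed in one path catch an inversion in a completely different path. */
#include <assert.h>
#include <string.h>

#define MAX_HELD 8
#define MAX_EDGES 64

static const char *held[MAX_HELD];
static int n_held;
static const char *edge_from[MAX_EDGES], *edge_to[MAX_EDGES];
static int n_edges;

static int edge_seen(const char *a, const char *b)
{
	for (int i = 0; i < n_edges; i++)
		if (!strcmp(edge_from[i], a) && !strcmp(edge_to[i], b))
			return 1;
	return 0;
}

/* Returns 1 if acquiring @name inverts an already-recorded ordering. */
static int map_acquire(const char *name)
{
	int bad = 0;

	for (int i = 0; i < n_held; i++) {
		if (edge_seen(name, held[i]))
			bad = 1;	/* name was taken before held[i] elsewhere */
		else if (!edge_seen(held[i], name)) {
			edge_from[n_edges] = held[i];
			edge_to[n_edges] = name;
			n_edges++;
		}
	}
	held[n_held++] = name;
	return bad;
}

static void map_release(const char *name)
{
	assert(n_held > 0 && !strcmp(held[n_held - 1], name));
	n_held--;
}

/* Both notifier entry points share ONE map, like
 * invalidate_range_start and invalidate_range_end do in this patch. */
static int invalidate_start(void)
{
	return map_acquire("mmu_notifier_invalidate");
}

static void invalidate_start_done(void)
{
	map_release("mmu_notifier_invalidate");
}
```

Once a driver lock has been observed nested inside the notifier map,
calling the notifier while holding that driver lock is reported without
the two paths ever having to race in the same run.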

v2: Use lock_map_acquire/release() like fs_reclaim, to avoid confusion
with this being a real mutex (Chris Wilson).

v3: Rebase on top of Glisse's arg rework.

v4: Also annotate invalidate_range_end (Jason Gunthorpe)
Also annotate invalidate_range_start_nonblock, I somehow missed that
one in the first version.

Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: linux-mm@kvack.org
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 include/linux/mmu_notifier.h | 8 ++++++++
 mm/mmu_notifier.c            | 9 +++++++++
 2 files changed, 17 insertions(+)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 31aa971315a1..3f9829a1f32e 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -42,6 +42,10 @@ enum mmu_notifier_event {
 
 #ifdef CONFIG_MMU_NOTIFIER
 
+#ifdef CONFIG_LOCKDEP
+extern struct lockdep_map __mmu_notifier_invalidate_range_start_map;
+#endif
+
 /*
  * The mmu notifier_mm structure is allocated and installed in
  * mm->mmu_notifier_mm inside the mm_take_all_locks() protected
@@ -341,19 +345,23 @@ static inline void mmu_notifier_change_pte(struct mm_struct *mm,
 static inline void
 mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
 {
+	lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
 	if (mm_has_notifiers(range->mm)) {
 		range->flags |= MMU_NOTIFIER_RANGE_BLOCKABLE;
 		__mmu_notifier_invalidate_range_start(range);
 	}
+	lock_map_release(&__mmu_notifier_invalidate_range_start_map);
 }
 
 static inline int
 mmu_notifier_invalidate_range_start_nonblock(struct mmu_notifier_range *range)
 {
+	lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
 	if (mm_has_notifiers(range->mm)) {
 		range->flags &= ~MMU_NOTIFIER_RANGE_BLOCKABLE;
 		return __mmu_notifier_invalidate_range_start(range);
 	}
+	lock_map_release(&__mmu_notifier_invalidate_range_start_map);
 	return 0;
 }
 
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index d76ea27e2bbb..d48d3b2abd68 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -21,6 +21,13 @@
 /* global SRCU for all MMs */
 DEFINE_STATIC_SRCU(srcu);
 
+#ifdef CONFIG_LOCKDEP
+struct lockdep_map __mmu_notifier_invalidate_range_start_map = {
+	.name = "mmu_notifier_invalidate_range_start"
+};
+EXPORT_SYMBOL_GPL(__mmu_notifier_invalidate_range_start_map);
+#endif
+
 /*
  * This function allows mmu_notifier::release callback to delay a call to
  * a function that will free appropriate resources. The function must be
@@ -197,6 +204,7 @@ void __mmu_notifier_invalidate_range_end(struct mmu_notifier_range *range,
 	struct mmu_notifier *mn;
 	int id;
 
+	lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
 	id = srcu_read_lock(&srcu);
 	hlist_for_each_entry_rcu(mn, &range->mm->mmu_notifier_mm->list, hlist) {
 		/*
@@ -220,6 +228,7 @@ void __mmu_notifier_invalidate_range_end(struct mmu_notifier_range *range,
 			mn->ops->invalidate_range_end(mn, range);
 	}
 	srcu_read_unlock(&srcu, id);
+	lock_map_release(&__mmu_notifier_invalidate_range_start_map);
 }
 EXPORT_SYMBOL_GPL(__mmu_notifier_invalidate_range_end);
 
-- 
2.23.0



* [PATCH 2/5] mm, notifier: Prime lockdep
  2019-08-26 20:14 [PATCH 0/5] mmu notifier debug annotations Daniel Vetter
  2019-08-26 20:14 ` [PATCH 1/5] mm, notifier: Add a lockdep map for invalidate_range_start/end Daniel Vetter
@ 2019-08-26 20:14 ` Daniel Vetter
  2019-08-26 20:14 ` [PATCH 3/5] kernel.h: Add non_block_start/end() Daniel Vetter
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Daniel Vetter @ 2019-08-26 20:14 UTC (permalink / raw)
  To: LKML
  Cc: Linux MM, DRI Development, Daniel Vetter, Jason Gunthorpe,
	Chris Wilson, Andrew Morton, David Rientjes,
	Jérôme Glisse, Michal Hocko, Christian König,
	Greg Kroah-Hartman, Mike Rapoport, Jason Gunthorpe,
	Daniel Vetter

We want to teach lockdep that mmu notifiers can be called from direct
reclaim paths, since on many CI systems load might never reach that
level (e.g. when just running a fuzzer or small functional tests).

Motivated by a discussion with Jason.

I've put the annotation into mmu_notifier_register since only when we
have mmu notifiers registered is there any point in teaching lockdep
about them. Also, we already have a kmalloc(..., GFP_KERNEL), so this is
safe.
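The priming idea can be modeled in a few lines of userspace C (an
illustrative sketch only; `prime_at_register()`, `alloc_gfp_kernel()`
and the flags are invented stand-ins, not kernel API): taking the fake
locks once at registration, in the order real reclaim would take them,
means a later allocation from inside a notifier closes the loop and is
caught immediately, without real reclaim ever recursing into a notifier
during the test run.

```c
/* Toy model of priming a dependency checker at registration time. */
#include <assert.h>

static int edge_reclaim_then_notifier;	/* ordering learned by priming */
static int in_notifier;			/* "holding" the notifier map */

static void prime_at_register(void)
{
	/* Kernel analogue: fs_reclaim_acquire(); lock_map_acquire(&map);
	 * lock_map_release(&map); fs_reclaim_release(); */
	edge_reclaim_then_notifier = 1;
}

static void notifier_enter(void) { in_notifier = 1; }
static void notifier_exit(void)  { in_notifier = 0; }

/* Returns 1 if this allocation would close the reclaim <-> notifier loop. */
static int alloc_gfp_kernel(void)
{
	return edge_reclaim_then_notifier && in_notifier;
}
```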

Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: linux-mm@kvack.org
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 mm/mmu_notifier.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index d48d3b2abd68..0523555933c9 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -259,6 +259,13 @@ int __mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
 	lockdep_assert_held_write(&mm->mmap_sem);
 	BUG_ON(atomic_read(&mm->mm_users) <= 0);
 
+	if (IS_ENABLED(CONFIG_LOCKDEP)) {
+		fs_reclaim_acquire(GFP_KERNEL);
+		lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
+		lock_map_release(&__mmu_notifier_invalidate_range_start_map);
+		fs_reclaim_release(GFP_KERNEL);
+	}
+
 	mn->mm = mm;
 	mn->users = 1;
 
-- 
2.23.0



* [PATCH 3/5] kernel.h: Add non_block_start/end()
  2019-08-26 20:14 [PATCH 0/5] mmu notifier debug annotations Daniel Vetter
  2019-08-26 20:14 ` [PATCH 1/5] mm, notifier: Add a lockdep map for invalidate_range_start/end Daniel Vetter
  2019-08-26 20:14 ` [PATCH 2/5] mm, notifier: Prime lockdep Daniel Vetter
@ 2019-08-26 20:14 ` Daniel Vetter
  2019-08-27 22:50   ` Jason Gunthorpe
  2019-08-28 11:43   ` Michal Hocko
  2019-08-26 20:14 ` [PATCH 4/5] mm, notifier: Catch sleeping/blocking for !blockable Daniel Vetter
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 15+ messages in thread
From: Daniel Vetter @ 2019-08-26 20:14 UTC (permalink / raw)
  To: LKML
  Cc: Linux MM, DRI Development, Daniel Vetter, Jason Gunthorpe,
	Peter Zijlstra, Ingo Molnar, Andrew Morton, Michal Hocko,
	David Rientjes, Christian König, Jérôme Glisse,
	Masahiro Yamada, Wei Wang, Andy Shevchenko, Thomas Gleixner,
	Jann Horn, Feng Tang, Kees Cook, Randy Dunlap, Daniel Vetter

In some special cases we must not block, but there's no spinlock,
preempt-off, irqs-off or similar critical section already in place
that would arm the might_sleep() debug checks. Add a
non_block_start/end() pair to annotate these.

This will be used in the oom paths of mmu-notifiers, where blocking is
not allowed to make sure there's forward progress. Quoting Michal:

"The notifier is called from quite a restricted context - oom_reaper -
which shouldn't depend on any locks or sleepable conditionals. The code
should be swift as well but we mostly do care about it to make a forward
progress. Checking for sleepable context is the best thing we could come
up with that would describe these demands at least partially."

Peter also asked whether we want to catch spinlocks on top, but Michal
said those are less of a problem because spinlocks can't have an
indirect dependency upon the page allocator and hence close the loop
with the oom reaper.
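The counter semantics can be sketched in userspace C (illustrative
only; the kernel version keeps the counter in task_struct, post-
decrements and uses WARN_ON rather than a return value):

```c
/* Toy version of the non_block_start/end counter consulted by
 * might_sleep()-style checks.  Nesting is allowed; an unbalanced
 * non_block_end() is itself a bug. */
#include <assert.h>

static int non_block_count;	/* current->non_block_count in the kernel */

static void non_block_start(void)
{
	non_block_count++;
}

/* Returns 1 on an unbalanced end (the kernel WARNs instead). */
static int non_block_end(void)
{
	if (non_block_count == 0)
		return 1;
	non_block_count--;
	return 0;
}

/* Returns 1 if sleeping here would be a bug. */
static int might_sleep_check(void)
{
	return non_block_count > 0;
}
```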

Suggested by Michal Hocko.

v2:
- Improve commit message (Michal)
- Also check in schedule, not just might_sleep (Peter)

v3: It works better when I actually squash in the fixup I had lying
around :-/

v4: Pick the suggestion from Andrew Morton to give non_block_start/end
some good kerneldoc comments. I added that other blocking calls like
wait_event pose similar issues, since that's the other example we
discussed.

Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: linux-mm@kvack.org
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Wei Wang <wvw@google.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jann Horn <jannh@google.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-kernel@vger.kernel.org
Acked-by: Christian König <christian.koenig@amd.com> (v1)
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 include/linux/kernel.h | 25 ++++++++++++++++++++++++-
 include/linux/sched.h  |  4 ++++
 kernel/sched/core.c    | 19 ++++++++++++++-----
 3 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 4fa360a13c1e..82f84cfe372f 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -217,7 +217,9 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
  * might_sleep - annotation for functions that can sleep
  *
  * this macro will print a stack trace if it is executed in an atomic
- * context (spinlock, irq-handler, ...).
+ * context (spinlock, irq-handler, ...). Additional sections where blocking is
+ * not allowed can be annotated with non_block_start() and non_block_end()
+ * pairs.
  *
  * This is a useful debugging help to be able to catch problems early and not
  * be bitten later when the calling function happens to sleep when it is not
@@ -233,6 +235,25 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
 # define cant_sleep() \
 	do { __cant_sleep(__FILE__, __LINE__, 0); } while (0)
 # define sched_annotate_sleep()	(current->task_state_change = 0)
+/**
+ * non_block_start - annotate the start of section where sleeping is prohibited
+ *
+ * This is on behalf of the oom reaper, specifically when it is calling the mmu
+ * notifiers. The problem is that if the notifier were to block on, for example,
+ * mutex_lock() and if the process which holds that mutex were to perform a
+ * sleeping memory allocation, the oom reaper is now blocked on completion of
+ * that memory allocation. Other blocking calls like wait_event() pose similar
+ * issues.
+ */
+# define non_block_start() \
+	do { current->non_block_count++; } while (0)
+/**
+ * non_block_end - annotate the end of section where sleeping is prohibited
+ *
+ * Closes a section opened by non_block_start().
+ */
+# define non_block_end() \
+	do { WARN_ON(current->non_block_count-- == 0); } while (0)
 #else
   static inline void ___might_sleep(const char *file, int line,
 				   int preempt_offset) { }
@@ -241,6 +262,8 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
 # define might_sleep() do { might_resched(); } while (0)
 # define cant_sleep() do { } while (0)
 # define sched_annotate_sleep() do { } while (0)
+# define non_block_start() do { } while (0)
+# define non_block_end() do { } while (0)
 #endif
 
 #define might_sleep_if(cond) do { if (cond) might_sleep(); } while (0)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index b6ec130dff9b..e8bb965f5019 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -980,6 +980,10 @@ struct task_struct {
 	struct mutex_waiter		*blocked_on;
 #endif
 
+#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
+	int				non_block_count;
+#endif
+
 #ifdef CONFIG_TRACE_IRQFLAGS
 	unsigned int			irq_events;
 	unsigned long			hardirq_enable_ip;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 45dceec209f4..0d01c7994a9a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3752,13 +3752,22 @@ static noinline void __schedule_bug(struct task_struct *prev)
 /*
  * Various schedule()-time debugging checks and statistics:
  */
-static inline void schedule_debug(struct task_struct *prev)
+static inline void schedule_debug(struct task_struct *prev, bool preempt)
 {
 #ifdef CONFIG_SCHED_STACK_END_CHECK
 	if (task_stack_end_corrupted(prev))
 		panic("corrupted stack end detected inside scheduler\n");
 #endif
 
+#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
+	if (!preempt && prev->state && prev->non_block_count) {
+		printk(KERN_ERR "BUG: scheduling in a non-blocking section: %s/%d/%i\n",
+			prev->comm, prev->pid, prev->non_block_count);
+		dump_stack();
+		add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
+	}
+#endif
+
 	if (unlikely(in_atomic_preempt_off())) {
 		__schedule_bug(prev);
 		preempt_count_set(PREEMPT_DISABLED);
@@ -3870,7 +3879,7 @@ static void __sched notrace __schedule(bool preempt)
 	rq = cpu_rq(cpu);
 	prev = rq->curr;
 
-	schedule_debug(prev);
+	schedule_debug(prev, preempt);
 
 	if (sched_feat(HRTICK))
 		hrtick_clear(rq);
@@ -6641,7 +6650,7 @@ void ___might_sleep(const char *file, int line, int preempt_offset)
 	rcu_sleep_check();
 
 	if ((preempt_count_equals(preempt_offset) && !irqs_disabled() &&
-	     !is_idle_task(current)) ||
+	     !is_idle_task(current) && !current->non_block_count) ||
 	    system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING ||
 	    oops_in_progress)
 		return;
@@ -6657,8 +6666,8 @@ void ___might_sleep(const char *file, int line, int preempt_offset)
 		"BUG: sleeping function called from invalid context at %s:%d\n",
 			file, line);
 	printk(KERN_ERR
-		"in_atomic(): %d, irqs_disabled(): %d, pid: %d, name: %s\n",
-			in_atomic(), irqs_disabled(),
+		"in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n",
+			in_atomic(), irqs_disabled(), current->non_block_count,
 			current->pid, current->comm);
 
 	if (task_stack_end_corrupted(current))
-- 
2.23.0



* [PATCH 4/5] mm, notifier: Catch sleeping/blocking for !blockable
  2019-08-26 20:14 [PATCH 0/5] mmu notifier debug annotations Daniel Vetter
                   ` (2 preceding siblings ...)
  2019-08-26 20:14 ` [PATCH 3/5] kernel.h: Add non_block_start/end() Daniel Vetter
@ 2019-08-26 20:14 ` Daniel Vetter
  2019-08-26 20:14 ` [PATCH 5/5] mm, notifier: annotate with might_sleep() Daniel Vetter
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Daniel Vetter @ 2019-08-26 20:14 UTC (permalink / raw)
  To: LKML
  Cc: Linux MM, DRI Development, Daniel Vetter, Jason Gunthorpe,
	Andrew Morton, Michal Hocko, David Rientjes,
	Christian König, Jérôme Glisse, Daniel Vetter

We need to make sure implementations don't cheat and don't have a
possible schedule/blocking point deeply buried where review can't
catch it.

I'm not sure whether this is the best way to make sure all the
might_sleep() callsites trigger, and it's a bit ugly in the code flow.
But it gets the job done.

Inspired by an i915 patch series which did exactly that, because the
rules haven't been entirely clear to us.
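The wrapping pattern looks like this in a userspace sketch (purely
illustrative; `struct range`, `range_blockable()` and the recording
callback are invented stand-ins for the mmu_notifier structures): the
non-block annotations bracket the callback only when the range is not
blockable, and balance either way.

```c
/* Toy model of conditionally wrapping a notifier callback in
 * non-block annotations, mirroring __mmu_notifier_invalidate_range_start(). */
#include <assert.h>

static int non_block_count;
static int ran_non_blocking;	/* did the callback run in a non-block section? */

struct range { int blockable; };

static int range_blockable(const struct range *r)
{
	return r->blockable;
}

static int notifier_cb(void)
{
	ran_non_blocking = (non_block_count > 0);
	return 0;
}

static int invalidate_range_start(struct range *r)
{
	int ret;

	if (!range_blockable(r))
		non_block_count++;	/* non_block_start() */
	ret = notifier_cb();
	if (!range_blockable(r))
		non_block_count--;	/* non_block_end() */
	return ret;
}
```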

v2: Use the shiny new non_block_start/end annotations instead of
abusing preempt_disable/enable.

v3: Rebase on top of Glisse's arg rework.

v4: Rebase on top of more Glisse rework.

v5: Also annotate invalidate_range_end in the same style. I hope I got
Jason's request for this right.

Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: linux-mm@kvack.org
Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
Reviewed-by: Jérôme Glisse <jglisse@redhat.com> (v4)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 mm/mmu_notifier.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 0523555933c9..b17f3fd3779b 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -181,7 +181,13 @@ int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
 	id = srcu_read_lock(&srcu);
 	hlist_for_each_entry_rcu(mn, &range->mm->mmu_notifier_mm->list, hlist) {
 		if (mn->ops->invalidate_range_start) {
-			int _ret = mn->ops->invalidate_range_start(mn, range);
+			int _ret;
+
+			if (!mmu_notifier_range_blockable(range))
+				non_block_start();
+			_ret = mn->ops->invalidate_range_start(mn, range);
+			if (!mmu_notifier_range_blockable(range))
+				non_block_end();
 			if (_ret) {
 				pr_info("%pS callback failed with %d in %sblockable context.\n",
 					mn->ops->invalidate_range_start, _ret,
@@ -224,8 +230,13 @@ void __mmu_notifier_invalidate_range_end(struct mmu_notifier_range *range,
 			mn->ops->invalidate_range(mn, range->mm,
 						  range->start,
 						  range->end);
-		if (mn->ops->invalidate_range_end)
+		if (mn->ops->invalidate_range_end) {
+			if (!mmu_notifier_range_blockable(range))
+				non_block_start();
 			mn->ops->invalidate_range_end(mn, range);
+			if (!mmu_notifier_range_blockable(range))
+				non_block_end();
+		}
 	}
 	srcu_read_unlock(&srcu, id);
 	lock_map_release(&__mmu_notifier_invalidate_range_start_map);
-- 
2.23.0



* [PATCH 5/5] mm, notifier: annotate with might_sleep()
  2019-08-26 20:14 [PATCH 0/5] mmu notifier debug annotations Daniel Vetter
                   ` (3 preceding siblings ...)
  2019-08-26 20:14 ` [PATCH 4/5] mm, notifier: Catch sleeping/blocking for !blockable Daniel Vetter
@ 2019-08-26 20:14 ` Daniel Vetter
  2019-08-27 23:04 ` [PATCH 0/5] mmu notifer debug annotations Jason Gunthorpe
  2019-09-05 14:49 ` Jason Gunthorpe
  6 siblings, 0 replies; 15+ messages in thread
From: Daniel Vetter @ 2019-08-26 20:14 UTC (permalink / raw)
  To: LKML
  Cc: Linux MM, DRI Development, Daniel Vetter, Jason Gunthorpe,
	Andrew Morton, Michal Hocko, David Rientjes,
	Christian König, Jérôme Glisse, Daniel Vetter

Since mmu notifiers don't exist for most processes, but could block in
interesting places, add some annotations. This should help make sure
core mm keeps up its end of the mmu notifier contract.

The checks here are outside of all notifier checks because of that.
They compile away without CONFIG_DEBUG_ATOMIC_SLEEP.
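The compile-away property can be sketched in userspace C (illustrative
only; `my_might_sleep()`, the config macro and the flags below are
invented for the example, not the kernel's implementation): with the
debug option off, the annotation expands to nothing and costs nothing.

```c
/* Toy model of a debug annotation that vanishes in non-debug builds,
 * like might_sleep() under CONFIG_DEBUG_ATOMIC_SLEEP. */
#include <assert.h>

#define MY_DEBUG_ATOMIC_SLEEP 1	/* flip to 0 for a "release" build */

static int atomic_depth;	/* >0 means sleeping is forbidden here */
static int warned;

#if MY_DEBUG_ATOMIC_SLEEP
# define my_might_sleep() do { if (atomic_depth) warned = 1; } while (0)
#else
# define my_might_sleep() do { } while (0)	/* no code, no cost */
#endif

static void caller_in_atomic(void)
{
	atomic_depth++;
	my_might_sleep();	/* flags the bug only in debug builds */
	atomic_depth--;
}
```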

Suggested by Jason.

Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: linux-mm@kvack.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 include/linux/mmu_notifier.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 3f9829a1f32e..8b71813417e7 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -345,6 +345,8 @@ static inline void mmu_notifier_change_pte(struct mm_struct *mm,
 static inline void
 mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
 {
+	might_sleep();
+
 	lock_map_acquire(&__mmu_notifier_invalidate_range_start_map);
 	if (mm_has_notifiers(range->mm)) {
 		range->flags |= MMU_NOTIFIER_RANGE_BLOCKABLE;
@@ -368,6 +370,9 @@ mmu_notifier_invalidate_range_start_nonblock(struct mmu_notifier_range *range)
 static inline void
 mmu_notifier_invalidate_range_end(struct mmu_notifier_range *range)
 {
+	if (mmu_notifier_range_blockable(range))
+		might_sleep();
+
 	if (mm_has_notifiers(range->mm))
 		__mmu_notifier_invalidate_range_end(range, false);
 }
-- 
2.23.0



* Re: [PATCH 3/5] kernel.h: Add non_block_start/end()
  2019-08-26 20:14 ` [PATCH 3/5] kernel.h: Add non_block_start/end() Daniel Vetter
@ 2019-08-27 22:50   ` Jason Gunthorpe
  2019-08-28 18:33     ` Daniel Vetter
  2019-08-28 11:43   ` Michal Hocko
  1 sibling, 1 reply; 15+ messages in thread
From: Jason Gunthorpe @ 2019-08-27 22:50 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: LKML, Linux MM, DRI Development, Peter Zijlstra, Ingo Molnar,
	Andrew Morton, Michal Hocko, David Rientjes,
	Christian König, Jérôme Glisse, Masahiro Yamada,
	Wei Wang, Andy Shevchenko, Thomas Gleixner, Jann Horn, Feng Tang,
	Kees Cook, Randy Dunlap, Daniel Vetter

> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index 4fa360a13c1e..82f84cfe372f 100644
> +++ b/include/linux/kernel.h
> @@ -217,7 +217,9 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
>   * might_sleep - annotation for functions that can sleep
>   *
>   * this macro will print a stack trace if it is executed in an atomic
> - * context (spinlock, irq-handler, ...).
> + * context (spinlock, irq-handler, ...). Additional sections where blocking is
> + * not allowed can be annotated with non_block_start() and non_block_end()
> + * pairs.
>   *
>   * This is a useful debugging help to be able to catch problems early and not
>   * be bitten later when the calling function happens to sleep when it is not
> @@ -233,6 +235,25 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
>  # define cant_sleep() \
>  	do { __cant_sleep(__FILE__, __LINE__, 0); } while (0)
>  # define sched_annotate_sleep()	(current->task_state_change = 0)
> +/**
> + * non_block_start - annotate the start of section where sleeping is prohibited
> + *
> + * This is on behalf of the oom reaper, specifically when it is calling the mmu
> + * notifiers. The problem is that if the notifier were to block on, for example,
> + * mutex_lock() and if the process which holds that mutex were to perform a
> + * sleeping memory allocation, the oom reaper is now blocked on completion of
> + * that memory allocation. Other blocking calls like wait_event() pose similar
> + * issues.
> + */
> +# define non_block_start() \
> +	do { current->non_block_count++; } while (0)
> +/**
> + * non_block_end - annotate the end of section where sleeping is prohibited
> + *
> + * Closes a section opened by non_block_start().
> + */
> +# define non_block_end() \
> +	do { WARN_ON(current->non_block_count-- == 0); } while (0)

checkpatch does not like these, and I agree

#101: FILE: include/linux/kernel.h:248:
+# define non_block_start() \
+	do { current->non_block_count++; } while (0)

/tmp/tmp1spfxufy/0006-kernel-h-Add-non_block_start-end-.patch:108: WARNING: Single statement macros should not use a do {} while (0) loop
#108: FILE: include/linux/kernel.h:255:
+# define non_block_end() \
+	do { WARN_ON(current->non_block_count-- == 0); } while (0)

Please use a static inline?
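For illustration, the static-inline form checkpatch prefers could look
roughly like this userspace sketch (hypothetical signatures; the real
helpers would live on current->non_block_count and use WARN_ON):

```c
/* Sketch of static inline replacements for the single-statement
 * do { } while (0) macros that checkpatch complained about. */
#include <assert.h>

static int non_block_count;	/* stands in for current->non_block_count */
static int warned;		/* stands in for a WARN_ON() firing */

static inline void non_block_start(void)
{
	non_block_count++;
}

static inline void non_block_end(void)
{
	if (non_block_count-- == 0)	/* post-decrement, like the macro */
		warned = 1;
}
```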

Also, can we get one more ack on this patch?

Jason


* Re: [PATCH 0/5] mmu notifier debug annotations
  2019-08-26 20:14 [PATCH 0/5] mmu notifier debug annotations Daniel Vetter
                   ` (4 preceding siblings ...)
  2019-08-26 20:14 ` [PATCH 5/5] mm, notifier: annotate with might_sleep() Daniel Vetter
@ 2019-08-27 23:04 ` Jason Gunthorpe
  2019-09-05 14:49 ` Jason Gunthorpe
  6 siblings, 0 replies; 15+ messages in thread
From: Jason Gunthorpe @ 2019-08-27 23:04 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: LKML, Linux MM, DRI Development

On Mon, Aug 26, 2019 at 10:14:20PM +0200, Daniel Vetter wrote:
> Hi all,
> 
> Next round. Changes:
> 
> - I kept the two lockdep annotations patches since when I rebased this
>   before retesting linux-next didn't yet have them. Otherwise unchanged
>   except for a trivial conflict.
> 
> - Ack from Peter Z. on the kernel.h patch.
> 
> - Added annotations for non_block to invalidate_range_end. I can't test
>   that readily since i915 doesn't use it.
> 
> - Added might_sleep annotations to also make sure the mm side keeps up
>   it's side of the contract here around what's allowed and what's not.
> 
> Comments, feedback, review as usual very much appreciated.
> 
> 
> Daniel Vetter (5):
>   mm, notifier: Add a lockdep map for invalidate_range_start/end
>   mm, notifier: Prime lockdep
>   mm, notifier: annotate with might_sleep()

I took these ones to hmm.git as they have a small conflict with hmm's
changes.

>   kernel.h: Add non_block_start/end()
>   mm, notifier: Catch sleeping/blocking for !blockable

Let's see about the checkpatch warning and review on these two please

Thanks,
Jason


* Re: [PATCH 3/5] kernel.h: Add non_block_start/end()
  2019-08-26 20:14 ` [PATCH 3/5] kernel.h: Add non_block_start/end() Daniel Vetter
  2019-08-27 22:50   ` Jason Gunthorpe
@ 2019-08-28 11:43   ` Michal Hocko
  1 sibling, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2019-08-28 11:43 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: LKML, Linux MM, DRI Development, Jason Gunthorpe, Peter Zijlstra,
	Ingo Molnar, Andrew Morton, David Rientjes, Christian König,
	Jérôme Glisse, Masahiro Yamada, Wei Wang,
	Andy Shevchenko, Thomas Gleixner, Jann Horn, Feng Tang,
	Kees Cook, Randy Dunlap, Daniel Vetter

On Mon 26-08-19 22:14:23, Daniel Vetter wrote:
> In some special cases we must not block, but there's not a
> spinlock, preempt-off, irqs-off or similar critical section already
> that arms the might_sleep() debug checks. Add a non_block_start/end()
> pair to annotate these.
> 
> This will be used in the oom paths of mmu-notifiers, where blocking is
> not allowed to make sure there's forward progress. Quoting Michal:
> 
> "The notifier is called from quite a restricted context - oom_reaper -
> which shouldn't depend on any locks or sleepable conditionals. The code
> should be swift as well but we mostly do care about it to make a forward
> progress. Checking for sleepable context is the best thing we could come
> up with that would describe these demands at least partially."
> 
> Peter also asked whether we want to catch spinlocks on top, but Michal
> said those are less of a problem because spinlocks can't have an
> indirect dependency upon the page allocator and hence close the loop
> with the oom reaper.
> 
> Suggested by Michal Hocko.
> 
> v2:
> - Improve commit message (Michal)
> - Also check in schedule, not just might_sleep (Peter)
> 
> v3: It works better when I actually squash in the fixup I had lying
> around :-/
> 
> v4: Pick the suggestion from Andrew Morton to give non_block_start/end
> some good kerneldoc comments. I added that other blocking calls like
> wait_event pose similar issues, since that's the other example we
> discussed.
> 
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: "Jérôme Glisse" <jglisse@redhat.com>
> Cc: linux-mm@kvack.org
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> Cc: Wei Wang <wvw@google.com>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Jann Horn <jannh@google.com>
> Cc: Feng Tang <feng.tang@intel.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: linux-kernel@vger.kernel.org
> Acked-by: Christian König <christian.koenig@amd.com> (v1)
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

Acked-by: Michal Hocko <mhocko@suse.com>

Thanks and sorry for being mostly silent/slow in discussions here.
ETOOBUSY.

> ---
>  include/linux/kernel.h | 25 ++++++++++++++++++++++++-
>  include/linux/sched.h  |  4 ++++
>  kernel/sched/core.c    | 19 ++++++++++++++-----
>  3 files changed, 42 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index 4fa360a13c1e..82f84cfe372f 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -217,7 +217,9 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
>   * might_sleep - annotation for functions that can sleep
>   *
>   * this macro will print a stack trace if it is executed in an atomic
> - * context (spinlock, irq-handler, ...).
> + * context (spinlock, irq-handler, ...). Additional sections where blocking is
> + * not allowed can be annotated with non_block_start() and non_block_end()
> + * pairs.
>   *
>   * This is a useful debugging help to be able to catch problems early and not
>   * be bitten later when the calling function happens to sleep when it is not
> @@ -233,6 +235,25 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
>  # define cant_sleep() \
>  	do { __cant_sleep(__FILE__, __LINE__, 0); } while (0)
>  # define sched_annotate_sleep()	(current->task_state_change = 0)
> +/**
> + * non_block_start - annotate the start of section where sleeping is prohibited
> + *
> + * This is on behalf of the oom reaper, specifically when it is calling the mmu
> + * notifiers. The problem is that if the notifier were to block on, for example,
> + * mutex_lock() and if the process which holds that mutex were to perform a
> + * sleeping memory allocation, the oom reaper is now blocked on completion of
> + * that memory allocation. Other blocking calls like wait_event() pose similar
> + * issues.
> + */
> +# define non_block_start() \
> +	do { current->non_block_count++; } while (0)
> +/**
> + * non_block_end - annotate the end of section where sleeping is prohibited
> + *
> + * Closes a section opened by non_block_start().
> + */
> +# define non_block_end() \
> +	do { WARN_ON(current->non_block_count-- == 0); } while (0)
>  #else
>    static inline void ___might_sleep(const char *file, int line,
>  				   int preempt_offset) { }
> @@ -241,6 +262,8 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
>  # define might_sleep() do { might_resched(); } while (0)
>  # define cant_sleep() do { } while (0)
>  # define sched_annotate_sleep() do { } while (0)
> +# define non_block_start() do { } while (0)
> +# define non_block_end() do { } while (0)
>  #endif
>  
>  #define might_sleep_if(cond) do { if (cond) might_sleep(); } while (0)
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index b6ec130dff9b..e8bb965f5019 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -980,6 +980,10 @@ struct task_struct {
>  	struct mutex_waiter		*blocked_on;
>  #endif
>  
> +#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
> +	int				non_block_count;
> +#endif
> +
>  #ifdef CONFIG_TRACE_IRQFLAGS
>  	unsigned int			irq_events;
>  	unsigned long			hardirq_enable_ip;
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 45dceec209f4..0d01c7994a9a 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3752,13 +3752,22 @@ static noinline void __schedule_bug(struct task_struct *prev)
>  /*
>   * Various schedule()-time debugging checks and statistics:
>   */
> -static inline void schedule_debug(struct task_struct *prev)
> +static inline void schedule_debug(struct task_struct *prev, bool preempt)
>  {
>  #ifdef CONFIG_SCHED_STACK_END_CHECK
>  	if (task_stack_end_corrupted(prev))
>  		panic("corrupted stack end detected inside scheduler\n");
>  #endif
>  
> +#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
> +	if (!preempt && prev->state && prev->non_block_count) {
> +		printk(KERN_ERR "BUG: scheduling in a non-blocking section: %s/%d/%i\n",
> +			prev->comm, prev->pid, prev->non_block_count);
> +		dump_stack();
> +		add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
> +	}
> +#endif
> +
>  	if (unlikely(in_atomic_preempt_off())) {
>  		__schedule_bug(prev);
>  		preempt_count_set(PREEMPT_DISABLED);
> @@ -3870,7 +3879,7 @@ static void __sched notrace __schedule(bool preempt)
>  	rq = cpu_rq(cpu);
>  	prev = rq->curr;
>  
> -	schedule_debug(prev);
> +	schedule_debug(prev, preempt);
>  
>  	if (sched_feat(HRTICK))
>  		hrtick_clear(rq);
> @@ -6641,7 +6650,7 @@ void ___might_sleep(const char *file, int line, int preempt_offset)
>  	rcu_sleep_check();
>  
>  	if ((preempt_count_equals(preempt_offset) && !irqs_disabled() &&
> -	     !is_idle_task(current)) ||
> +	     !is_idle_task(current) && !current->non_block_count) ||
>  	    system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING ||
>  	    oops_in_progress)
>  		return;
> @@ -6657,8 +6666,8 @@ void ___might_sleep(const char *file, int line, int preempt_offset)
>  		"BUG: sleeping function called from invalid context at %s:%d\n",
>  			file, line);
>  	printk(KERN_ERR
> -		"in_atomic(): %d, irqs_disabled(): %d, pid: %d, name: %s\n",
> -			in_atomic(), irqs_disabled(),
> +		"in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n",
> +			in_atomic(), irqs_disabled(), current->non_block_count,
>  			current->pid, current->comm);
>  
>  	if (task_stack_end_corrupted(current))
> -- 
> 2.23.0
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] kernel.h: Add non_block_start/end()
  2019-08-27 22:50   ` Jason Gunthorpe
@ 2019-08-28 18:33     ` Daniel Vetter
  2019-08-28 18:43       ` Jason Gunthorpe
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Vetter @ 2019-08-28 18:33 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: LKML, Linux MM, DRI Development, Peter Zijlstra, Ingo Molnar,
	Andrew Morton, Michal Hocko, David Rientjes,
	Christian König, Jérôme Glisse, Masahiro Yamada,
	Wei Wang, Andy Shevchenko, Thomas Gleixner, Jann Horn, Feng Tang,
	Kees Cook, Randy Dunlap, Daniel Vetter

On Wed, Aug 28, 2019 at 12:50 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > index 4fa360a13c1e..82f84cfe372f 100644
> > +++ b/include/linux/kernel.h
> > @@ -217,7 +217,9 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
> >   * might_sleep - annotation for functions that can sleep
> >   *
> >   * this macro will print a stack trace if it is executed in an atomic
> > - * context (spinlock, irq-handler, ...).
> > + * context (spinlock, irq-handler, ...). Additional sections where blocking is
> > + * not allowed can be annotated with non_block_start() and non_block_end()
> > + * pairs.
> >   *
> >   * This is a useful debugging help to be able to catch problems early and not
> >   * be bitten later when the calling function happens to sleep when it is not
> > @@ -233,6 +235,25 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
> >  # define cant_sleep() \
> >       do { __cant_sleep(__FILE__, __LINE__, 0); } while (0)
> >  # define sched_annotate_sleep()      (current->task_state_change = 0)
> > +/**
> > + * non_block_start - annotate the start of section where sleeping is prohibited
> > + *
> > + * This is on behalf of the oom reaper, specifically when it is calling the mmu
> > + * notifiers. The problem is that if the notifier were to block on, for example,
> > + * mutex_lock() and if the process which holds that mutex were to perform a
> > + * sleeping memory allocation, the oom reaper is now blocked on completion of
> > + * that memory allocation. Other blocking calls like wait_event() pose similar
> > + * issues.
> > + */
> > +# define non_block_start() \
> > +     do { current->non_block_count++; } while (0)
> > +/**
> > + * non_block_end - annotate the end of section where sleeping is prohibited
> > + *
> > + * Closes a section opened by non_block_start().
> > + */
> > +# define non_block_end() \
> > +     do { WARN_ON(current->non_block_count-- == 0); } while (0)
>
> check-patch does not like these, and I agree
>
> #101: FILE: include/linux/kernel.h:248:
> +# define non_block_start() \
> +       do { current->non_block_count++; } while (0)
>
> /tmp/tmp1spfxufy/0006-kernel-h-Add-non_block_start-end-.patch:108: WARNING: Single statement macros should not use a do {} while (0) loop
> #108: FILE: include/linux/kernel.h:255:
> +# define non_block_end() \
> +       do { WARN_ON(current->non_block_count-- == 0); } while (0)
>
> Please use a static inline?

We need get_current() plus the task_struct, so this gets real messy
real fast. Not even sure which header this would fit in, or whether
I'd need to create a new one. Are you insisting on this, or is
respinning with the do { } while (0) dropped ok?

Thanks, Daniel

> Also, can we get one more ack on this patch?
>
> Jason



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH 3/5] kernel.h: Add non_block_start/end()
  2019-08-28 18:33     ` Daniel Vetter
@ 2019-08-28 18:43       ` Jason Gunthorpe
  2019-08-28 18:56         ` Daniel Vetter
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Gunthorpe @ 2019-08-28 18:43 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: LKML, Linux MM, DRI Development, Peter Zijlstra, Ingo Molnar,
	Andrew Morton, Michal Hocko, David Rientjes,
	Christian König, Jérôme Glisse, Masahiro Yamada,
	Wei Wang, Andy Shevchenko, Thomas Gleixner, Jann Horn, Feng Tang,
	Kees Cook, Randy Dunlap, Daniel Vetter

On Wed, Aug 28, 2019 at 08:33:13PM +0200, Daniel Vetter wrote:
> On Wed, Aug 28, 2019 at 12:50 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> >
> > > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > > index 4fa360a13c1e..82f84cfe372f 100644
> > > +++ b/include/linux/kernel.h
> > > @@ -217,7 +217,9 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
> > >   * might_sleep - annotation for functions that can sleep
> > >   *
> > >   * this macro will print a stack trace if it is executed in an atomic
> > > - * context (spinlock, irq-handler, ...).
> > > + * context (spinlock, irq-handler, ...). Additional sections where blocking is
> > > + * not allowed can be annotated with non_block_start() and non_block_end()
> > > + * pairs.
> > >   *
> > >   * This is a useful debugging help to be able to catch problems early and not
> > >   * be bitten later when the calling function happens to sleep when it is not
> > > @@ -233,6 +235,25 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
> > >  # define cant_sleep() \
> > >       do { __cant_sleep(__FILE__, __LINE__, 0); } while (0)
> > >  # define sched_annotate_sleep()      (current->task_state_change = 0)
> > > +/**
> > > + * non_block_start - annotate the start of section where sleeping is prohibited
> > > + *
> > > + * This is on behalf of the oom reaper, specifically when it is calling the mmu
> > > + * notifiers. The problem is that if the notifier were to block on, for example,
> > > + * mutex_lock() and if the process which holds that mutex were to perform a
> > > + * sleeping memory allocation, the oom reaper is now blocked on completion of
> > > + * that memory allocation. Other blocking calls like wait_event() pose similar
> > > + * issues.
> > > + */
> > > +# define non_block_start() \
> > > +     do { current->non_block_count++; } while (0)
> > > +/**
> > > + * non_block_end - annotate the end of section where sleeping is prohibited
> > > + *
> > > + * Closes a section opened by non_block_start().
> > > + */
> > > +# define non_block_end() \
> > > +     do { WARN_ON(current->non_block_count-- == 0); } while (0)
> >
> > check-patch does not like these, and I agree
> >
> > #101: FILE: include/linux/kernel.h:248:
> > +# define non_block_start() \
> > +       do { current->non_block_count++; } while (0)
> >
> > /tmp/tmp1spfxufy/0006-kernel-h-Add-non_block_start-end-.patch:108: WARNING: Single statement macros should not use a do {} while (0) loop
> > #108: FILE: include/linux/kernel.h:255:
> > +# define non_block_end() \
> > +       do { WARN_ON(current->non_block_count-- == 0); } while (0)
> >
> > Please use a static inline?
> 
> We need get_current() plus the task_struct, so this gets real messy
> real fast. Not even sure which header this would fit in, or whether
> I'd need to create a new one. Are you insisting on this, or is
> respinning with the do { } while (0) dropped ok?

My preference is always a static inline, but if the headers are so
twisty we need to use #define to solve a missing include, then I
wouldn't insist on it.

If dropping the do while is the only change then I can edit it in.
I think we have the acks now.

Jason

* Re: [PATCH 3/5] kernel.h: Add non_block_start/end()
  2019-08-28 18:43       ` Jason Gunthorpe
@ 2019-08-28 18:56         ` Daniel Vetter
  2019-09-03  7:28           ` Daniel Vetter
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Vetter @ 2019-08-28 18:56 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: LKML, Linux MM, DRI Development, Peter Zijlstra, Ingo Molnar,
	Andrew Morton, Michal Hocko, David Rientjes,
	Christian König, Jérôme Glisse, Masahiro Yamada,
	Wei Wang, Andy Shevchenko, Thomas Gleixner, Jann Horn, Feng Tang,
	Kees Cook, Randy Dunlap, Daniel Vetter

On Wed, Aug 28, 2019 at 8:43 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> On Wed, Aug 28, 2019 at 08:33:13PM +0200, Daniel Vetter wrote:
> > On Wed, Aug 28, 2019 at 12:50 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > >
> > > > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > > > index 4fa360a13c1e..82f84cfe372f 100644
> > > > +++ b/include/linux/kernel.h
> > > > @@ -217,7 +217,9 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
> > > >   * might_sleep - annotation for functions that can sleep
> > > >   *
> > > >   * this macro will print a stack trace if it is executed in an atomic
> > > > - * context (spinlock, irq-handler, ...).
> > > > + * context (spinlock, irq-handler, ...). Additional sections where blocking is
> > > > + * not allowed can be annotated with non_block_start() and non_block_end()
> > > > + * pairs.
> > > >   *
> > > >   * This is a useful debugging help to be able to catch problems early and not
> > > >   * be bitten later when the calling function happens to sleep when it is not
> > > > @@ -233,6 +235,25 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
> > > >  # define cant_sleep() \
> > > >       do { __cant_sleep(__FILE__, __LINE__, 0); } while (0)
> > > >  # define sched_annotate_sleep()      (current->task_state_change = 0)
> > > > +/**
> > > > + * non_block_start - annotate the start of section where sleeping is prohibited
> > > > + *
> > > > + * This is on behalf of the oom reaper, specifically when it is calling the mmu
> > > > + * notifiers. The problem is that if the notifier were to block on, for example,
> > > > + * mutex_lock() and if the process which holds that mutex were to perform a
> > > > + * sleeping memory allocation, the oom reaper is now blocked on completion of
> > > > + * that memory allocation. Other blocking calls like wait_event() pose similar
> > > > + * issues.
> > > > + */
> > > > +# define non_block_start() \
> > > > +     do { current->non_block_count++; } while (0)
> > > > +/**
> > > > + * non_block_end - annotate the end of section where sleeping is prohibited
> > > > + *
> > > > + * Closes a section opened by non_block_start().
> > > > + */
> > > > +# define non_block_end() \
> > > > +     do { WARN_ON(current->non_block_count-- == 0); } while (0)
> > >
> > > check-patch does not like these, and I agree
> > >
> > > #101: FILE: include/linux/kernel.h:248:
> > > +# define non_block_start() \
> > > +       do { current->non_block_count++; } while (0)
> > >
> > > /tmp/tmp1spfxufy/0006-kernel-h-Add-non_block_start-end-.patch:108: WARNING: Single statement macros should not use a do {} while (0) loop
> > > #108: FILE: include/linux/kernel.h:255:
> > > +# define non_block_end() \
> > > +       do { WARN_ON(current->non_block_count-- == 0); } while (0)
> > >
> > > Please use a static inline?
> >
> > We need get_current() plus the task_struct, so this gets real messy
> > real fast. Not even sure which header this would fit in, or whether
> > I'd need to create a new one. Are you insisting on this, or is
> > respinning with the do { } while (0) dropped ok?
>
> My preference is always a static inline, but if the headers are so
> twisty we need to use #define to solve a missing include, then I
> wouldn't insist on it.

Cleanest would be a new header I guess, together with might_sleep().
But moving that is a bit much I think; there are almost 500 callers of
that one from a quick git grep.

> If dropping the do while is the only change then I can edit it in.
> I think we have the acks now.

Yeah sounds simplest, thanks.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH 3/5] kernel.h: Add non_block_start/end()
  2019-08-28 18:56         ` Daniel Vetter
@ 2019-09-03  7:28           ` Daniel Vetter
  2019-09-03  7:36             ` Jason Gunthorpe
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Vetter @ 2019-09-03  7:28 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: LKML, Linux MM, DRI Development, Peter Zijlstra, Ingo Molnar,
	Andrew Morton, Michal Hocko, David Rientjes,
	Christian König, Jérôme Glisse, Masahiro Yamada,
	Wei Wang, Andy Shevchenko, Thomas Gleixner, Jann Horn, Feng Tang,
	Kees Cook, Randy Dunlap, Daniel Vetter

On Wed, Aug 28, 2019 at 8:56 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> On Wed, Aug 28, 2019 at 8:43 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > On Wed, Aug 28, 2019 at 08:33:13PM +0200, Daniel Vetter wrote:
> > > On Wed, Aug 28, 2019 at 12:50 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > > >
> > > > > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > > > > index 4fa360a13c1e..82f84cfe372f 100644
> > > > > +++ b/include/linux/kernel.h
> > > > > @@ -217,7 +217,9 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
> > > > >   * might_sleep - annotation for functions that can sleep
> > > > >   *
> > > > >   * this macro will print a stack trace if it is executed in an atomic
> > > > > - * context (spinlock, irq-handler, ...).
> > > > > + * context (spinlock, irq-handler, ...). Additional sections where blocking is
> > > > > + * not allowed can be annotated with non_block_start() and non_block_end()
> > > > > + * pairs.
> > > > >   *
> > > > >   * This is a useful debugging help to be able to catch problems early and not
> > > > >   * be bitten later when the calling function happens to sleep when it is not
> > > > > @@ -233,6 +235,25 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
> > > > >  # define cant_sleep() \
> > > > >       do { __cant_sleep(__FILE__, __LINE__, 0); } while (0)
> > > > >  # define sched_annotate_sleep()      (current->task_state_change = 0)
> > > > > +/**
> > > > > + * non_block_start - annotate the start of section where sleeping is prohibited
> > > > > + *
> > > > > + * This is on behalf of the oom reaper, specifically when it is calling the mmu
> > > > > + * notifiers. The problem is that if the notifier were to block on, for example,
> > > > > + * mutex_lock() and if the process which holds that mutex were to perform a
> > > > > + * sleeping memory allocation, the oom reaper is now blocked on completion of
> > > > > + * that memory allocation. Other blocking calls like wait_event() pose similar
> > > > > + * issues.
> > > > > + */
> > > > > +# define non_block_start() \
> > > > > +     do { current->non_block_count++; } while (0)
> > > > > +/**
> > > > > + * non_block_end - annotate the end of section where sleeping is prohibited
> > > > > + *
> > > > > + * Closes a section opened by non_block_start().
> > > > > + */
> > > > > +# define non_block_end() \
> > > > > +     do { WARN_ON(current->non_block_count-- == 0); } while (0)
> > > >
> > > > check-patch does not like these, and I agree
> > > >
> > > > #101: FILE: include/linux/kernel.h:248:
> > > > +# define non_block_start() \
> > > > +       do { current->non_block_count++; } while (0)
> > > >
> > > > /tmp/tmp1spfxufy/0006-kernel-h-Add-non_block_start-end-.patch:108: WARNING: Single statement macros should not use a do {} while (0) loop
> > > > #108: FILE: include/linux/kernel.h:255:
> > > > +# define non_block_end() \
> > > > +       do { WARN_ON(current->non_block_count-- == 0); } while (0)
> > > >
> > > > Please use a static inline?
> > >
> > > We need get_current() plus the task_struct, so this gets real messy
> > > real fast. Not even sure which header this would fit in, or whether
> > > I'd need to create a new one. Are you insisting on this, or is
> > > respinning with the do { } while (0) dropped ok?
> >
> > My preference is always a static inline, but if the headers are so
> > twisty we need to use #define to solve a missing include, then I
> > wouldn't insist on it.
>
> Cleanest would be a new header I guess, together with might_sleep().
> But moving that is a bit much I think; there are almost 500 callers of
> that one from a quick git grep.
>
> > If dropping the do while is the only change then I can edit it in.
> > I think we have the acks now.
>
> Yeah sounds simplest, thanks.

Hi Jason,

Do you expect me to resend now, or do you plan to do the patchwork
appeasement when applying? I've seen you merged the other patches
(thanks!), but not these two here.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH 3/5] kernel.h: Add non_block_start/end()
  2019-09-03  7:28           ` Daniel Vetter
@ 2019-09-03  7:36             ` Jason Gunthorpe
  0 siblings, 0 replies; 15+ messages in thread
From: Jason Gunthorpe @ 2019-09-03  7:36 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: LKML, Linux MM, DRI Development, Peter Zijlstra, Ingo Molnar,
	Andrew Morton, Michal Hocko, David Rientjes,
	Christian König, Jérôme Glisse, Masahiro Yamada,
	Wei Wang, Andy Shevchenko, Thomas Gleixner, Jann Horn, Feng Tang,
	Kees Cook, Randy Dunlap, Daniel Vetter

On Tue, Sep 03, 2019 at 09:28:23AM +0200, Daniel Vetter wrote:

> > Cleanest would be a new header I guess, together with might_sleep().
> > But moving that is a bit much I think; there are almost 500 callers of
> > that one from a quick git grep.
> >
> > > If dropping the do while is the only change then I can edit it in.
> > > I think we have the acks now.
> >
> > Yeah sounds simplest, thanks.
> 
> Hi Jason,
> 
> Do you expect me to resend now, or do you plan to do the patchwork
> appeasement when applying? I've seen you merged the other patches
> (thanks!), but not these two here.

Sorry, I didn't get to this before I started travelling, and deferred
it since we were having linux-next related problems with hmm.git. I
hope to do it today.

I will fix it up as promised

Thanks,
Jason

* Re: [PATCH 0/5] mmu notifer debug annotations
  2019-08-26 20:14 [PATCH 0/5] mmu notifer debug annotations Daniel Vetter
                   ` (5 preceding siblings ...)
  2019-08-27 23:04 ` [PATCH 0/5] mmu notifer debug annotations Jason Gunthorpe
@ 2019-09-05 14:49 ` Jason Gunthorpe
  6 siblings, 0 replies; 15+ messages in thread
From: Jason Gunthorpe @ 2019-09-05 14:49 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: LKML, Linux MM, DRI Development

On Mon, Aug 26, 2019 at 10:14:20PM +0200, Daniel Vetter wrote:
> Hi all,
> 
> Next round. Changes:
> 
> - I kept the two lockdep annotations patches since when I rebased this
>   before retesting linux-next didn't yet have them. Otherwise unchanged
>   except for a trivial conflict.
> 
> - Ack from Peter Z. on the kernel.h patch.
> 
> - Added annotations for non_block to invalidate_range_end. I can't test
>   that readily since i915 doesn't use it.
> 
> - Added might_sleep annotations to also make sure the mm side keeps up
>   it's side of the contract here around what's allowed and what's not.
> 
> Comments, feedback, review as usual very much appreciated.
> 
> 
> Daniel Vetter (5):
>   kernel.h: Add non_block_start/end()
>   mm, notifier: Catch sleeping/blocking for !blockable

These two applied to hmm.git, with the small checkpatch edit, thanks!

Jason

Thread overview: 15+ messages
2019-08-26 20:14 [PATCH 0/5] mmu notifer debug annotations Daniel Vetter
2019-08-26 20:14 ` [PATCH 1/5] mm, notifier: Add a lockdep map for invalidate_range_start/end Daniel Vetter
2019-08-26 20:14 ` [PATCH 2/5] mm, notifier: Prime lockdep Daniel Vetter
2019-08-26 20:14 ` [PATCH 3/5] kernel.h: Add non_block_start/end() Daniel Vetter
2019-08-27 22:50   ` Jason Gunthorpe
2019-08-28 18:33     ` Daniel Vetter
2019-08-28 18:43       ` Jason Gunthorpe
2019-08-28 18:56         ` Daniel Vetter
2019-09-03  7:28           ` Daniel Vetter
2019-09-03  7:36             ` Jason Gunthorpe
2019-08-28 11:43   ` Michal Hocko
2019-08-26 20:14 ` [PATCH 4/5] mm, notifier: Catch sleeping/blocking for !blockable Daniel Vetter
2019-08-26 20:14 ` [PATCH 5/5] mm, notifier: annotate with might_sleep() Daniel Vetter
2019-08-27 23:04 ` [PATCH 0/5] mmu notifer debug annotations Jason Gunthorpe
2019-09-05 14:49 ` Jason Gunthorpe
