From: Daniel Vetter <daniel.vetter@ffwll.ch> To: LKML <linux-kernel@vger.kernel.org> Cc: "Linux MM" <linux-mm@kvack.org>, "DRI Development" <dri-devel@lists.freedesktop.org>, "Intel Graphics Development" <intel-gfx@lists.freedesktop.org>, "Daniel Vetter" <daniel.vetter@ffwll.ch>, "Jason Gunthorpe" <jgg@ziepe.ca>, "Peter Zijlstra" <peterz@infradead.org>, "Ingo Molnar" <mingo@redhat.com>, "Andrew Morton" <akpm@linux-foundation.org>, "Michal Hocko" <mhocko@suse.com>, "David Rientjes" <rientjes@google.com>, "Christian König" <christian.koenig@amd.com>, "Jérôme Glisse" <jglisse@redhat.com>, "Masahiro Yamada" <yamada.masahiro@socionext.com>, "Wei Wang" <wvw@google.com>, "Andy Shevchenko" <andriy.shevchenko@linux.intel.com>, "Thomas Gleixner" <tglx@linutronix.de>, "Jann Horn" <jannh@google.com>, "Feng Tang" <feng.tang@intel.com>, "Kees Cook" <keescook@chromium.org>, "Randy Dunlap" <rdunlap@infradead.org>, "Daniel Vetter" <daniel.vetter@intel.com> Subject: [PATCH 3/4] kernel.h: Add non_block_start/end() Date: Tue, 20 Aug 2019 10:19:01 +0200 [thread overview] Message-ID: <20190820081902.24815-4-daniel.vetter@ffwll.ch> (raw) In-Reply-To: <20190820081902.24815-1-daniel.vetter@ffwll.ch> In some special cases we must not block, but there's not a spinlock, preempt-off, irqs-off or similar critical section already that arms the might_sleep() debug checks. Add a non_block_start/end() pair to annotate these. This will be used in the oom paths of mmu-notifiers, where blocking is not allowed to make sure there's forward progress. Quoting Michal: "The notifier is called from quite a restricted context - oom_reaper - which shouldn't depend on any locks or sleepable conditionals. The code should be swift as well but we mostly do care about it to make a forward progress. Checking for sleepable context is the best thing we could come up with that would describe these demands at least partially." Peter also asked whether we want to catch spinlocks on top, but Michal said those are less of a problem because spinlocks can't have an indirect dependency upon the page allocator and hence close the loop with the oom reaper. Suggested by Michal Hocko. v2: - Improve commit message (Michal) - Also check in schedule, not just might_sleep (Peter) v3: It works better when I actually squash in the fixup I had lying around :-/ v4: Pick the suggestion from Andrew Morton to give non_block_start/end some good kerneldoc comments. I added that other blocking calls like wait_event pose similar issues, since that's the other example we discussed. Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: David Rientjes <rientjes@google.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: linux-mm@kvack.org Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Wei Wang <wvw@google.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jann Horn <jannh@google.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-kernel@vger.kernel.org Acked-by: Christian König <christian.koenig@amd.com> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> --- include/linux/kernel.h | 25 ++++++++++++++++++++++++- include/linux/sched.h | 4 ++++ kernel/sched/core.c | 19 ++++++++++++++----- 3 files changed, 42 insertions(+), 6 deletions(-) diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 4fa360a13c1e..82f84cfe372f 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -217,7 +217,9 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset); * might_sleep - annotation for functions that can sleep * * this macro will print a stack trace if it is executed in an atomic - * context (spinlock, irq-handler, ...). + * context (spinlock, irq-handler, ...). Additional sections where blocking is + * not allowed can be annotated with non_block_start() and non_block_end() + * pairs. * * This is a useful debugging help to be able to catch problems early and not * be bitten later when the calling function happens to sleep when it is not @@ -233,6 +235,25 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset); # define cant_sleep() \ do { __cant_sleep(__FILE__, __LINE__, 0); } while (0) # define sched_annotate_sleep() (current->task_state_change = 0) +/** + * non_block_start - annotate the start of section where sleeping is prohibited + * + * This is on behalf of the oom reaper, specifically when it is calling the mmu + * notifiers. The problem is that if the notifier were to block on, for example, + * mutex_lock() and if the process which holds that mutex were to perform a + * sleeping memory allocation, the oom reaper is now blocked on completion of + * that memory allocation. Other blocking calls like wait_event() pose similar + * issues. + */ +# define non_block_start() \ + do { current->non_block_count++; } while (0) +/** + * non_block_end - annotate the end of section where sleeping is prohibited + * + * Closes a section opened by non_block_start(). + */ +# define non_block_end() \ + do { WARN_ON(current->non_block_count-- == 0); } while (0) #else static inline void ___might_sleep(const char *file, int line, int preempt_offset) { } @@ -241,6 +262,8 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset); # define might_sleep() do { might_resched(); } while (0) # define cant_sleep() do { } while (0) # define sched_annotate_sleep() do { } while (0) +# define non_block_start() do { } while (0) +# define non_block_end() do { } while (0) #endif #define might_sleep_if(cond) do { if (cond) might_sleep(); } while (0) diff --git a/include/linux/sched.h b/include/linux/sched.h index 9f51932bd543..c5630f3dca1f 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -974,6 +974,10 @@ struct task_struct { struct mutex_waiter *blocked_on; #endif +#ifdef CONFIG_DEBUG_ATOMIC_SLEEP + int non_block_count; +#endif + #ifdef CONFIG_TRACE_IRQFLAGS unsigned int irq_events; unsigned long hardirq_enable_ip; diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 2b037f195473..57245770d6cc 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -3700,13 +3700,22 @@ static noinline void __schedule_bug(struct task_struct *prev) /* * Various schedule()-time debugging checks and statistics: */ -static inline void schedule_debug(struct task_struct *prev) +static inline void schedule_debug(struct task_struct *prev, bool preempt) { #ifdef CONFIG_SCHED_STACK_END_CHECK if (task_stack_end_corrupted(prev)) panic("corrupted stack end detected inside scheduler\n"); #endif +#ifdef CONFIG_DEBUG_ATOMIC_SLEEP + if (!preempt && prev->state && prev->non_block_count) { + printk(KERN_ERR "BUG: scheduling in a non-blocking section: %s/%d/%i\n", + prev->comm, prev->pid, prev->non_block_count); + dump_stack(); + add_taint(TAINT_WARN, LOCKDEP_STILL_OK); + } +#endif + if (unlikely(in_atomic_preempt_off())) { __schedule_bug(prev); preempt_count_set(PREEMPT_DISABLED); @@ -3813,7 +3822,7 @@ static void __sched notrace __schedule(bool preempt) rq = cpu_rq(cpu); prev = rq->curr; - schedule_debug(prev); + schedule_debug(prev, preempt); if (sched_feat(HRTICK)) hrtick_clear(rq); @@ -6570,7 +6579,7 @@ void ___might_sleep(const char *file, int line, int preempt_offset) rcu_sleep_check(); if ((preempt_count_equals(preempt_offset) && !irqs_disabled() && - !is_idle_task(current)) || + !is_idle_task(current) && !current->non_block_count) || system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING || oops_in_progress) return; @@ -6586,8 +6595,8 @@ void ___might_sleep(const char *file, int line, int preempt_offset) "BUG: sleeping function called from invalid context at %s:%d\n", file, line); printk(KERN_ERR - "in_atomic(): %d, irqs_disabled(): %d, pid: %d, name: %s\n", - in_atomic(), irqs_disabled(), + "in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n", + in_atomic(), irqs_disabled(), current->non_block_count, current->pid, current->comm); if (task_stack_end_corrupted(current)) -- 2.23.0.rc1
WARNING: multiple messages have this Message-ID (diff)
From: Daniel Vetter <daniel.vetter@ffwll.ch> To: LKML <linux-kernel@vger.kernel.org> Cc: "Feng Tang" <feng.tang@intel.com>, "Michal Hocko" <mhocko@suse.com>, "Kees Cook" <keescook@chromium.org>, "Linux MM" <linux-mm@kvack.org>, "Peter Zijlstra" <peterz@infradead.org>, "Daniel Vetter" <daniel.vetter@ffwll.ch>, "Intel Graphics Development" <intel-gfx@lists.freedesktop.org>, "Jann Horn" <jannh@google.com>, "DRI Development" <dri-devel@lists.freedesktop.org>, "Jason Gunthorpe" <jgg@ziepe.ca>, "Jérôme Glisse" <jglisse@redhat.com>, "Ingo Molnar" <mingo@redhat.com>, "Thomas Gleixner" <tglx@linutronix.de>, "Randy Dunlap" <rdunlap@infradead.org>, "David Rientjes" <rientjes@google.com>, "Wei Wang" <wvw@google.com>, "Daniel Vetter" <daniel.vetter@intel.com>, "Andrew Morton" <akpm@linux-foundation.org>, "Andy Shevchenko" <andriy.shevchenko@linux.intel.com>, "Christian König" <christian.koenig@amd.com>, "Masahiro Yamada" <yamada.masahiro@socionext.com> Subject: [PATCH 3/4] kernel.h: Add non_block_start/end() Date: Tue, 20 Aug 2019 10:19:01 +0200 [thread overview] Message-ID: <20190820081902.24815-4-daniel.vetter@ffwll.ch> (raw) In-Reply-To: <20190820081902.24815-1-daniel.vetter@ffwll.ch> In some special cases we must not block, but there's not a spinlock, preempt-off, irqs-off or similar critical section already that arms the might_sleep() debug checks. Add a non_block_start/end() pair to annotate these. This will be used in the oom paths of mmu-notifiers, where blocking is not allowed to make sure there's forward progress. Quoting Michal: "The notifier is called from quite a restricted context - oom_reaper - which shouldn't depend on any locks or sleepable conditionals. The code should be swift as well but we mostly do care about it to make a forward progress. Checking for sleepable context is the best thing we could come up with that would describe these demands at least partially." Peter also asked whether we want to catch spinlocks on top, but Michal said those are less of a problem because spinlocks can't have an indirect dependency upon the page allocator and hence close the loop with the oom reaper. Suggested by Michal Hocko. v2: - Improve commit message (Michal) - Also check in schedule, not just might_sleep (Peter) v3: It works better when I actually squash in the fixup I had lying around :-/ v4: Pick the suggestion from Andrew Morton to give non_block_start/end some good kerneldoc comments. I added that other blocking calls like wait_event pose similar issues, since that's the other example we discussed. Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: David Rientjes <rientjes@google.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: linux-mm@kvack.org Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Wei Wang <wvw@google.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jann Horn <jannh@google.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-kernel@vger.kernel.org Acked-by: Christian König <christian.koenig@amd.com> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> --- include/linux/kernel.h | 25 ++++++++++++++++++++++++- include/linux/sched.h | 4 ++++ kernel/sched/core.c | 19 ++++++++++++++----- 3 files changed, 42 insertions(+), 6 deletions(-) diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 4fa360a13c1e..82f84cfe372f 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -217,7 +217,9 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset); * might_sleep - annotation for functions that can sleep * * this macro will print a stack trace if it is executed in an atomic - * context (spinlock, irq-handler, ...). + * context (spinlock, irq-handler, ...). Additional sections where blocking is + * not allowed can be annotated with non_block_start() and non_block_end() + * pairs. * * This is a useful debugging help to be able to catch problems early and not * be bitten later when the calling function happens to sleep when it is not @@ -233,6 +235,25 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset); # define cant_sleep() \ do { __cant_sleep(__FILE__, __LINE__, 0); } while (0) # define sched_annotate_sleep() (current->task_state_change = 0) +/** + * non_block_start - annotate the start of section where sleeping is prohibited + * + * This is on behalf of the oom reaper, specifically when it is calling the mmu + * notifiers. The problem is that if the notifier were to block on, for example, + * mutex_lock() and if the process which holds that mutex were to perform a + * sleeping memory allocation, the oom reaper is now blocked on completion of + * that memory allocation. Other blocking calls like wait_event() pose similar + * issues. + */ +# define non_block_start() \ + do { current->non_block_count++; } while (0) +/** + * non_block_end - annotate the end of section where sleeping is prohibited + * + * Closes a section opened by non_block_start(). + */ +# define non_block_end() \ + do { WARN_ON(current->non_block_count-- == 0); } while (0) #else static inline void ___might_sleep(const char *file, int line, int preempt_offset) { } @@ -241,6 +262,8 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset); # define might_sleep() do { might_resched(); } while (0) # define cant_sleep() do { } while (0) # define sched_annotate_sleep() do { } while (0) +# define non_block_start() do { } while (0) +# define non_block_end() do { } while (0) #endif #define might_sleep_if(cond) do { if (cond) might_sleep(); } while (0) diff --git a/include/linux/sched.h b/include/linux/sched.h index 9f51932bd543..c5630f3dca1f 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -974,6 +974,10 @@ struct task_struct { struct mutex_waiter *blocked_on; #endif +#ifdef CONFIG_DEBUG_ATOMIC_SLEEP + int non_block_count; +#endif + #ifdef CONFIG_TRACE_IRQFLAGS unsigned int irq_events; unsigned long hardirq_enable_ip; diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 2b037f195473..57245770d6cc 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -3700,13 +3700,22 @@ static noinline void __schedule_bug(struct task_struct *prev) /* * Various schedule()-time debugging checks and statistics: */ -static inline void schedule_debug(struct task_struct *prev) +static inline void schedule_debug(struct task_struct *prev, bool preempt) { #ifdef CONFIG_SCHED_STACK_END_CHECK if (task_stack_end_corrupted(prev)) panic("corrupted stack end detected inside scheduler\n"); #endif +#ifdef CONFIG_DEBUG_ATOMIC_SLEEP + if (!preempt && prev->state && prev->non_block_count) { + printk(KERN_ERR "BUG: scheduling in a non-blocking section: %s/%d/%i\n", + prev->comm, prev->pid, prev->non_block_count); + dump_stack(); + add_taint(TAINT_WARN, LOCKDEP_STILL_OK); + } +#endif + if (unlikely(in_atomic_preempt_off())) { __schedule_bug(prev); preempt_count_set(PREEMPT_DISABLED); @@ -3813,7 +3822,7 @@ static void __sched notrace __schedule(bool preempt) rq = cpu_rq(cpu); prev = rq->curr; - schedule_debug(prev); + schedule_debug(prev, preempt); if (sched_feat(HRTICK)) hrtick_clear(rq); @@ -6570,7 +6579,7 @@ void ___might_sleep(const char *file, int line, int preempt_offset) rcu_sleep_check(); if ((preempt_count_equals(preempt_offset) && !irqs_disabled() && - !is_idle_task(current)) || + !is_idle_task(current) && !current->non_block_count) || system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING || oops_in_progress) return; @@ -6586,8 +6595,8 @@ void ___might_sleep(const char *file, int line, int preempt_offset) "BUG: sleeping function called from invalid context at %s:%d\n", file, line); printk(KERN_ERR - "in_atomic(): %d, irqs_disabled(): %d, pid: %d, name: %s\n", - in_atomic(), irqs_disabled(), + "in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n", + in_atomic(), irqs_disabled(), current->non_block_count, current->pid, current->comm); if (task_stack_end_corrupted(current)) -- 2.23.0.rc1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2019-08-20 8:19 UTC|newest] Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-08-20 8:18 [PATCH 0/4] mmu notifier debug annotations/checks Daniel Vetter 2019-08-20 8:18 ` [PATCH 1/4] mm, notifier: Add a lockdep map for invalidate_range_start/end Daniel Vetter 2019-08-20 13:31 ` Jason Gunthorpe 2019-08-20 8:19 ` [PATCH 2/4] mm, notifier: Prime lockdep Daniel Vetter 2019-08-20 13:31 ` Jason Gunthorpe 2019-08-20 8:19 ` Daniel Vetter [this message] 2019-08-20 8:19 ` [PATCH 3/4] kernel.h: Add non_block_start/end() Daniel Vetter 2019-08-20 20:24 ` Daniel Vetter 2019-08-20 20:24 ` Daniel Vetter 2019-08-22 23:14 ` Andrew Morton 2019-08-22 23:14 ` Andrew Morton 2019-08-23 8:34 ` Daniel Vetter 2019-08-23 8:34 ` Daniel Vetter 2019-08-23 12:12 ` Jason Gunthorpe 2019-08-23 12:12 ` Jason Gunthorpe 2019-08-23 12:22 ` Peter Zijlstra 2019-08-23 12:22 ` Peter Zijlstra 2019-08-23 13:42 ` Daniel Vetter 2019-08-23 13:42 ` Daniel Vetter 2019-08-23 14:06 ` Peter Zijlstra 2019-08-23 14:06 ` Peter Zijlstra 2019-08-23 15:15 ` Daniel Vetter 2019-08-23 15:15 ` Daniel Vetter 2019-08-23 8:48 ` Peter Zijlstra 2019-08-23 8:48 ` Peter Zijlstra 2019-08-20 8:19 ` [PATCH 4/4] mm, notifier: Catch sleeping/blocking for !blockable Daniel Vetter 2019-08-20 13:34 ` Jason Gunthorpe 2019-08-20 15:18 ` Daniel Vetter 2019-08-20 15:27 ` Jason Gunthorpe 2019-08-21 9:34 ` Daniel Vetter 2019-08-21 9:34 ` Daniel Vetter 2019-08-21 15:41 ` Daniel Vetter 2019-08-21 15:41 ` Daniel Vetter 2019-08-21 16:16 ` Jason Gunthorpe 2019-08-22 8:42 ` Daniel Vetter 2019-08-22 8:42 ` Daniel Vetter 2019-08-22 14:24 ` Jason Gunthorpe 2019-08-22 14:27 ` Daniel Vetter 2019-08-22 14:27 ` Daniel Vetter 2019-08-20 11:15 ` ✗ Fi.CI.CHECKPATCH: warning for mmu notifier debug annotations/checks Patchwork 2019-08-20 12:33 ` ✓ Fi.CI.BAT: success " Patchwork 2019-08-20 18:14 ` ✗ Fi.CI.IGT: failure " Patchwork
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20190820081902.24815-4-daniel.vetter@ffwll.ch \ --to=daniel.vetter@ffwll.ch \ --cc=akpm@linux-foundation.org \ --cc=andriy.shevchenko@linux.intel.com \ --cc=christian.koenig@amd.com \ --cc=daniel.vetter@intel.com \ --cc=dri-devel@lists.freedesktop.org \ --cc=feng.tang@intel.com \ --cc=intel-gfx@lists.freedesktop.org \ --cc=jannh@google.com \ --cc=jgg@ziepe.ca \ --cc=jglisse@redhat.com \ --cc=keescook@chromium.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=mhocko@suse.com \ --cc=mingo@redhat.com \ --cc=peterz@infradead.org \ --cc=rdunlap@infradead.org \ --cc=rientjes@google.com \ --cc=tglx@linutronix.de \ --cc=wvw@google.com \ --cc=yamada.masahiro@socionext.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.