linux-kernel.vger.kernel.org archive mirror
* [PATCH] SCHED: attribute page lock and waitqueue functions as sched
@ 2021-11-03 18:47 Jimmy Shiu
From: Jimmy Shiu @ 2021-11-03 18:47 UTC (permalink / raw)
  To: mingo
  Cc: jimmyshiu, joaodias, wvw, Minchan Kim, Will McVicker,
	Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Andrew Morton,
	Matthew Wilcox (Oracle),
	David Howells, Vlastimil Babka, William Kucharski,
	Kirill A. Shutemov, Andreas Gruenbacher, linux-kernel, linux-mm

trace_sched_blocked_trace in CFS is really useful for debugging via
trace because it tells where the process was stuck in the callstack.

For example,
           <...>-6143  ( 6136) [005] d..2    50.278987: sched_blocked_reason: pid=6136 iowait=0 caller=SyS_mprotect+0x88/0x208
           <...>-6136  ( 6136) [005] d..2    50.278990: sched_blocked_reason: pid=6142 iowait=0 caller=do_page_fault+0x1f4/0x3b0
           <...>-6142  ( 6136) [006] d..2    50.278996: sched_blocked_reason: pid=6144 iowait=0 caller=SyS_prctl+0x52c/0xb58
           <...>-6144  ( 6136) [006] d..2    50.279007: sched_blocked_reason: pid=6136 iowait=0 caller=vm_mmap_pgoff+0x74/0x104

However, sometimes it gives pointless information like this:
    RenderThread-2322  ( 1805) [006] d.s3    50.319046: sched_blocked_reason: pid=6136 iowait=1 caller=__lock_page_killable+0x17c/0x220
     logd.writer-594   (  587) [002] d.s3    50.334011: sched_blocked_reason: pid=6126 iowait=1 caller=wait_on_page_bit+0x194/0x208
  kworker/u16:13-333   (  333) [007] d.s4    50.343161: sched_blocked_reason: pid=6136 iowait=1 caller=__lock_page_killable+0x17c/0x220

Entries such as wait_on_page_bit and __lock_page_killable are pointless
because they don't carry enough higher-level information to identify the
callstack.

The reason is that page lock and waitqueue are special synchronization
methods, unlike other normal locks (mutex, spinlock).
Let's mark them as "__sched" so that get_wchan, which is used by
trace_sched_blocked_trace, can detect and skip them. This produces more
meaningful callstack functions like this:

           <...>-2867  ( 1068) [002] d.h4   124.209701: sched_blocked_reason: pid=329 iowait=0 caller=worker_thread+0x378/0x470
           <...>-2867  ( 1068) [002] d.s3   124.209763: sched_blocked_reason: pid=8454 iowait=1 caller=__filemap_fdatawait_range+0xa0/0x104
           <...>-2867  ( 1068) [002] d.s4   124.209803: sched_blocked_reason: pid=869 iowait=0 caller=worker_thread+0x378/0x470
 ScreenDecoratio-2364  ( 1867) [002] d.s3   124.209973: sched_blocked_reason: pid=8454 iowait=1 caller=f2fs_wait_on_page_writeback+0x84/0xcc
 ScreenDecoratio-2364  ( 1867) [002] d.s4   124.209986: sched_blocked_reason: pid=869 iowait=0 caller=worker_thread+0x378/0x470
           <...>-329   (  329) [000] d..3   124.210435: sched_blocked_reason: pid=538 iowait=0 caller=worker_thread+0x378/0x470
  kworker/u16:13-538   (  538) [007] d..3   124.210450: sched_blocked_reason: pid=6 iowait=0 caller=worker_thread+0x378/0x470
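
For background, a minimal sketch of the mechanism (taken from the
upstream definitions, simplified): __sched places a function's text in
the .sched.text section, and the wchan machinery skips any frame whose
address falls inside that section, so tagged functions never show up
as the blocked caller.

    /* include/linux/sched/debug.h:
     * attach to any functions which should be ignored in wchan output */
    #define __sched __section(".sched.text")

    /* kernel/sched/core.c (simplified): decides whether a return
     * address is scheduler plumbing that the unwinder should skip */
    int in_sched_functions(unsigned long addr)
    {
            return in_lock_functions(addr) ||
                    (addr >= (unsigned long)__sched_text_start &&
                     addr < (unsigned long)__sched_text_end);
    }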

Test: build pass and boot to home.
Bug: 144961676
Bug: 144713689
Bug: 172212772
Signed-off-by: Minchan Kim <minchan@google.com>
Signed-off-by: Jimmy Shiu <jimmyshiu@google.com>
(cherry picked from commit 1e4de875d9e0cfaccf5131bcc709ae8646cdc168)
Signed-off-by: Will McVicker <willmcvicker@google.com>
---
 include/linux/pagemap.h | 17 +++++++++--------
 kernel/sched/wait.c     |  8 +++++---
 mm/filemap.c            | 14 +++++++-------
 3 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index db2c3e3eb1cf..12e82ff7686a 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -15,6 +15,7 @@
 #include <linux/bitops.h>
 #include <linux/hardirq.h> /* for in_interrupt() */
 #include <linux/hugetlb_inline.h>
+#include <linux/sched/debug.h>
 
 struct pagevec;
 
@@ -687,7 +688,7 @@ static inline void folio_lock(struct folio *folio)
 /*
  * lock_page may only be called if we have the page's inode pinned.
  */
-static inline void lock_page(struct page *page)
+static inline __sched void lock_page(struct page *page)
 {
 	struct folio *folio;
 	might_sleep();
@@ -697,7 +698,7 @@ static inline void lock_page(struct page *page)
 		__folio_lock(folio);
 }
 
-static inline int folio_lock_killable(struct folio *folio)
+static inline __sched int folio_lock_killable(struct folio *folio)
 {
 	might_sleep();
 	if (!folio_trylock(folio))
@@ -710,7 +711,7 @@ static inline int folio_lock_killable(struct folio *folio)
  * signals.  It returns 0 if it locked the page and -EINTR if it was
  * killed while waiting.
  */
-static inline int lock_page_killable(struct page *page)
+static inline __sched int lock_page_killable(struct page *page)
 {
 	return folio_lock_killable(page_folio(page));
 }
@@ -722,7 +723,7 @@ static inline int lock_page_killable(struct page *page)
  * Return value and mmap_lock implications depend on flags; see
  * __folio_lock_or_retry().
  */
-static inline bool lock_page_or_retry(struct page *page, struct mm_struct *mm,
+static inline __sched bool lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				     unsigned int flags)
 {
 	struct folio *folio;
@@ -746,25 +747,25 @@ int folio_wait_bit_killable(struct folio *folio, int bit_nr);
  * ie with increased "page->count" so that the folio won't
  * go away during the wait..
  */
-static inline void folio_wait_locked(struct folio *folio)
+static inline __sched void folio_wait_locked(struct folio *folio)
 {
 	if (folio_test_locked(folio))
 		folio_wait_bit(folio, PG_locked);
 }
 
-static inline int folio_wait_locked_killable(struct folio *folio)
+static inline __sched int folio_wait_locked_killable(struct folio *folio)
 {
 	if (!folio_test_locked(folio))
 		return 0;
 	return folio_wait_bit_killable(folio, PG_locked);
 }
 
-static inline void wait_on_page_locked(struct page *page)
+static inline __sched void wait_on_page_locked(struct page *page)
 {
 	folio_wait_locked(page_folio(page));
 }
 
-static inline int wait_on_page_locked_killable(struct page *page)
+static inline __sched int wait_on_page_locked_killable(struct page *page)
 {
 	return folio_wait_locked_killable(page_folio(page));
 }
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 76577d1642a5..a5975579a741 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -404,7 +404,8 @@ void finish_wait(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_en
 }
 EXPORT_SYMBOL(finish_wait);
 
-int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int sync, void *key)
+__sched int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned int mode,
+				     int sync, void *key)
 {
 	int ret = default_wake_function(wq_entry, mode, sync, key);
 
@@ -440,7 +441,7 @@ static inline bool is_kthread_should_stop(void)
  * }						smp_mb(); // C
  * remove_wait_queue(&wq_head, &wait);		wq_entry->flags |= WQ_FLAG_WOKEN;
  */
-long wait_woken(struct wait_queue_entry *wq_entry, unsigned mode, long timeout)
+__sched long wait_woken(struct wait_queue_entry *wq_entry, unsigned int mode, long timeout)
 {
 	/*
 	 * The below executes an smp_mb(), which matches with the full barrier
@@ -465,7 +466,8 @@ long wait_woken(struct wait_queue_entry *wq_entry, unsigned mode, long timeout)
 }
 EXPORT_SYMBOL(wait_woken);
 
-int woken_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int sync, void *key)
+__sched int woken_wake_function(struct wait_queue_entry *wq_entry, unsigned int mode,
+				int sync, void *key)
 {
 	/* Pairs with the smp_store_mb() in wait_woken(). */
 	smp_mb(); /* C */
diff --git a/mm/filemap.c b/mm/filemap.c
index bfcef6ff7a27..ad4268ee1bf1 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1271,7 +1271,7 @@ static inline bool folio_trylock_flag(struct folio *folio, int bit_nr,
 /* How many times do we accept lock stealing from under a waiter? */
 int sysctl_page_lock_unfairness = 5;
 
-static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
+static inline __sched int folio_wait_bit_common(struct folio *folio, int bit_nr,
 		int state, enum behavior behavior)
 {
 	wait_queue_head_t *q = folio_waitqueue(folio);
@@ -1411,13 +1411,13 @@ static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
 	return wait->flags & WQ_FLAG_WOKEN ? 0 : -EINTR;
 }
 
-void folio_wait_bit(struct folio *folio, int bit_nr)
+__sched void folio_wait_bit(struct folio *folio, int bit_nr)
 {
 	folio_wait_bit_common(folio, bit_nr, TASK_UNINTERRUPTIBLE, SHARED);
 }
 EXPORT_SYMBOL(folio_wait_bit);
 
-int folio_wait_bit_killable(struct folio *folio, int bit_nr)
+__sched int folio_wait_bit_killable(struct folio *folio, int bit_nr)
 {
 	return folio_wait_bit_common(folio, bit_nr, TASK_KILLABLE, SHARED);
 }
@@ -1628,21 +1628,21 @@ EXPORT_SYMBOL_GPL(page_endio);
  * __folio_lock - Get a lock on the folio, assuming we need to sleep to get it.
  * @folio: The folio to lock
  */
-void __folio_lock(struct folio *folio)
+__sched void __folio_lock(struct folio *folio)
 {
 	folio_wait_bit_common(folio, PG_locked, TASK_UNINTERRUPTIBLE,
 				EXCLUSIVE);
 }
 EXPORT_SYMBOL(__folio_lock);
 
-int __folio_lock_killable(struct folio *folio)
+__sched int __folio_lock_killable(struct folio *folio)
 {
 	return folio_wait_bit_common(folio, PG_locked, TASK_KILLABLE,
 					EXCLUSIVE);
 }
 EXPORT_SYMBOL_GPL(__folio_lock_killable);
 
-static int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
+static __sched int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
 {
 	struct wait_queue_head *q = folio_waitqueue(folio);
 	int ret = 0;
@@ -1679,7 +1679,7 @@ static int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
  * If neither ALLOW_RETRY nor KILLABLE are set, will always return true
  * with the folio locked and the mmap_lock unperturbed.
  */
-bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
+__sched bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
 			 unsigned int flags)
 {
 	if (fault_flag_allow_retry_first(flags)) {
-- 
2.34.0.rc0.344.g81b53c2807-goog



* Re: [PATCH] SCHED: attribute page lock and waitqueue functions as sched
From: Matthew Wilcox @ 2021-11-03 18:58 UTC (permalink / raw)
  To: Jimmy Shiu
  Cc: mingo, joaodias, wvw, Minchan Kim, Will McVicker, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
	Andrew Morton, David Howells, Vlastimil Babka, William Kucharski,
	Kirill A. Shutemov, Andreas Gruenbacher, linux-kernel, linux-mm

On Thu, Nov 04, 2021 at 02:47:03AM +0800, Jimmy Shiu wrote:
> Bug: 144961676
> Bug: 144713689
> Bug: 172212772

A bug number is meaningless without knowing which bug tracker it
refers to.  I suggest contemplating the meaning of the 'U' in URL.

> Signed-off-by: Minchan Kim <minchan@google.com>
> Signed-off-by: Jimmy Shiu <jimmyshiu@google.com>
> (cherry picked from commit 1e4de875d9e0cfaccf5131bcc709ae8646cdc168)

What tree is that commit ID in?  I suggest it's meaningless and
should be removed.

> @@ -687,7 +688,7 @@ static inline void folio_lock(struct folio *folio)
>  /*
>   * lock_page may only be called if we have the page's inode pinned.
>   */
> -static inline void lock_page(struct page *page)
> +static inline __sched void lock_page(struct page *page)

Why do you need to tag a static inline function as __sched?  This would
be the only place where that is done.
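
(Editorial aside, with a hypothetical standalone sketch: a section
attribute applies only to a function's out-of-line copy, while inlined
bodies are emitted in the caller's section, so a __sched tag on a
static inline would not help get_wchan() skip the inlined call sites.)

    /*
     * Hypothetical illustration, not from the thread: build with
     * "gcc -O2 -c sched_inline.c" and inspect "objdump -t"; the
     * inlined body of wait_stub() lands in caller()'s .text, and
     * nothing ends up in .sched.text unless an out-of-line copy
     * of wait_stub() is emitted.
     */
    #define __sched __attribute__((__section__(".sched.text")))

    static inline __sched int wait_stub(int v)
    {
            return v * 2;           /* stand-in for a real wait loop */
    }

    int caller(int v)
    {
            return wait_stub(v);    /* inlined: lives in .text */
    }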

> +++ b/kernel/sched/wait.c
> @@ -404,7 +404,8 @@ void finish_wait(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_en
>  }
>  EXPORT_SYMBOL(finish_wait);
>  
> -int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int sync, void *key)
> +__sched int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned int mode,
> +				     int sync, void *key)
>  {

This function doesn't sleep.  Why does it need to be tagged __sched?

> @@ -440,7 +441,7 @@ static inline bool is_kthread_should_stop(void)
>   * }						smp_mb(); // C
>   * remove_wait_queue(&wq_head, &wait);		wq_entry->flags |= WQ_FLAG_WOKEN;
>   */
> -long wait_woken(struct wait_queue_entry *wq_entry, unsigned mode, long timeout)
> +__sched long wait_woken(struct wait_queue_entry *wq_entry, unsigned int mode, long timeout)

This one makes sense, but you've extended the length of the line past 80
columns, and it seems like it should be a separate patch with its own
justification.
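
(For instance, a hypothetical respin that keeps the tag but wraps the
prototype back under 80 columns, leaving the unsigned -> unsigned int
conversion to its own patch:)

    -long wait_woken(struct wait_queue_entry *wq_entry, unsigned mode, long timeout)
    +__sched long wait_woken(struct wait_queue_entry *wq_entry,
    +			 unsigned mode, long timeout)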

> +__sched int woken_wake_function(struct wait_queue_entry *wq_entry, unsigned int mode,
> +				int sync, void *key)

This doesn't seem to sleep either?

> +++ b/mm/filemap.c
> @@ -1271,7 +1271,7 @@ static inline bool folio_trylock_flag(struct folio *folio, int bit_nr,
>  /* How many times do we accept lock stealing from under a waiter? */
>  int sysctl_page_lock_unfairness = 5;
>  
> -static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
> +static inline __sched int folio_wait_bit_common(struct folio *folio, int bit_nr,
>  		int state, enum behavior behavior)
>  {
>  	wait_queue_head_t *q = folio_waitqueue(folio);
> @@ -1411,13 +1411,13 @@ static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
>  	return wait->flags & WQ_FLAG_WOKEN ? 0 : -EINTR;
>  }
>  
> -void folio_wait_bit(struct folio *folio, int bit_nr)
> +__sched void folio_wait_bit(struct folio *folio, int bit_nr)
>  {
>  	folio_wait_bit_common(folio, bit_nr, TASK_UNINTERRUPTIBLE, SHARED);
>  }
>  EXPORT_SYMBOL(folio_wait_bit);
>  
> -int folio_wait_bit_killable(struct folio *folio, int bit_nr)
> +__sched int folio_wait_bit_killable(struct folio *folio, int bit_nr)
>  {
>  	return folio_wait_bit_common(folio, bit_nr, TASK_KILLABLE, SHARED);
>  }
> @@ -1628,21 +1628,21 @@ EXPORT_SYMBOL_GPL(page_endio);
>   * __folio_lock - Get a lock on the folio, assuming we need to sleep to get it.
>   * @folio: The folio to lock
>   */
> -void __folio_lock(struct folio *folio)
> +__sched void __folio_lock(struct folio *folio)
>  {
>  	folio_wait_bit_common(folio, PG_locked, TASK_UNINTERRUPTIBLE,
>  				EXCLUSIVE);
>  }
>  EXPORT_SYMBOL(__folio_lock);
>  
> -int __folio_lock_killable(struct folio *folio)
> +__sched int __folio_lock_killable(struct folio *folio)
>  {
>  	return folio_wait_bit_common(folio, PG_locked, TASK_KILLABLE,
>  					EXCLUSIVE);
>  }
>  EXPORT_SYMBOL_GPL(__folio_lock_killable);
>  
> -static int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
> +static __sched int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
>  {
>  	struct wait_queue_head *q = folio_waitqueue(folio);
>  	int ret = 0;
> @@ -1679,7 +1679,7 @@ static int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
>   * If neither ALLOW_RETRY nor KILLABLE are set, will always return true
>   * with the folio locked and the mmap_lock unperturbed.
>   */
> -bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
> +__sched bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
>  			 unsigned int flags)
>  {
>  	if (fault_flag_allow_retry_first(flags)) {
> -- 
> 2.34.0.rc0.344.g81b53c2807-goog
> 
> 


* Re: [PATCH] SCHED: attribute page lock and waitqueue functions as sched
From: Matthew Wilcox @ 2021-11-03 19:10 UTC (permalink / raw)
  To: Jimmy Shiu
  Cc: mingo, joaodias, wvw, Minchan Kim, Will McVicker, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
	Andrew Morton, David Howells, Vlastimil Babka, William Kucharski,
	Kirill A. Shutemov, Andreas Gruenbacher, linux-kernel, linux-mm

On Thu, Nov 04, 2021 at 02:47:03AM +0800, Jimmy Shiu wrote:
> +++ b/mm/filemap.c
> @@ -1271,7 +1271,7 @@ static inline bool folio_trylock_flag(struct folio *folio, int bit_nr,
>  /* How many times do we accept lock stealing from under a waiter? */
>  int sysctl_page_lock_unfairness = 5;
>  
> -static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
> +static inline __sched int folio_wait_bit_common(struct folio *folio, int bit_nr,

It's not clear to me whether folio_wait_bit_common() needs to be marked
as __sched or whether marking its callers is sufficient, but I'm pretty
sure you forgot to mark put_and_wait_on_page_locked() as __sched, which
is an important one as it's now where we wait for readahead to finish
when reading a file.
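
A sketch of that suggested follow-up (hypothetical, not part of any
posted patch, and assuming the 5.16-era folio_put_wait_locked() helper
that backs put_and_wait_on_page_locked()):

    --- a/mm/filemap.c
    +++ b/mm/filemap.c
    -int folio_put_wait_locked(struct folio *folio, int state)
    +__sched int folio_put_wait_locked(struct folio *folio, int state)
     {
     	return folio_wait_bit_common(folio, PG_locked, state, DROP);
     }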



* Re: [PATCH] SCHED: attribute page lock and waitqueue functions as sched
From: Mel Gorman @ 2021-11-04  9:16 UTC (permalink / raw)
  To: Jimmy Shiu
  Cc: mingo, joaodias, wvw, Minchan Kim, Will McVicker, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Daniel Bristot de Oliveira, Andrew Morton,
	Matthew Wilcox (Oracle),
	David Howells, Vlastimil Babka, William Kucharski,
	Kirill A. Shutemov, Andreas Gruenbacher, linux-kernel, linux-mm

On Thu, Nov 04, 2021 at 02:47:03AM +0800, Jimmy Shiu wrote:
> trace_sched_blocked_trace in CFS is really useful for debugging via
> trace because it tells where the process was stuck in the callstack.
> 
> For example,
>            <...>-6143  ( 6136) [005] d..2    50.278987: sched_blocked_reason: pid=6136 iowait=0 caller=SyS_mprotect+0x88/0x208
>            <...>-6136  ( 6136) [005] d..2    50.278990: sched_blocked_reason: pid=6142 iowait=0 caller=do_page_fault+0x1f4/0x3b0
>            <...>-6142  ( 6136) [006] d..2    50.278996: sched_blocked_reason: pid=6144 iowait=0 caller=SyS_prctl+0x52c/0xb58
>            <...>-6144  ( 6136) [006] d..2    50.279007: sched_blocked_reason: pid=6136 iowait=0 caller=vm_mmap_pgoff+0x74/0x104
> 
> However, sometimes it gives pointless information like this:
>     RenderThread-2322  ( 1805) [006] d.s3    50.319046: sched_blocked_reason: pid=6136 iowait=1 caller=__lock_page_killable+0x17c/0x220
>      logd.writer-594   (  587) [002] d.s3    50.334011: sched_blocked_reason: pid=6126 iowait=1 caller=wait_on_page_bit+0x194/0x208
>   kworker/u16:13-333   (  333) [007] d.s4    50.343161: sched_blocked_reason: pid=6136 iowait=1 caller=__lock_page_killable+0x17c/0x220
> 

/me peers at coffee

Maybe I'm blind, but aside from Matthew's concerns, this
appears to rely on an Android patch that is not upstream:
https://android.googlesource.com/kernel/msm/+/c9f00aa0e25e397533c198a0fcf6246715f99a7b%5E!/

-- 
Mel Gorman
SUSE Labs

