linux-kernel.vger.kernel.org archive mirror
* [linux-next] kcompactd0 stuck in a CPU-burning loop
@ 2019-01-28  8:57 Sergey Senozhatsky
  2019-01-28  9:18 ` Vlastimil Babka
  0 siblings, 1 reply; 3+ messages in thread
From: Sergey Senozhatsky @ 2019-01-28  8:57 UTC (permalink / raw)
  To: Michal Hocko, Vlastimil Babka
  Cc: Tetsuo Handa, Andrew Morton, Johannes Weiner, linux-kernel,
	linux-mm, Sergey Senozhatsky

Hello,

next-20190125

kcompactd0 is spinning on something, burning CPUs in the meantime:

  %CPU  %MEM     TIME+ S COMMAND
 100.0   0.0  34:04.20 R [kcompactd0]

I don't know how to reproduce it, so I'm probably not going to
be a very helpful tester.

I tried to ftrace kcompactd0 PID, and I see the same path all over
the tracing file:

 2)   0.119 us    |    unlock_page();
 2)   0.109 us    |    unlock_page();
 2)   0.096 us    |    compaction_free();
 2)   0.104 us    |    ___might_sleep();
 2)   0.121 us    |    compaction_alloc();
 2)   0.111 us    |    page_mapped();
 2)   0.105 us    |    page_mapped();
 2)               |    move_to_new_page() {
 2)   0.102 us    |      page_mapping();
 2)               |      buffer_migrate_page_norefs() {
 2)               |        __buffer_migrate_page() {
 2)               |          expected_page_refs() {
 2)   0.118 us    |            page_mapping();
 2)   0.321 us    |          }
 2)               |          __might_sleep() {
 2)   0.122 us    |            ___might_sleep();
 2)   0.332 us    |          }
 2)               |          _raw_spin_lock() {
 2)   0.115 us    |            preempt_count_add();
 2)   0.321 us    |          }
 2)               |          _raw_spin_unlock() {
 2)   0.114 us    |            preempt_count_sub();
 2)   0.321 us    |          }
 2)               |          invalidate_bh_lrus() {
 2)               |            on_each_cpu_cond() {
 2)               |              on_each_cpu_cond_mask() {
 2)               |                __might_sleep() {
 2)   0.114 us    |                  ___might_sleep();
 2)   0.316 us    |                }
 2)   0.109 us    |                preempt_count_add();
 2)   0.128 us    |                has_bh_in_lru();
 2)   0.105 us    |                has_bh_in_lru();
 2)   0.124 us    |                has_bh_in_lru();
 2)   0.103 us    |                has_bh_in_lru();
 2)   0.125 us    |                has_bh_in_lru();
 2)   0.105 us    |                has_bh_in_lru();
 2)   0.123 us    |                has_bh_in_lru();
 2)   0.107 us    |                has_bh_in_lru();
 2)               |                on_each_cpu_mask() {
 2)   0.104 us    |                  preempt_count_add();
 2)   0.110 us    |                  smp_call_function_many();
 2)   0.105 us    |                  preempt_count_sub();
 2)   0.764 us    |                }
 2)   0.116 us    |                preempt_count_sub();
 2)   3.676 us    |              }
 2)   3.889 us    |            }
 2)   4.087 us    |          }
 2)               |          _raw_spin_lock() {
 2)   0.112 us    |            preempt_count_add();
 2)   0.315 us    |          }
 2)               |          _raw_spin_unlock() {
 2)   0.108 us    |            preempt_count_sub();
 2)   0.309 us    |          }
 2)               |          unlock_buffer() {
 2)               |            wake_up_bit() {
 2)   0.118 us    |              __wake_up_bit();
 2)   0.317 us    |            }
 2)   0.513 us    |          }
 2)   7.440 us    |        }
 2)   7.643 us    |      }
 2)   8.070 us    |    }
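
A trace like the one above is easier to compare across runs once it is
tallied per function. The following is a minimal sketch, assuming the
default function_graph output format shown above; the names LINE_RE and
summarize are made up for illustration and are not part of any kernel
tooling. It counts leaf calls only and skips the closing-brace lines
that carry the totals of nested calls.

```python
import re
from collections import Counter

# Matches function_graph lines that carry a duration, for example:
#   " 2)   0.119 us    |    unlock_page();"   (leaf call)
#   " 2)   8.070 us    |    }"                (end of a nested call)
# Entry lines such as "move_to_new_page() {" have no duration and do not match.
LINE_RE = re.compile(r"^\s*\d+\)\s+([\d.]+) us\s+\|\s+(.*)$")


def summarize(trace_text):
    """Count leaf calls per function and sum their reported durations (us)."""
    calls = Counter()
    total_us = Counter()
    for line in trace_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        duration = float(match.group(1))
        body = match.group(2).strip()
        if not body.endswith("();"):
            continue  # closing "}" line: total of a nested call, skipped
        name = body[:-3]
        calls[name] += 1
        total_us[name] += duration
    return calls, total_us


if __name__ == "__main__":
    import sys

    calls, total_us = summarize(sys.stdin.read())
    for name, count in calls.most_common():
        print(f"{count:6d} {total_us[name]:10.3f} us  {name}")
```

Feeding it the saved trace file on stdin prints one line per function,
most frequent first.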


Page migration fails a lot:

pgmigrate_success 111063
pgmigrate_fail 269841559
compact_migrate_scanned 536253365
compact_free_scanned 360889
compact_isolated 270072733
compact_stall 0
compact_fail 0
compact_success 0
compact_daemon_wake 56
compact_daemon_migrate_scanned 536253365
compact_daemon_free_scanned 360889
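
To put a number on "fails a lot", the failure ratio can be computed
from the two pgmigrate counters. This is a minimal sketch, assuming the
"name value" line format of /proc/vmstat; the helper names are
illustrative only. With the values reported above it comes out at
roughly 99.96% of migration attempts failing.

```python
def parse_vmstat(text):
    """Parse "name value" lines (the /proc/vmstat format) into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def migrate_fail_ratio(stats):
    """Fraction of page migration attempts that failed (0.0 if none)."""
    ok = stats.get("pgmigrate_success", 0)
    failed = stats.get("pgmigrate_fail", 0)
    attempts = ok + failed
    return failed / attempts if attempts else 0.0


# Counters copied from the report above.
REPORTED = """\
pgmigrate_success 111063
pgmigrate_fail 269841559
compact_isolated 270072733
compact_daemon_wake 56
"""

if __name__ == "__main__":
    stats = parse_vmstat(REPORTED)
    print(f"pgmigrate fail ratio: {migrate_fail_ratio(stats):.2%}")
```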

Let me know if I can help with anything else. I'll keep the box alive
for a while, but will have to power it off eventually.

	-ss


* Re: [linux-next] kcompactd0 stuck in a CPU-burning loop
  2019-01-28  8:57 [linux-next] kcompactd0 stuck in a CPU-burning loop Sergey Senozhatsky
@ 2019-01-28  9:18 ` Vlastimil Babka
  2019-01-28 10:51   ` Sergey Senozhatsky
  0 siblings, 1 reply; 3+ messages in thread
From: Vlastimil Babka @ 2019-01-28  9:18 UTC (permalink / raw)
  To: Sergey Senozhatsky, Michal Hocko
  Cc: Tetsuo Handa, Andrew Morton, Johannes Weiner, linux-kernel,
	linux-mm, Sergey Senozhatsky

On 1/28/19 9:57 AM, Sergey Senozhatsky wrote:
> Hello,
> 
> next-20190125
> 
> kcompactd0 is spinning on something, burning CPUs in the meantime:

Hi, could you check/add this to the earlier thread? Thanks.

https://lore.kernel.org/lkml/20190126200005.GB27513@amd/T/#u




* Re: [linux-next] kcompactd0 stuck in a CPU-burning loop
  2019-01-28  9:18 ` Vlastimil Babka
@ 2019-01-28 10:51   ` Sergey Senozhatsky
  0 siblings, 0 replies; 3+ messages in thread
From: Sergey Senozhatsky @ 2019-01-28 10:51 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Sergey Senozhatsky, Michal Hocko, Tetsuo Handa, Andrew Morton,
	Johannes Weiner, linux-kernel, linux-mm, Sergey Senozhatsky,
	Jan Kara

On (01/28/19 10:18), Vlastimil Babka wrote:
> On 1/28/19 9:57 AM, Sergey Senozhatsky wrote:
> > Hello,
> > 
> > next-20190125
> > 
> > kcompactd0 is spinning on something, burning CPUs in the meantime:
> 
> Hi, could you check/add this to the earlier thread? Thanks.
> 
> https://lore.kernel.org/lkml/20190126200005.GB27513@amd/T/#u

Hi,

Will reply here.
Thanks for the link, Vlastimil.

Will "test" Jan's patch (don't have a reproducer yet).
So far, I can confirm that

	echo 3 > /proc/sys/vm/drop_caches

mentioned in that thread does "solve" the issue.
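
One way to check whether the loop has really stopped is to sample
pgmigrate_fail twice and look at the growth. This is only a sketch: the
10 second interval and the helper names are arbitrary assumptions, and
it reads /proc/vmstat without changing any system state.

```python
import time


def read_counter(text, name):
    """Return the integer value of `name` from /proc/vmstat-style text."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == name:
            return int(parts[1])
    raise KeyError(name)


def counter_delta(before, after, name="pgmigrate_fail"):
    """How much `name` grew between two snapshots of the vmstat text."""
    return read_counter(after, name) - read_counter(before, name)


def snapshot(path="/proc/vmstat"):
    """Read the whole counters file as text."""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    before = snapshot()
    time.sleep(10)  # arbitrary sampling interval
    after = snapshot()
    print("pgmigrate_fail grew by", counter_delta(before, after))
```

A delta near zero while kcompactd0 is idle would agree with the
observation above; a delta that keeps climbing would suggest the loop
is back.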

	-ss

