From: Qian Cai <quic_qiancai@quicinc.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>,
Andrew Morton <akpm@linux-foundation.org>,
Nicolas Saenz Julienne <nsaenzju@redhat.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
Vlastimil Babka <vbabka@suse.cz>,
Michal Hocko <mhocko@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>, <kafai@fb.com>,
<kpsingh@kernel.org>
Subject: Re: [PATCH 0/6] Drain remote per-cpu directly v3
Date: Thu, 19 May 2022 09:29:45 -0400
Message-ID: <YoZGSd6yQL3EP8tk@qian>
In-Reply-To: <20220518171503.GQ1790663@paulmck-ThinkPad-P17-Gen-1>
On Wed, May 18, 2022 at 10:15:03AM -0700, Paul E. McKenney wrote:
> So does this python script somehow change the tracing state? (It does
> not look to me like it does, but I could easily be missing something.)
No, I don't think so either. It pretty much just offlines memory sections
one at a time.
> Either way, is there something else waiting for these RCU flavors?
> (There should not be.) Nevertheless, if so, there should be
> a synchronize_rcu_tasks(), synchronize_rcu_tasks_rude(), or
> synchronize_rcu_tasks_trace() on some other blocked task's stack
> somewhere.
There are only three blocked tasks when this happens. kmemleak_scan()
is just a victim, waiting for the locks held by the stuck
offline_pages()->synchronize_rcu() task.
task:kmemleak state:D stack:25824 pid: 1033 ppid: 2 flags:0x00000008
Call trace:
__switch_to
__schedule
schedule
percpu_rwsem_wait
__percpu_down_read
percpu_down_read.constprop.0
get_online_mems
kmemleak_scan
kmemleak_scan_thread
kthread
ret_from_fork
task:cppc_fie state:D stack:23472 pid: 1848 ppid: 2 flags:0x00000008
Call trace:
__switch_to
__schedule
lockdep_recursion
task:tee state:D stack:24816 pid:16733 ppid: 16732 flags:0x0000020c
Call trace:
__switch_to
__schedule
schedule
schedule_timeout
__wait_for_common
wait_for_completion
__wait_rcu_gp
synchronize_rcu
lru_cache_disable
__alloc_contig_migrate_range
isolate_single_pageblock
start_isolate_page_range
offline_pages
memory_subsys_offline
device_offline
online_store
dev_attr_store
sysfs_kf_write
kernfs_fop_write_iter
new_sync_write
vfs_write
ksys_write
__arm64_sys_write
invoke_syscall
el0_svc_common.constprop.0
do_el0_svc
el0_svc
el0t_64_sync_handler
el0t_64_sync
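The check suggested above (looking for synchronize_rcu_tasks(),
synchronize_rcu_tasks_rude() or synchronize_rcu_tasks_trace() on a blocked
task's stack) can be roughly automated. The following is only a minimal
sketch: it assumes the dump has already been trimmed to the form shown in
this message (no timestamps, one frame per line), and the helper names
parse_tasks() and rcu_tasks_waiters() are made up for illustration.

```python
import re

# Function names suggested in the quoted text as the ones to look for
# on a blocked task's stack.
RCU_TASKS_WAITERS = {
    "synchronize_rcu_tasks",
    "synchronize_rcu_tasks_rude",
    "synchronize_rcu_tasks_trace",
}

# Assumed header format, as in the traces above, e.g.
# "task:tee state:D stack:24816 pid:16733 ppid: 16732 flags:0x0000020c"
TASK_HEADER = re.compile(r"^task:(?P<comm>\S+)\s+state:(?P<state>\S+)")


def parse_tasks(dump):
    """Split a trimmed sysrq-t style dump into (comm, state, frames) tuples."""
    tasks = []
    for line in dump.splitlines():
        line = line.strip()
        m = TASK_HEADER.match(line)
        if m:
            tasks.append((m.group("comm"), m.group("state"), []))
        elif tasks and line and line != "Call trace:":
            # Drop any "+0x.../0x..." offset suffix, keep the symbol name.
            tasks[-1][2].append(line.split("+")[0])
    return tasks


def rcu_tasks_waiters(tasks):
    """Return blocked (state D) tasks with an RCU Tasks waiter on the stack."""
    found = []
    for comm, state, frames in tasks:
        hits = sorted(RCU_TASKS_WAITERS.intersection(frames))
        if state == "D" and hits:
            found.append((comm, hits))
    return found


# Shortened copy of two of the traces from this message.
SAMPLE = """\
task:kmemleak state:D stack:25824 pid: 1033 ppid: 2 flags:0x00000008
Call trace:
__switch_to
__schedule
schedule
percpu_rwsem_wait
task:tee state:D stack:24816 pid:16733 ppid: 16732 flags:0x0000020c
Call trace:
__switch_to
__schedule
schedule
schedule_timeout
__wait_for_common
wait_for_completion
__wait_rcu_gp
synchronize_rcu
lru_cache_disable
"""

if __name__ == "__main__":
    print(rcu_tasks_waiters(parse_tasks(SAMPLE)))
```

On the frames shown in this message it reports no match, since the only
RCU waiter visible is the plain synchronize_rcu() in the tee task.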
> Or maybe something sleeps waiting for an RCU Tasks * callback to
> be invoked. In that case (and in the above case, for that matter),
> at least one of these pointers would be non-NULL on some CPU:
>
> 1. rcu_tasks__percpu.cblist.head
> 2. rcu_tasks_rude__percpu.cblist.head
> 3. rcu_tasks_trace__percpu.cblist.head
>
> The ->func field of the pointed-to structure contains a pointer to
> the callback function, which will help work out what is going on.
> (Most likely a wakeup being lost or not provided.)
What would be an easy way to find those? I can't see anything
interesting in the sysrq-t output.
> Alternatively, if your system has hundreds of thousands of tasks and
> you have attached BPF programs to short-lived socket structures and you
> don't yet have the workaround, then you can see hangs. (I am working on a
> longer-term fix.) In the short term, applying the workaround is the right
> thing to do. (Adding a couple of the BPF guys on CC for their thoughts.)
The system is pretty much idle after a fresh reboot. The only workload is
to run the script.