From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
To: Michal Hocko <mhocko@kernel.org>
Cc: Dmitry Vyukov <dvyukov@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
David Rientjes <rientjes@google.com>,
syzbot <syzbot+f0fc7f62e88b1de99af3@syzkaller.appspotmail.com>,
'Dmitry Vyukov' via syzkaller-upstream-moderation
<syzkaller-upstream-moderation@googlegroups.com>,
linux-mm <linux-mm@kvack.org>
Subject: Re: [PATCH] mm, oom: Introduce time limit for dump_tasks duration.
Date: Fri, 7 Sep 2018 19:20:18 +0900
Message-ID: <bccbf1fd-76a5-2eb5-af3c-96c76cd826e5@i-love.sakura.ne.jp>
In-Reply-To: <20180907082745.GB19621@dhcp22.suse.cz>
On 2018/09/07 17:27, Michal Hocko wrote:
> On Fri 07-09-18 05:58:06, Tetsuo Handa wrote:
>> On 2018/09/06 23:39, Michal Hocko wrote:
>>>>>> I know /proc/sys/vm/oom_dump_tasks . Showing some entries while not always
>>>>>> printing all entries might be helpful.
>>>>>
>>>>> Not really. It could be more confusing than helpful. The main purpose of
>>>>> the listing is to double check the list to understand the oom victim
>>>>> selection. If you have a partial list you simply cannot do that.
>>>>
>>>> It serves as a safeguard for avoiding RCU stall warnings.
>>>>
>>>>>
>>>>> If the iteration takes too long and I can imagine it does with zillions
>>>>> of tasks then the proper way around it is either release the lock
>>>>> periodically after N tasks is processed or outright skip the whole thing
>>>>> if there are too many tasks. The first option is obviously tricky to
>>>>> prevent from duplicate entries or other artifacts.
>>>>>
>>>>
>>>> Can we add rcu_lock_break() like check_hung_uninterruptible_tasks() does?
>>>
>>> This would be a better variant of your timeout based approach. But it
>>> can still produce an incomplete task list so it still consumes a lot of
>>> resources to print a long list of tasks potentially while that list is not
>>> useful for any evaluation. Maybe that is good enough. I don't know. I
>>> would generally recommend to disable the whole thing with workloads with
>>> many tasks though.
>>>
>>
>> The "safeguard" is useful when there are _unexpectedly_ many tasks (like
>> syzbot in this case). Why not to allow those who want to avoid lockup to
>> avoid lockup rather than forcing them to disable the whole thing?
>
> So you get an rcu lockup splat and what? Unless you have panic_on_rcu_stall
> then this should be recoverable thing (assuming we cannot really
> livelock as described by Dmitry).
>
syzbot is hitting a hung task panic (140 seconds) because a single dump_tasks()
call from out_of_memory() consumes 52 seconds on a 2-CPU machine, and
cond_resched() is the only mechanism we have for yielding the CPU to tasks
that need it. This is similar to the bugs shown below.

  [upstream] INFO: task hung in fsnotify_mark_destroy_workfn
  https://syzkaller.appspot.com/bug?id=0e75779a6f0faac461510c6330514e8f0e893038

  [upstream] INFO: task hung in fsnotify_connector_destroy_workfn
  https://syzkaller.appspot.com/bug?id=aa11d2d767f3750ef9a40d156a149e9cfa735b73

Continuing to call printk() until khungtaskd fires is stupid behavior.
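
To make the two ideas in this thread concrete (breaking the lock every N
tasks, and giving up once a time budget is spent), here is a rough userspace
sketch in C. It is not kernel code and not the proposed patch: struct
fake_task, struct dump_stats, dump_tasks_bounded() and fake_clock() are all
hypothetical names invented for illustration, and the clock is injected so
that the behavior is deterministic. The comment marks roughly where a kernel
version might drop and re-take the RCU read lock.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-task data that a task dump prints. */
struct fake_task {
    int pid;
    const char *comm;
};

/* Hypothetical counters describing what one bounded dump did. */
struct dump_stats {
    size_t printed;     /* entries actually emitted */
    size_t lock_breaks; /* times the simulated lock was dropped */
    bool truncated;     /* true if the dump stopped before the end */
};

/* Injected clock so the sketch does not depend on wall time. */
typedef unsigned long (*clock_fn)(void);

/* Hypothetical deterministic clock: advances one tick per call. */
unsigned long fake_ticks;
unsigned long fake_clock(void)
{
    return fake_ticks++;
}

/*
 * Print up to n tasks. After every `batch` entries, simulate a lock
 * break and check the elapsed time; stop once `budget` ticks have passed.
 * A batch of 0 disables both the lock break and the budget check.
 */
struct dump_stats dump_tasks_bounded(const struct fake_task *tasks, size_t n,
                                     size_t batch, unsigned long budget,
                                     clock_fn now, FILE *out)
{
    struct dump_stats st = {0, 0, false};
    unsigned long start = now();

    for (size_t i = 0; i < n; i++) {
        if (i > 0 && batch > 0 && i % batch == 0) {
            /*
             * Assumption: a kernel version would do something like
             * rcu_read_unlock(); cond_resched(); rcu_read_lock(); here,
             * and would also have to revalidate its list position.
             */
            st.lock_breaks++;
            if (now() - start >= budget) {
                st.truncated = true;
                break;
            }
        }
        fprintf(out, "[%7d] %s\n", tasks[i].pid, tasks[i].comm);
        st.printed++;
    }
    return st;
}
```

With 10 tasks, a batch of 3 and a generous budget, the whole list is printed
with three lock breaks. With a budget of 2 ticks and the one-tick-per-call
clock above, the dump stops after 6 entries and reports itself as truncated,
which is exactly the incomplete-list case that Michal objects to.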