linux-kernel.vger.kernel.org archive mirror
From: <Peter.Enderborg@sony.com>
To: <christophe.leroy@csgroup.eu>, <wim@linux-watchdog.org>,
	<linux@roeck-us.net>, <akpm@linux-foundation.org>,
	<linux-watchdog@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-mm@kvack.org>, <shakeelb@google.com>
Subject: Re: [RFC PATCH] watchdog: Adding softwatchdog
Date: Sat, 24 Apr 2021 13:04:13 +0000
Message-ID: <d5db6789-78a6-3e48-6383-190c4696e682@sony.com>
In-Reply-To: <ac949d08-72ff-edf6-6526-fdc9ad602631@csgroup.eu>

On 4/24/21 2:21 PM, Christophe Leroy wrote:
>
>
> On 24/04/2021 at 12:25, Peter Enderborg wrote:
>> This is not a rebooting watchdog. Its function is to take other
>> actions than a hard reboot. On many complex systems there is some
>> kind of manager that monitors and takes action on slow systems.
>> Android has its lowmemorykiller (lmkd), desktops have earlyoom.
>> This watchdog can be used to help the monitor perform some basic
>> action to keep the monitor running.
>>
>> It can also be used standalone. This adds a policy that kills
>> the process with the highest oom_score_adj, using
>> oom functions to do it quickly. I think it is a good use case
>> for the patch. Memory situations can be problematic for
>> software that monitors the system, but other policies
>> should also be possible, like picking tasks from a memcg, or
>> specific UIDs, or whatever is low priority.
>
>
> I'm not sure I understand the reasoning behind the choice of oom logic to decide which task to kill.
>
This is not using the OOM logic to pick a task to kill; it is using the OOM functions to free resources fast.

The OOM killer is also too slow, so there are userspace solutions that start removing processes before the system starts to slow down.

On Ubuntu and Fedora, for example, a process called earlyoom is running. On Android there is lmkd. However,
allocations can grow very fast, for example when starting a camera. What can then happen is that the service that
is there to remove applications that are not needed gets starved. It does a lot of operations that need
memory, and because of this it also gets slow. In the worst case this can cause an OOM. The OOM killer kills things randomly, and
it will cause an Android phone to reboot if it kills the wrong things. When the service gets slow it can't kick the watchdog, and
we can then free up resources from within the kernel. To get the current version to work, very high margins are needed, wasting
a lot of memory to be "safe".
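
As a rough userspace illustration of the standalone policy from the patch description (kill the process with the highest oom_score_adj), the selection step might look like the sketch below. This is not the patch code: `Task` and `pick_victim` are hypothetical names, and skipping tasks at -1000 is an assumption made here because that value exempts a task from OOM killing.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Tasks with this oom_score_adj are never chosen by the kernel OOM killer.
OOM_SCORE_ADJ_MIN = -1000


@dataclass(frozen=True)
class Task:
    """Hypothetical record for one process; not a kernel structure."""
    pid: int
    oom_score_adj: int


def pick_victim(tasks: Iterable[Task]) -> Optional[Task]:
    """Return the task with the highest oom_score_adj, or None if none qualify."""
    # Assumption for this sketch: exempt tasks are skipped entirely.
    candidates = [t for t in tasks if t.oom_score_adj > OOM_SCORE_ADJ_MIN]
    if not candidates:
        return None
    return max(candidates, key=lambda t: t.oom_score_adj)
```

A different policy (tasks from a memcg, specific UIDs) would only need a different filter or key function in place of the one used here.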


> Usually a watchdog will detect if a task is using 100% of the CPU time. If such a task exists, it is the one running, not another one that has a huge amount of memory allocated but spends like 1% of CPU time.
>
A watchdog detects that you do not feed it.
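
To make the "feed" idea concrete, here is a minimal userspace model, assuming a simple deadline that each feed pushes forward. `SoftWatchdog` and its methods are hypothetical names for illustration and say nothing about how the patch implements its timer.

```python
import time


class SoftWatchdog:
    """Hypothetical model of a watchdog that must be fed before a timeout."""

    def __init__(self, timeout_s: float, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock  # injectable so the model can be tested without sleeping
        self._deadline = clock() + timeout_s

    def feed(self) -> None:
        """Push the deadline forward; the monitor calls this while it is healthy."""
        self._deadline = self._clock() + self.timeout_s

    def expired(self) -> bool:
        """True once the monitor has failed to feed within the timeout."""
        return self._clock() >= self._deadline
```

The point of the model is that expiry says nothing about which task is at fault, only that the monitor did not get to run in time; what to do next is left to the policy.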
> So if there is a task to kill by a watchdog, I would say it is the current task.


Current task? We usually have many CPUs. But the idea is that you should be able to easily write a policy for that, if that is what you want.


>
>
>
> Another remark: you are using regular timers as far as I understand. I remember having problems with that in the past, it required the use of hrtimers. I can't remember the details exactly but you can look at
> commit https://github.com/linuxppc/linux/commit/1ff688209


That I definitely need to look into.


> Christophe


