linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Shawn Landden <slandden@gmail.com>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-api@vger.kernel.org, mhocko@kernel.org,
	willy@infradead.org
Subject: Re: [RFC v4] It is common for services to be stateless around their main event loop. If a process sets PR_SET_IDLE to PR_IDLE_MODE_KILLME then it signals to the kernel that epoll_wait() and friends may not complete, and the kernel may send SIGKILL if resources get tight.
Date: Tue, 21 Nov 2017 10:14:41 +0100 (CET)	[thread overview]
Message-ID: <alpine.DEB.2.20.1711210948530.1782@nanos> (raw)
In-Reply-To: <20171121051639.1228-1-slandden@gmail.com>

On Mon, 20 Nov 2017, Shawn Landden wrote:

Please use a short and comprehensible subject line and do not pack a full
sentence into it. The sentence wants to be in the change log body.

> +static DECLARE_WAIT_QUEUE_HEAD(oom_target);
> +
> +/* Clean up after a EPOLL_KILLME process quits.
> + * Called by kernel/exit.c.

It's hardly called by kernel/exit.c and aside of that multi line comments
are formatted like this:

/*
 * ....
 * ....
 */

> + */
> +void exit_oom_target(void)
> +{
> +	DECLARE_WAITQUEUE(wait, current);
> +
> +	remove_wait_queue(&oom_target, &wait);

This is completely pointless, really. It does:

     	INIT_LIST_HEAD(&wait.entry);

	spin_lock_irqsave(&oom_target->lock, flags);
	list_del(&wait->entry);
	spin_lock_irqrestore(&oom_target->lock, flags);

IOW. It's a NOOP. What are you trying to achieve?

> +}
> +
> +inline struct wait_queue_head *oom_target_get_wait()
> +{
> +	return &oom_target;

This wrapper is useless.

> +}
> +
>  #ifdef CONFIG_NUMA
>  /**
>   * has_intersects_mems_allowed() - check task eligiblity for kill
> @@ -994,6 +1013,18 @@ int unregister_oom_notifier(struct notifier_block *nb)
>  }
>  EXPORT_SYMBOL_GPL(unregister_oom_notifier);
>  
> +int oom_target_callback(wait_queue_entry_t *wait, unsigned mode, int sync, void *key)
> +{
> +	struct task_struct *ts = wait->private;
> +
> +	/* We use SIGKILL instead of the oom killer
> +	 * so as to cleanly interrupt ep_poll()

Huch? oom_killer uses SIGKILL as well, it just does it correctly.

> +	 */
> +	pr_debug("Killing pid %u from prctl(PR_SET_IDLE) death row.\n", ts->pid);
> +	send_sig(SIGKILL, ts, 1);
> +	return 0;
> +}
> +
>  /**
>   * out_of_memory - kill the "best" process when we run out of memory
>   * @oc: pointer to struct oom_control
> @@ -1007,6 +1038,7 @@ bool out_of_memory(struct oom_control *oc)
>  {
>  	unsigned long freed = 0;
>  	enum oom_constraint constraint = CONSTRAINT_NONE;
> +	wait_queue_head_t *w;
>  
>  	if (oom_killer_disabled)
>  		return false;
> @@ -1056,6 +1088,17 @@ bool out_of_memory(struct oom_control *oc)
>  		return true;
>  	}
>  
> +	/*
> +	 * Check death row for current memcg or global.
> +	 */
> +	if (!is_memcg_oom(oc)) {
> +		w = oom_target_get_wait();
> +		if (waitqueue_active(w)) {
> +			wake_up(w);
> +			return true;
> +		}
> +	}

Why on earth do you need that extra wait_queue magic?

You completely fail to explain in your empty changelog why the existing
oom hinting infrastructure is not sufficient.

If you can explain why, then there is no reason to have this side
channel. Extend/fix the current hinting mechanism and be done with it.

Thanks,

	tglx

  parent reply	other threads:[~2017-11-21  9:14 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-01  5:32 [RFC] EPOLL_KILLME: New flag to epoll_wait() that subscribes process to death row (new syscall) Shawn Landden
2017-11-01 14:04 ` Matthew Wilcox
2017-11-01 15:16 ` Colin Walters
2017-11-01 15:22   ` Colin Walters
2017-11-03  9:22     ` peter enderborg
     [not found]   ` <CA+49okox_Hvg-dGyjZc3u0qLz1S=LJjS4-WT6SxQ9qfPyp6BjQ@mail.gmail.com>
2017-11-01 19:37     ` Colin Walters
2017-11-02 15:24       ` Shawn Paul Landden
2017-11-01 22:10 ` Tetsuo Handa
2017-11-02 15:45 ` Michal Hocko
2017-11-03  6:35 ` [RFC v2] prctl: prctl(PR_SET_IDLE, PR_IDLE_MODE_KILLME), for stateless idle loops Shawn Landden
2017-11-03  9:09   ` Michal Hocko
2017-11-18 20:33     ` Shawn Landden
     [not found]     ` <CA+49okqZ8CME0EN1xS_cCTc5Q-fGRreg0makhzNNuRpGs3mjfw@mail.gmail.com>
2017-11-19  4:19       ` Matthew Wilcox
2017-11-20  8:35       ` Michal Hocko
2017-11-21  4:48         ` Shawn Landden
2017-11-21  7:05           ` Michal Hocko
2017-11-15 21:11   ` Pavel Machek
2017-11-21  4:49   ` [RFC v3] It is common for services to be stateless around their main event loop. If a process sets PR_SET_IDLE to PR_IDLE_MODE_KILLME then it signals to the kernel that epoll_wait() and friends may not complete, and the kernel may send SIGKILL if resources get tight Shawn Landden
2017-11-21  4:56     ` Shawn Landden
2017-11-21  5:16     ` [RFC v4] " Shawn Landden
2017-11-21  5:26       ` Shawn Landden
2017-11-21  9:14       ` Thomas Gleixner [this message]
2017-11-22 10:29   ` [RFC v2] prctl: prctl(PR_SET_IDLE, PR_IDLE_MODE_KILLME), for stateless idle loops peter enderborg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.20.1711210948530.1782@nanos \
    --to=tglx@linutronix.de \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=slandden@gmail.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).