All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	rientjes@google.com, linux-mm@kvack.org, oleg@redhat.com
Subject: Re: [PATCH v3] mm,oom: speed up select_bad_process() loop.
Date: Thu, 19 May 2016 08:53:29 +0200	[thread overview]
Message-ID: <20160519065329.GA26110@dhcp22.suse.cz> (raw)
In-Reply-To: <20160518140932.6643b963e8d3fc49ff64df8d@linux-foundation.org>

On Wed 18-05-16 14:09:32, Andrew Morton wrote:
> On Wed, 18 May 2016 16:15:45 +0200 Michal Hocko <mhocko@kernel.org> wrote:
> 
> > > This patch adds a counter to signal_struct for tracking how many
> > > TIF_MEMDIE threads are in a given thread group, and check it at
> > > oom_scan_process_thread() so that select_bad_process() can use
> > > for_each_process() rather than for_each_process_thread().
> > 
> > OK, this looks correct. Strictly speaking the patch is missing any note
> > on _why_ this is needed or an improvement. I would add something like
> > the following:
> > "
> > Although the original code was correct it was quite inefficient because
> > each thread group was scanned num_threads times which can be a lot
> > especially with processes with many threads. Even though the OOM is
> > extremely cold path it is always good to be as effective as possible
> > when we are inside rcu_read_lock() - aka unpreemptible context.
> > "
> 
> This sounds quite rubbery to me.  Lots of code calls
> for_each_process_thread() and presumably that isn't causing problems. 

Yeah, many paths call for_each_process_thread but they are
O(num_threads) while this is O(num_threads^2).

> We're bloating up the signal_struct to solve some problem on a
> rarely-called slowpath with no evidence that there is actually a
> problem to be solved.

signal_struct has some holes[1] so we can stitch it there. Long term I
would like to to move this logic into the mm_struct, it would be just
larger surgery I guess.

[1]

die__process_function: tag not supported (INVALID)!
struct signal_struct {
	atomic_t                   sigcnt;               /*     0     4 */
	atomic_t                   live;                 /*     4     4 */
	int                        nr_threads;           /*     8     4 */

	/* XXX 4 bytes hole, try to pack */

	struct list_head           thread_head;          /*    16    16 */
	wait_queue_head_t          wait_chldexit;        /*    32    24 */
	struct task_struct *       curr_target;          /*    56     8 */
	/* --- cacheline 1 boundary (64 bytes) --- */
	struct sigpending          shared_pending;       /*    64    24 */
	int                        group_exit_code;      /*    88     4 */
	int                        notify_count;         /*    92     4 */
	struct task_struct *       group_exit_task;      /*    96     8 */
	int                        group_stop_count;     /*   104     4 */
	unsigned int               flags;                /*   108     4 */
	unsigned int               is_child_subreaper:1; /*   112:31  4 */
	unsigned int               has_child_subreaper:1; /*   112:30  4 */

	/* XXX 30 bits hole, try to pack */

	int                        posix_timer_id;       /*   116     4 */
	struct list_head           posix_timers;         /*   120    16 */
	/* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
	struct hrtimer             real_timer;           /*   136    88 */
	/* --- cacheline 3 boundary (192 bytes) was 32 bytes ago --- */
	struct pid *               leader_pid;           /*   224     8 */
	ktime_t                    it_real_incr;         /*   232     8 */
	struct cpu_itimer          it[2];                /*   240    48 */
	/* --- cacheline 4 boundary (256 bytes) was 32 bytes ago --- */
	struct thread_group_cputimer cputimer;           /*   288    32 */
	/* --- cacheline 5 boundary (320 bytes) --- */
	struct task_cputime        cputime_expires;      /*   320    24 */
	atomic_t                   tick_dep_mask;        /*   344     4 */

	/* XXX 4 bytes hole, try to pack */

	struct list_head           cpu_timers[3];        /*   352    48 */
	/* --- cacheline 6 boundary (384 bytes) was 16 bytes ago --- */
	struct pid *               tty_old_pgrp;         /*   400     8 */
	int                        leader;               /*   408     4 */

	/* XXX 4 bytes hole, try to pack */

	struct tty_struct *        tty;                  /*   416     8 */
	struct autogroup *         autogroup;            /*   424     8 */
	seqlock_t                  stats_lock;           /*   432     8 */
	cputime_t                  utime;                /*   440     8 */
	/* --- cacheline 7 boundary (448 bytes) --- */
	cputime_t                  stime;                /*   448     8 */
	cputime_t                  cutime;               /*   456     8 */
	cputime_t                  cstime;               /*   464     8 */
	cputime_t                  gtime;                /*   472     8 */
	cputime_t                  cgtime;               /*   480     8 */
	struct prev_cputime        prev_cputime;         /*   488    24 */
	/* --- cacheline 8 boundary (512 bytes) --- */
	long unsigned int          nvcsw;                /*   512     8 */
	long unsigned int          nivcsw;               /*   520     8 */
	long unsigned int          cnvcsw;               /*   528     8 */
	long unsigned int          cnivcsw;              /*   536     8 */
	long unsigned int          min_flt;              /*   544     8 */
	long unsigned int          maj_flt;              /*   552     8 */
	long unsigned int          cmin_flt;             /*   560     8 */
	long unsigned int          cmaj_flt;             /*   568     8 */
	/* --- cacheline 9 boundary (576 bytes) --- */
	long unsigned int          inblock;              /*   576     8 */
	long unsigned int          oublock;              /*   584     8 */
	long unsigned int          cinblock;             /*   592     8 */
	long unsigned int          coublock;             /*   600     8 */
	long unsigned int          maxrss;               /*   608     8 */
	long unsigned int          cmaxrss;              /*   616     8 */
	struct task_io_accounting  ioac;                 /*   624    56 */
	/* --- cacheline 10 boundary (640 bytes) was 40 bytes ago --- */
	long long unsigned int     sum_sched_runtime;    /*   680     8 */
	struct rlimit              rlim[16];             /*   688   256 */
	/* --- cacheline 14 boundary (896 bytes) was 48 bytes ago --- */
	struct taskstats *         stats;                /*   944     8 */
	oom_flags_t                oom_flags;            /*   952     4 */
	short int                  oom_score_adj;        /*   956     2 */
	short int                  oom_score_adj_min;    /*   958     2 */
	/* --- cacheline 15 boundary (960 bytes) --- */
	struct mutex               cred_guard_mutex;     /*   960    40 */

	/* size: 1000, cachelines: 16, members: 58 */
	/* sum members: 988, holes: 3, sum holes: 12 */
	/* bit holes: 1, sum bit holes: 30 bits */
	/* last cacheline: 40 bytes */
};
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2016-05-19  6:53 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-18 12:20 [PATCH v2] mm,oom: speed up select_bad_process() loop Tetsuo Handa
2016-05-18 12:51 ` Michal Hocko
2016-05-18 13:30   ` [PATCH v3] " Tetsuo Handa
2016-05-18 14:15     ` Michal Hocko
2016-05-18 21:09       ` Andrew Morton
2016-05-19  6:53         ` Michal Hocko [this message]
2016-05-19  7:17           ` Michal Hocko
2016-05-19  8:17             ` Michal Hocko
2016-05-20  1:50           ` Oleg Nesterov
2016-05-20  2:13             ` Oleg Nesterov
2016-05-20  6:42             ` Michal Hocko
2016-05-20 22:04               ` Oleg Nesterov
2016-05-18 14:44     ` Tetsuo Handa
2016-05-20  7:50     ` Michal Hocko
2016-05-20 11:51       ` Tetsuo Handa
2016-05-20 12:09         ` Michal Hocko
2016-05-20 13:41           ` Tetsuo Handa
2016-05-20 13:44             ` Tetsuo Handa
2016-05-20 15:23             ` Michal Hocko
2016-05-20 15:56               ` Tetsuo Handa
2016-05-23  7:55                 ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160519065329.GA26110@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=linux-mm@kvack.org \
    --cc=oleg@redhat.com \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.