linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Anton Vorontsov <anton.vorontsov@linaro.org>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Oleg Nesterov" <oleg@redhat.com>,
	"Arve Hjønnevåg" <arve@android.com>,
	"KOSAKI Motohiro" <kosaki.motohiro@gmail.com>,
	"Greg Kroah-Hartman" <gregkh@suse.de>,
	"San Mehat" <san@google.com>, "Colin Cross" <ccross@android.com>,
	linux-kernel@vger.kernel.org, kernel-team@android.com,
	linaro-kernel@lists.linaro.org
Subject: Re: [PATCH 1/3] procfs: Export next_tgid(), move it to kernel/pid.c
Date: Wed, 1 Feb 2012 08:19:17 +0400	[thread overview]
Message-ID: <20120201041917.GA22369@oksana.dev.rtsoft.ru> (raw)
In-Reply-To: <m1r4ygfxlz.fsf@fess.ebiederm.org>

On Mon, Jan 30, 2012 at 05:51:20PM -0800, Eric W. Biederman wrote:
[...]
> > Yes, in LMK driver we don't need to be accurate. I probably could use
> > rcu_read_lock, but the plan was in not holding any global locks (in
> > this case the rcu) at all, instead I'd like to hold just a reference
> > of the task, which the driver is analyzing at this time. Once we decide
> > (to kill or not to kill the task), we either send a signal (and drop
> > the reference) or just drop the reference.
> 
> rcu_read_lock unless it is implemented wrong is free from a lock
> perspective.  rcu_read_lock only touches local state.
> 
> >From the look of your loop it already does a walk through the entire
> process list so it looks to me like playing games with get_task_struct
> and put_task_struct are going to be much more expensive.
> 
> proc grabs task references because we can't hold the rcu_read_lock
> over a copy_to_user because that is a sleeping function.
> 
> You don't call anything that sleeps so rcu_read_lock should be
> sufficient.

I'll just repeat what I told to Paul E. McKenney:

[...] the locking part wasn't my concern at all. As I said before,
LMK (low memory killer) itself is not important, and we don't care
about its overhead, unless it blocks another kernel activity --
which is my main concern.

So, reader part is not interesting in sense of overhead or
efficiency.

The interesting questions are:

1. Can the kernel create processes while LMK traverses the list?
2. Can the kernel free processes while LMK traverses the list?

Looking into kernel/fork.c:copy_process(), it does this:

- Takes a write lock on tasklist_lock;
- Uses list_add_tail_rcu() to add a task.

So, with current LMK driver (it grabs the tasklist_lock), it seems
that the kernel won't able to create processes while LMK traverse the
tasks.

Looking into kernel/exit.c:release_task(), it does this:

- Takes a write lock on tasklist_lock;
- Deletes the task from the list using list_del_rcu()
- Releases tasklist_lock;
- Issues call_rcu(&p->rcu, delayed_put_task_struct), which
  then actually completely frees the task;

So, with the current LMK driver, kernel won't able to release
processes while LMK traverse the processes, because LMK takes
the tasklist_lock.

By using rcu_read_lock() we would solve "1.", i.e. kernel will able
to create processes, but we still won't able to free processes (well,
for the most part we will, except that we'll only free memory after
LMK finishes its traverse).

(I still may be wrong in my findings, please correct me if so.)


Note the these aren't huge or catastrophic issues or something alike.
It is just that during LMK review people pointed out that grabbing
the tasklist lock for LMK is 'too much', and we'd better find a
better way.

rcu_read_lock() might be a good compromise.

Thanks,

-- 
Anton Vorontsov
Email: cbouatmailru@gmail.com

  reply	other threads:[~2012-02-01  4:19 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-30  1:13 [PATCH 1/3] procfs: Export next_tgid(), move it to kernel/pid.c Anton Vorontsov
2012-01-30  2:22 ` Greg KH
2012-01-30 19:50   ` Anton Vorontsov
2012-01-30  3:26 ` Eric W. Biederman
2012-01-30 19:51   ` Anton Vorontsov
2012-01-30 13:43 ` Oleg Nesterov
2012-01-30 20:49   ` Anton Vorontsov
2012-01-31  1:51     ` Eric W. Biederman
2012-02-01  4:19       ` Anton Vorontsov [this message]
2012-02-01  4:37         ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120201041917.GA22369@oksana.dev.rtsoft.ru \
    --to=anton.vorontsov@linaro.org \
    --cc=arve@android.com \
    --cc=ccross@android.com \
    --cc=ebiederm@xmission.com \
    --cc=gregkh@suse.de \
    --cc=kernel-team@android.com \
    --cc=kosaki.motohiro@gmail.com \
    --cc=linaro-kernel@lists.linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=san@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).