linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: "Oleg Nesterov" <oleg@redhat.com>,
	"Arve Hjønnevåg" <arve@android.com>,
	"KOSAKI Motohiro" <kosaki.motohiro@gmail.com>,
	"Greg Kroah-Hartman" <gregkh@suse.de>,
	"San Mehat" <san@google.com>, "Colin Cross" <ccross@android.com>,
	linux-kernel@vger.kernel.org, kernel-team@android.com,
	linaro-kernel@lists.linaro.org
Subject: Re: [PATCH 1/3] procfs: Export next_tgid(), move it to kernel/pid.c
Date: Tue, 31 Jan 2012 20:37:00 -0800	[thread overview]
Message-ID: <m1ipjr2mqb.fsf@fess.ebiederm.org> (raw)
In-Reply-To: <20120201041917.GA22369@oksana.dev.rtsoft.ru> (Anton Vorontsov's message of "Wed, 1 Feb 2012 08:19:17 +0400")

Anton Vorontsov <anton.vorontsov@linaro.org> writes:

> On Mon, Jan 30, 2012 at 05:51:20PM -0800, Eric W. Biederman wrote:
> [...]
>> > Yes, in LMK driver we don't need to be accurate. I probably could use
>> > rcu_read_lock, but the plan was in not holding any global locks (in
>> > this case the rcu) at all, instead I'd like to hold just a reference
>> > of the task, which the driver is analyzing at this time. Once we decide
>> > (to kill or not to kill the task), we either send a signal (and drop
>> > the reference) or just drop the reference.
>> 
>> rcu_read_lock unless it is implemented wrong is free from a lock
>> perspective.  rcu_read_lock only touches local state.
>> 
>> >From the look of your loop it already does a walk through the entire
>> process list so it looks to me like playing games with get_task_struct
>> and put_task_struct are going to be much more expensive.
>> 
>> proc grabs task references because we can't hold the rcu_read_lock
>> over a copy_to_user because that is a sleeping function.
>> 
>> You don't call anything that sleeps so rcu_read_lock should be
>> sufficient.
>
> I'll just repeat what I told to Paul E. McKenney:
>
> [...] the locking part wasn't my concern at all. As I said before,
> LMK (low memory killer) itself is not important, and we don't care
> about its overhead, unless it blocks another kernel activity --
> which is my main concern.
>
> So, reader part is not interesting in sense of overhead or
> efficiency.
>
> The interesting questions are:
>
> 1. Can the kernel create processes while LMK traverses the list?
> 2. Can the kernel free processes while LMK traverses the list?
>
> Looking into kernel/fork.c:copy_process(), it does this:
>
> - Takes a write lock on tasklist_lock;
> - Uses list_add_tail_rcu() to add a task.
>
> So, with current LMK driver (it grabs the tasklist_lock), it seems
> that the kernel won't able to create processes while LMK traverse the
> tasks.
>
> Looking into kernel/exit.c:release_task(), it does this:
>
> - Takes a write lock on tasklist_lock;
> - Deletes the task from the list using list_del_rcu()
> - Releases tasklist_lock;
> - Issues call_rcu(&p->rcu, delayed_put_task_struct), which
>   then actually completely frees the task;
>
> So, with the current LMK driver, kernel won't able to release
> processes while LMK traverse the processes, because LMK takes
> the tasklist_lock.
>
> By using rcu_read_lock() we would solve "1.", i.e. kernel will able
> to create processes, but we still won't able to free processes (well,
> for the most part we will, except that we'll only free memory after
> LMK finishes its traverse).


Correct. We will only free the task_struct after you release the
rcu_read_lock. Many of the other resources are freed before the
task_struct.  So most of the memory of a process should be freeable
even with the rcu_read_lock held.

Eric

      reply	other threads:[~2012-02-01  4:34 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-30  1:13 [PATCH 1/3] procfs: Export next_tgid(), move it to kernel/pid.c Anton Vorontsov
2012-01-30  2:22 ` Greg KH
2012-01-30 19:50   ` Anton Vorontsov
2012-01-30  3:26 ` Eric W. Biederman
2012-01-30 19:51   ` Anton Vorontsov
2012-01-30 13:43 ` Oleg Nesterov
2012-01-30 20:49   ` Anton Vorontsov
2012-01-31  1:51     ` Eric W. Biederman
2012-02-01  4:19       ` Anton Vorontsov
2012-02-01  4:37         ` Eric W. Biederman [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=m1ipjr2mqb.fsf@fess.ebiederm.org \
    --to=ebiederm@xmission.com \
    --cc=anton.vorontsov@linaro.org \
    --cc=arve@android.com \
    --cc=ccross@android.com \
    --cc=gregkh@suse.de \
    --cc=kernel-team@android.com \
    --cc=kosaki.motohiro@gmail.com \
    --cc=linaro-kernel@lists.linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=san@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).