linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Zhang Qiao <zhangqiao22@huawei.com>,
	lkml <linux-kernel@vger.kernel.org>,
	keescook@chromium.org, Peter Zijlstra <peterz@infradead.org>,
	elver@google.com, legion@kernel.org, oleg@redhat.com,
	brauner@kernel.org
Subject: Re: Question about kill a process group
Date: Thu, 12 May 2022 00:53:16 +0200	[thread overview]
Message-ID: <87bkw3y943.ffs@tglx> (raw)
In-Reply-To: <87a6bnudfy.fsf@email.froward.int.ebiederm.org>

On Wed, May 11 2022 at 13:33, Eric W. Biederman wrote:
> Thomas Gleixner <tglx@linutronix.de> writes:
>> So unless the number of PIDs for a user is limited this _is_ an
>> unpriviledged DoS vector.
>
> After having slept on this a bit it finally occurred to me the
> semi-obvious solution to this issue is to convert tasklist_lock
> from a rw-spinlock to rw-semaphore.  The challenge is finding
> the users (tty layer?) that generate signals from interrupt
> context and redirect that signal generation.

From my outdated notes where I looked at this before:

[soft]interrupt context which acquires tasklist_lock:
sysrq-e		send_sig_all()
sysrq-i         send_sig_all()
sysrq-n		normalize_rt_tasks()

tasklist_lock nesting into other locks:
   fs/fcntl.c: send_sigio(), send_sigurg()

   send_sigurg() is called from the network stack ...

Some very obscure stuff in arch/ia64/kernel/mca.c which is called
from a DIE notifier.

Plus quite a bunch of read_lock() instances which nest inside
rcu_read_lock() held sections.

This is probably incomplete, but the scope of the problem has been
greatly reduced vs. the point where I looked at it last time a couple of
years ago. But that's still a herculean task.

> Once signals holding tasklist_lock are no longer generated from
> interrupt context irqs no longer need to be disabled and
> after verifying tasklist_lock isn't held under any other spinlocks
> it can be converted to a semaphore.

Going to take a while. :)

> It won't help the signal delivery times, but it should reduce
> the effect on the rest of the system, and prevent watchdogs from
> firing.

The signal delivery time itself is the least of the worries, but this
still prevents any other operations which require tasklist_lock from
making progress for quite some time, i.e. fork/exec for unrelated
processes/users will have to wait too. So you converted the 'visible'
DoS to an 'invisible' one.

The real problem is that the scope of tasklist_lock is too broad for
most use cases. That does not change when you actually can convert it to
a rwsem. The underlying problem still persists.

Let's take a step back and look what most sane use cases (sysrq-* is not
in that category) require:

   Preventing that tasks are added or removed

Do they require that any task is added or removed? No.

   They require to prevent add/remove for the intended scope.

That's the thing we need to focus on: reducing the protection scope.

If we can segment the protection for the required scope of e.g. kill(2)
then we still can let unrelated processes/tasks make progress and just
inflict the damage on the affected portion of processes/tasks.

For example:

	read_lock(&tasklist_lock);
	for_each_process(p) {
		if (task_pid_vnr(p) > 1 &&
		    !same_thread_group(p, current)) {

			group_send_sig_info(...., p);
		}
	}
	read_unlock(&tasklist_lock);

same_thread_group() does:

   return p->signal == current->signal;

Ideally we can do:

	read_lock(&tasklist_lock);
        prevent_add_remove(current->signal);
	read_unlock(&tasklist_lock);

        rcu_read_lock();
	for_each_process(p) {
		if (task_pid_vnr(p) > 1 &&
		    !same_thread_group(p, current)) {

			group_send_sig_info(...., p);
		}
	}
        rcu_read_unlock();

        allow_add_remove(current->signal);

Where prevent_add_remove() sets a state which has to be waited for to be
cleared by anything which wants to add/remove a task in that scope or
change $relatedtask->signal until allow_add_remove() removes that
blocker. I'm sure it's way more complicated, but you get the idea.

If we find a solution to this scope reduction problem, then it will not
only squash the issue which started this discussion. This will have a
benefit in general.

Thanks,

        tglx

  reply	other threads:[~2022-05-11 22:53 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-29  8:07 Question about kill a process group Zhang Qiao
2022-04-02  2:22 ` Zhang Qiao
2022-04-13  1:56   ` Zhang Qiao
2022-04-13 15:47     ` Eric W. Biederman
2022-04-14 11:40       ` Zhang Qiao
2022-04-21 16:12         ` Eric W. Biederman
2022-04-28  2:05           ` Zhang Qiao
2022-04-28 12:33           ` Thomas Gleixner
2022-05-11 18:33             ` Eric W. Biederman
2022-05-11 22:53               ` Thomas Gleixner [this message]
2022-05-12 18:23                 ` Eric W. Biederman
2022-09-26  7:32                   ` Zhang Qiao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bkw3y943.ffs@tglx \
    --to=tglx@linutronix.de \
    --cc=brauner@kernel.org \
    --cc=ebiederm@xmission.com \
    --cc=elver@google.com \
    --cc=keescook@chromium.org \
    --cc=legion@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=zhangqiao22@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).