kernelnewbies.kernelnewbies.org archive mirror
 help / color / mirror / Atom feed
* Read the "real_parent" field of task_struct
@ 2020-09-25 16:11 John Wood
  2020-09-30 11:59 ` Valdis Klētnieks
  0 siblings, 1 reply; 6+ messages in thread
From: John Wood @ 2020-09-25 16:11 UTC (permalink / raw)
  To: kernelnewbies; +Cc: John Wood

Hi,

I'm working in a LSM that uses the task_alloc hook to do some work. This
hook needs to check the "real_parent" field hold by the task_struct
structure. I'm very confused since navigating the source code I see many
different ways to access this field. I don't understand why every method
is used in each case. So I don't know how to implement this access in my
LSM in a secure way.

The real_parent field is defined in the task_struct structure as:

struct task_struct __rcu	*real_parent;

So, as far I can understand this pointer uses the "Read Copy Update"
feature.

Below I show some examples of different access:

--------------------------------------------------------------------------
Example 1
--------------------------------------------------------------------------

void proc_fork_connector(struct task_struct *task)
{
	[...]
	rcu_read_lock();
	parent = rcu_dereference(task->real_parent);
	ev->event_data.fork.parent_pid = parent->pid;
	ev->event_data.fork.parent_tgid = parent->tgid;
	rcu_read_unlock();
	[...]
}

Here to access the real_parent field the code uses rcu_dereference inside
the rcu_read_lock/rcu_read_unlock block.

--------------------------------------------------------------------------
Example 2
--------------------------------------------------------------------------

static struct pid *
get_children_pid(struct inode *inode, struct pid *pid_prev, loff_t pos)
{
	[...]
	read_lock(&tasklist_lock);
	[...]
		if (task && task->real_parent == start &&
		    !(list_empty(&task->sibling))) {
	[...]
	read_unlock(&tasklist_lock);
	[...]
}

Here to access the real_parent field the code reads the pointer directly
inside the read_lock/read_unlock block. No rcu block needed? Why is not
used the rcu_dereference function?

--------------------------------------------------------------------------
Example 3
--------------------------------------------------------------------------

struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
	[...]
	.real_parent	= &init_task,
	.parent		= &init_task,
	[...]
	RCU_POINTER_INITIALIZER(real_cred, &init_cred),
	RCU_POINTER_INITIALIZER(cred, &init_cred),
	[...]
};

Here the initialization is directly. If the pointer is declared __rcu, the
assigment to the real_parent should be with RCU_POINTER_INITIALIZER macro?

--------------------------------------------------------------------------
Example 4
--------------------------------------------------------------------------

static void forget_original_parent(struct task_struct *father,
					struct list_head *dead)
{
	[...]
			RCU_INIT_POINTER(t->real_parent, reaper);
	[...]
}

Here the initialization uses the RCU_INIT_POINTER macro. It's not directly.

--------------------------------------------------------------------------
Example 5
--------------------------------------------------------------------------

SYSCALL_DEFINE2(setpgid, pid_t, pid, pid_t, pgid)
{
	[...]
	rcu_read_lock();

	/* From this point forward we keep holding onto the tasklist lock
	 * so that our parent does not change from under us. -DaveM
	 */
	write_lock_irq(&tasklist_lock);
	[...]

	if (same_thread_group(p->real_parent, group_leader)) {

	[...]
	write_unlock_irq(&tasklist_lock);
	rcu_read_unlock();
	[...]
}

Here to access the real_parent field the code reads the pointer directly
inside the write_lock_irq/write_unlock_irq block nested in a rcu_read_lock/
rcu_read_unlock block.

--------------------------------------------------------------------------
Example 6
--------------------------------------------------------------------------

long keyctl_session_to_parent(void)
{
	[...]
	rcu_read_lock();
	write_lock_irq(&tasklist_lock);

	[...]
	parent = rcu_dereference_protected(me->real_parent,
					   lockdep_is_held(&tasklist_lock));

	[...]
	write_unlock_irq(&tasklist_lock);
	rcu_read_unlock();
	[...]
}

But here to access the real_parent field the code uses rcu_dereference_*
inside the write_lock_irq/write_unlock_irq block nested in a rcu_read_lock/
rcu_read_unlock block. The nested blocks are the sames that in the example 5
but the access is not directly, it uses rcu_dereference_*. Why?

Extracted from the documentation:

[1] The variant rcu_dereference_protected() can be used outside of an RCU
read-side critical section as long as the usage is protected by locks
acquired by the update-side code. This variant avoids the lockdep warning
that would happen when using (for example) rcu_dereference() without
rcu_read_lock() protection. Using rcu_dereference_protected() also has
the advantage of permitting compiler optimizations that rcu_dereference()
must prohibit. The rcu_dereference_protected() variant takes a lockdep
expression to indicate which locks must be acquired by the caller. If the
indicated protection is not provided, a lockdep splat is emitted.

Moreover, why the rcu_dereference_protected is used under a rcu block?

There are more examples but are similars to the ones showed. So my question
is how to read the "real_parent" field correctly. If I can understand all
the above examples I think I will have the knowledge to implement my LSM in
a correct way.

Any help that points me to the right direction will be greatly apreciated.
Some rules to know when and why use a method or another are welcome.

Thanks in advance. Regards,
John Wood


_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Read the "real_parent" field of task_struct
  2020-09-25 16:11 Read the "real_parent" field of task_struct John Wood
@ 2020-09-30 11:59 ` Valdis Klētnieks
  2020-10-01 17:49   ` John Wood
  0 siblings, 1 reply; 6+ messages in thread
From: Valdis Klētnieks @ 2020-09-30 11:59 UTC (permalink / raw)
  To: John Wood; +Cc: kernelnewbies


[-- Attachment #1.1: Type: text/plain, Size: 1600 bytes --]

On Fri, 25 Sep 2020 18:11:42 +0200, John Wood said:

> There are more examples but are similars to the ones showed. So my question
> is how to read the "real_parent" field correctly. If I can understand all
> the above examples I think I will have the knowledge to implement my LSM in
> a correct way.

There's multiple data structures which contain a pointer to the "real_parent"
that we're interested in.  So depending on which data structure you're working
with, you'll be following a different path to the real_parent, and depending on
what locks (if any) the calling code already has, the access method may be
different.

It's similar to giving directions to where I used to live on Long Island. If
you were coming from the Boston area, you wanted to take I-95 South, then
either the Throg's Neck or Whitestone Bridge, get to I-495 (Long Island
Expressway),  get off at a given exit, take that road south, and turn left to
enter the neighborhood from the north, but if you're coming from New Jersey,
you want to get to Staten Island, Verazanno Narrows Bridge, then Southern State
Parkway to *its* exit for that same road going north, but the right turn  to enter the
neighborhood from the south isn't the same one as if you're coming from the
north (unless you prefer driving another 3/4 of a mile to turn at the same light
as if you were coming from the north...)

You end up at the same place, even though you're using different paths to get
there.  The inside of the kernel works the same way.  Which way you get to
'real_parent' depends on what data structure you are already working with.


[-- Attachment #1.2: Type: application/pgp-signature, Size: 832 bytes --]

[-- Attachment #2: Type: text/plain, Size: 170 bytes --]

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Read the "real_parent" field of task_struct
  2020-09-30 11:59 ` Valdis Klētnieks
@ 2020-10-01 17:49   ` John Wood
  2020-10-02  0:29     ` Valdis Klētnieks
  0 siblings, 1 reply; 6+ messages in thread
From: John Wood @ 2020-10-01 17:49 UTC (permalink / raw)
  To: Valdis Klētnieks; +Cc: John Wood, kernelnewbies

On Wed, Sep 30, 2020 at 07:59:47AM -0400, Valdis Klētnieks wrote:
> On Fri, 25 Sep 2020 18:11:42 +0200, John Wood said:
>
> > There are more examples but are similars to the ones showed. So my question
> > is how to read the "real_parent" field correctly. If I can understand all
> > the above examples I think I will have the knowledge to implement my LSM in
> > a correct way.
>
> There's multiple data structures which contain a pointer to the "real_parent"
> that we're interested in.  So depending on which data structure you're working
> with, you'll be following a different path to the real_parent, and depending on
> what locks (if any) the calling code already has, the access method may be
> different.

I want to check if the real_parent field of the task_struct structure is
NULL or not. More info below.

> It's similar to giving directions to where I used to live on Long Island. If
> you were coming from the Boston area, you wanted to take I-95 South, then
> either the Throg's Neck or Whitestone Bridge, get to I-495 (Long Island
> Expressway),  get off at a given exit, take that road south, and turn left to
> enter the neighborhood from the north, but if you're coming from New Jersey,
> you want to get to Staten Island, Verazanno Narrows Bridge, then Southern State
> Parkway to *its* exit for that same road going north, but the right turn  to enter the
> neighborhood from the south isn't the same one as if you're coming from the
> north (unless you prefer driving another 3/4 of a mile to turn at the same light
> as if you were coming from the north...)
>
> You end up at the same place, even though you're using different paths to get
> there.  The inside of the kernel works the same way.  Which way you get to
> 'real_parent' depends on what data structure you are already working with.

First of all, thanks for your comments. :)

Ok, I will show some parts of my code to clarify my question, and I will add
a small description of the idea behind.

Idea: The purpose of my LSM is to detect and mitigate a fork brute force
attack. To do so, I need a hierarchy of fork processes. In other words, there
is a pointer in the task_struct structure that points to an statistics data
structure. This pointer is copied to the child process when a process forks.
This way, all the tasks that fork with the same root share the same statistics.
These statistics allow to compute the application crashing period and detect
the attack.

So, all the tasks have an statistical data (copied from its parent). When an
execve is executed a new statistical data is allocated and a new hierarchy is
created.

But there is the case when the task 0 is allocated. In this case it's not
possible to copy the parent pointer to the statistical data. The task 0 don't
have parent. In this scenario a new statistics structure need to be allocated.

The idea exposed here is not completed but it's enought to clarify my
question. I hope.

So, my question is how to read the real_parent field of task_struct in this
situation in a secure way.

Below I show some part of the code to clarify:

/**
 * Target for task_alloc hook - Fork management.
 */
static int brute_task_alloc(struct task_struct *task, unsigned long clone_flags)
{
	struct brute_stats **stats, **p_stats;
	struct task_struct *p_task;

	stats = brute_stats_ptr(task);

	/*
	   Is this read correct? I need to use rcu_dereference?
	   I need to use a rcu_read_lock/rcu_read_unlock block?
	*/
	p_task = task->real_parent;			/////// This 3 lines are
							/////// the base of
	if (likely(p_task)) {				/////// my question to
		p_stats = brute_stats_ptr(p_task);	/////// the mailing list
		if (!*p_stats)
			return -EFAULT;

		spin_lock(&(*p_stats)->lock);
		refcount_inc(&(*p_stats)->refc);
		*stats = *p_stats;
		spin_unlock(&(*p_stats)->lock);

		return 0;
	}

	/* Task 0 stats allocation */
	*stats = brute_new_stats();
	if (!*stats)
		return -ENOMEM;

	return 0;
}

More code below to clarify structures and LSM hooks:

static struct security_hook_list brute_hooks[] __lsm_ro_after_init = {
	LSM_HOOK_INIT(task_alloc, brute_task_alloc),
	LSM_HOOK_INIT(bprm_committing_creds, brute_task_execve),
	LSM_HOOK_INIT(task_free, brute_task_free),
};

static int __init brute_init(void)
{
	pr_info("Brute initialized\n");
	security_add_hooks(brute_hooks, ARRAY_SIZE(brute_hooks),
			   KBUILD_MODNAME);
	return 0;
}

DEFINE_LSM(brute) = {
	.name = KBUILD_MODNAME,
	.init = brute_init,
	.blobs = &brute_blob_sizes,
};

/**
 * struct brute_stats - Fork brute force attack statistics.
 * @lock: Lock to protect the brute_stats structure.
 * @refc: Reference counter.
 * @timestamps: Last crashes timestamps list.
 * @timestamps_size: Last crashes timestamps list size.
 *
 * This structure holds the statistical data shared by all the fork hierarchy
 * processes.
 */
struct brute_stats {
	spinlock_t lock;
	refcount_t refc;
	struct list_head timestamps;
	unsigned char timestamps_size;
};

/**
 * struct brute_timestamp - Last crashes timestamps list entry.
 * @jiffies: Crash timestamp.
 * @node: Entry list head.
 *
 * This structure holds a crash timestamp.
 */
struct brute_timestamp {
	u64 jiffies;
	struct list_head node;
};

/**
 * brute_blob_sizes - LSM blob sizes.
 *
 * To share statistical data among all the fork hierarchy processes, define a
 * pointer to the brute_stats structure as a part of the task_struct's security
 * blob.
 */
static struct lsm_blob_sizes brute_blob_sizes __lsm_ro_after_init = {
	.lbs_task = sizeof(struct brute_stats *),
};

/**
 * brute_stats_ptr() - Get the pointer to the brute_stats stucture.
 * @task: Pointer to the task that holds the statistical data.
 *
 * Return: A pointer to a pointer to the brute_stats structure.
 */
static inline struct brute_stats **brute_stats_ptr(struct task_struct *task)
{
	return task->security + brute_blob_sizes.lbs_task;
}

/**
 * brute_new_timestamp() - Allocation of a new timestamp structure.
 *
 * If the allocation is successful the timestamp is set to now.
 *
 * Return: NULL if the allocation fails. A pointer to the new allocated
 *         timestamp structure if it success.
 */
static struct brute_timestamp *brute_new_timestamp(void)
{
	struct brute_timestamp *timestamp;

	timestamp = kmalloc(sizeof(struct brute_timestamp), GFP_KERNEL);
	if (!timestamp)
		return NULL;

	timestamp->jiffies = get_jiffies_64();
	INIT_LIST_HEAD(&timestamp->node);

	return timestamp;
}

/**
 * brute_new_stats() - Allocation of a new statistics structure.
 *
 * If the allocation is successful the reference counter is set to one to
 * indicate that there will be one task that points to this structure. The last
 * crashes timestamps list is initialized with one entry set to now. This way,
 * its possible to compute the application crash period with the first fault.
 *
 * Return: NULL if the allocation fails. A pointer to the new allocated
 *         statistics structure if it success.
 */
static struct brute_stats *brute_new_stats(void)
{
	struct brute_stats *stats;
	struct brute_timestamp *timestamp;

	stats = kmalloc(sizeof(struct brute_stats), GFP_KERNEL);
	if (!stats)
		return NULL;

	timestamp = brute_new_timestamp();
	if (!timestamp) {
		kfree(stats);
		return NULL;
	}

	spin_lock_init(&stats->lock);
	refcount_set(&stats->refc, 1);
	INIT_LIST_HEAD(&stats->timestamps);
	list_add(&timestamp->node, &stats->timestamps);
	stats->timestamps_size = 1;

	return stats;
}

I hope that now is clear what I want to do. Any help will be greatly
appreciated.

Thanks,
John Wood


_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Read the "real_parent" field of task_struct
  2020-10-01 17:49   ` John Wood
@ 2020-10-02  0:29     ` Valdis Klētnieks
  2020-10-02 16:59       ` John Wood
  0 siblings, 1 reply; 6+ messages in thread
From: Valdis Klētnieks @ 2020-10-02  0:29 UTC (permalink / raw)
  To: John Wood; +Cc: kernelnewbies


[-- Attachment #1.1: Type: text/plain, Size: 1181 bytes --]

On Thu, 01 Oct 2020 19:49:02 +0200, John Wood said:

> Idea: The purpose of my LSM is to detect and mitigate a fork brute force
> attack. To do so, I need a hierarchy of fork processes. In other words, there
> is a pointer in the task_struct structure that points to an statistics data
> structure. This pointer is copied to the child process when a process forks.
> This way, all the tasks that fork with the same root share the same statistics.
> These statistics allow to compute the application crashing period and detect
> the attack.

How is this any better than applying a ulimit to the userid, and using the existing
audit subsystem for reporting the attack, which is what that subsystem was
designed for?

> But there is the case when the task 0 is allocated. In this case it's not
> possible to copy the parent pointer to the statistical data. The task 0 don't
> have parent. In this scenario a new statistics structure need to be allocated.

At that point, your LSM probably hasn't been initialized yet. If your LSM is being
called before task 0 (let alone task 1) is created, there's probably something
wonky going on.  Are you seeing this happen on an actual system?


[-- Attachment #1.2: Type: application/pgp-signature, Size: 832 bytes --]

[-- Attachment #2: Type: text/plain, Size: 170 bytes --]

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Read the "real_parent" field of task_struct
  2020-10-02  0:29     ` Valdis Klētnieks
@ 2020-10-02 16:59       ` John Wood
  2020-10-08 17:05         ` John Wood
  0 siblings, 1 reply; 6+ messages in thread
From: John Wood @ 2020-10-02 16:59 UTC (permalink / raw)
  To: Valdis Klētnieks; +Cc: John Wood, kernelnewbies

Hi,

On Thu, Oct 01, 2020 at 08:29:58PM -0400, Valdis Klētnieks wrote:
> On Thu, 01 Oct 2020 19:49:02 +0200, John Wood said:
>
> > Idea: The purpose of my LSM is to detect and mitigate a fork brute force
> > attack. To do so, I need a hierarchy of fork processes. In other words, there
> > is a pointer in the task_struct structure that points to an statistics data
> > structure. This pointer is copied to the child process when a process forks.
> > This way, all the tasks that fork with the same root share the same statistics.
> > These statistics allow to compute the application crashing period and detect
> > the attack.
>
> How is this any better than applying a ulimit to the userid, and using the existing
> audit subsystem for reporting the attack, which is what that subsystem was
> designed for?

As far as I know, the ulimit allows to define limits (file size, max user
processes, open files, ...) but this is not what we want. We don't want to
limit any user resources. We need to detect if an application is crashing
quickly. More info can be found in this RFC:

https://lore.kernel.org/kernel-hardening/20200913072430.GA2965@ubuntu/T/

> > But there is the case when the task 0 is allocated. In this case it's not
> > possible to copy the parent pointer to the statistical data. The task 0 don't
> > have parent. In this scenario a new statistics structure need to be allocated.
>
> At that point, your LSM probably hasn't been initialized yet. If your LSM is being
> called before task 0 (let alone task 1) is created, there's probably something
> wonky going on.  Are you seeing this happen on an actual system?

The code shown in early mails has not been tested yet. I wanted to wait
until the access to the "real_parent" field of the task_struct structure was
clear to me. Anyway, good point. Thanks a lot.

I have changed the code to return an error if the allocated task don't have
parent. If the parent have statistics, this data is shared with the allocated
task. And if the parent don't have statistics, a new stats structure is
allocated for the allocated task and then shared with the parent.

static void brute_share_stats(struct brute_stats **src,
			      struct brute_stats **dst)
{
	spin_lock(&(*src)->lock);
	refcount_inc(&(*src)->refc);
	*dst = *src;
	spin_unlock(&(*src)->lock);
}

static int brute_task_alloc(struct task_struct *task, unsigned long clone_flags)
{
	struct task_struct *p_task;
	struct brute_stats **stats, **p_stats;

	p_task = task->real_parent;		/////////// <----
	if (unlikely(!p_task))			/////////// <----
		return -ESRCH;

	stats = brute_stats_ptr(task);
	p_stats = brute_stats_ptr(p_task);	/////////// <----

	if (likely(*p_stats)) {
		brute_share_stats(p_stats, stats);
		return 0;
	}

	*stats = brute_new_stats();
	if (!*stats)
		return -ENOMEM;

	brute_share_stats(stats, p_stats);
	return 0;
}

This code is very untested. And now my first question: how can I read the
real_parent field in a secure way. Do I need to use an rcu_read_lock()/
rcu_read_unlock() block? Do I need to use rcu_dereference? Do I need to
use a read_lock(&task_list_lock)/read_unlock(&task_list_lock) block?

The lines with the mark are not clear to me. Sorry.

Thanks,
John Wood


_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Read the "real_parent" field of task_struct
  2020-10-02 16:59       ` John Wood
@ 2020-10-08 17:05         ` John Wood
  0 siblings, 0 replies; 6+ messages in thread
From: John Wood @ 2020-10-08 17:05 UTC (permalink / raw)
  To: Valdis Klētnieks; +Cc: John Wood, kernelnewbies

Hi,

On Fri, Oct 02, 2020 at 06:59:22PM +0200, John Wood wrote:
>
> static void brute_share_stats(struct brute_stats **src,
> 			      struct brute_stats **dst)
> {
> 	spin_lock(&(*src)->lock);
> 	refcount_inc(&(*src)->refc);
> 	*dst = *src;
> 	spin_unlock(&(*src)->lock);
> }
>
> static int brute_task_alloc(struct task_struct *task, unsigned long clone_flags)
> {
> 	struct task_struct *p_task;
> 	struct brute_stats **stats, **p_stats;
>
> 	p_task = task->real_parent;		/////////// <----
> 	if (unlikely(!p_task))			/////////// <----
> 		return -ESRCH;
>
> 	stats = brute_stats_ptr(task);
> 	p_stats = brute_stats_ptr(p_task);	/////////// <----
>
> 	if (likely(*p_stats)) {
> 		brute_share_stats(p_stats, stats);
> 		return 0;
> 	}
>
> 	*stats = brute_new_stats();
> 	if (!*stats)
> 		return -ENOMEM;
>
> 	brute_share_stats(stats, p_stats);
> 	return 0;
> }
>
> This code is very untested.

Now the code is tested.

> And now my first question: how can I read the
> real_parent field in a secure way. Do I need to use an rcu_read_lock()/
> rcu_read_unlock() block? Do I need to use rcu_dereference? Do I need to
> use a read_lock(&task_list_lock)/read_unlock(&task_list_lock) block?
>
> The lines with the mark are not clear to me. Sorry.

Any help would be greatly appreciated. Thanks in advance.

Regards,
John Wood

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-10-08 17:05 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-25 16:11 Read the "real_parent" field of task_struct John Wood
2020-09-30 11:59 ` Valdis Klētnieks
2020-10-01 17:49   ` John Wood
2020-10-02  0:29     ` Valdis Klētnieks
2020-10-02 16:59       ` John Wood
2020-10-08 17:05         ` John Wood

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).