First of all, thank you very much for the review! I am still learning with
many things and your feedback is very much appreciated!

>First off: On Linux, as far as I know, chroots were never meant to be a
security feature, and chroot "jails" break in a number of different ways. A
lot of effort went into making bind mounts actually secure with reasonable
performance, and I doubt that >something like this can provide anything
close to that, at least not without gigantic runtime overhead. Instead of
making people believe that it's now okay to use chroot for security, I would
very much prefer to keep the "never use this for security >purposes" warning
in the chroot() manpage and encourage people to use namespaces with bind
mounts instead.

I am aware that chroot jail should not be used for security purposes, but
reality is that it is *used* in many places at least with some intention of
getting a better security. I don't believe there is any person nowadays who
believes that chroot is secure, but because it is super light-weight (even
compare with namespaces setup) and because of legacy concerns, it is still
there. I don't also claim that this LSM make chroot *fully* secure (as you
mentioned below some important features are still missing) and it would
never be as strong as full namespace-based solution, but practical problem
with namespaces is that it isn't that easy to setup them up properly and
possibility of making configuration mistake is still quite high.
About the performance overhead, I actually don't believe it would be that
gigantic (especially when we get proper LSM stacking in place that
eliminates the need to store internal list and do searches), but of course
any LSM has performance impact, especially with the hooks on the filesystem.

And don't take me wrong, I do personally like namespaces very much. Just
usecases  I am trying to address with this are very different. 

> I think that a somewhat more formal specification of what this is supposed
to be good for might help.
Yes, I think this make sense. It would clarify on when you should use this
(in cases when you do need to use chroot only but want to make it a degree
better).

>As far as I can tell, you haven't done anything to restrict the access to
unix domain sockets or ptrace, which are probably the main tricks people
would actually use to break out of a chroot "jail". Instead, you add support
for some weird edgecase things like >"if you are privileged and inside a
chroot, you can renice processes outside the chroot".

Supporting unix domain sockets is in plans, I was looking into all GRKERNSEC
features and working them one by one. I first processed all features that
can be easily moved into existing LSM hooks, and nice was one of them. Then
I started adding new hooks for the rest of features. But at some point I
just exceeded my threshold of confidence with all the changes and hooks I am
making (unix domain sockets might need further new hooks), so I wanted to
bring this out for review earlier than later. 

What about ptrace? I am assume that we have Yama in place for this :)

>     Note: this feature currently prevents rtkit daemon from
>     normal operation.

>And it probably also prevents people from using chroot() for what it's
actually intended to do.
I would not go as far as that. How many cases apart from rtkit daemon who
sets priorities for other processes you know that would suffer from this
behaviour? 
Again, this was an easy feature that fits into existing hook and can be
fully removed if people don't see a value in it. It is certainly not the
main part of it. 
 
> +/**
> + * is_process_chrooted - process is inside chroot
> + * @task: the task_struct of the process to be checked
> + *
> + * Returns 1 if task is inside chroot.
> + */
> +static int is_process_chrooted(struct task_struct *task) {
> +	int rc = 0;
> +	struct hardchroot_info *entry;
> +
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(entry, &hardchroot_infos, node) {

>Wait, what? Is this really a function that has execution time linear in the
number of running processes and that is called for every fchdir()? That
seems slow. Did you benchmark this on a system with lots of processes?
I haven't done any benchmarking on this yet, but remember that
hardchroot_infos will be quite a short list, which equals number of
processes that explicitly entered chroot (not their children, only first
parent process to do it). 
In Ubuntu and Fedora for example, after the full boot, I only had 3 such
processes in the list, so this gives the size of list I am expecting: 10-20
items absolutely max (normally under 10), not  hundreds. So, list iteration
is quite fast.
So, it doesn't depend on the number of system processes, but it does depend
on number of processes that entered chroot. 

> +/**
> + * is_same_root - check if two tasks share the same root
> + * @task1: the task_struct of the first task to be checked
> + * @task2: the task_struct of the second task to be checked
> + *
> + * Returns 1 if tasks share the same root.
> + */
> +static int is_same_root(struct task_struct *task1, struct task_struct 
> +*task2) {
> +	int rc = 0;
> +	struct hardchroot_info *entry;
> +	struct dentry *dentry1 = NULL;
> +	struct dentry *dentry2 = NULL;
> +
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(entry, &hardchroot_infos, node) {
> +		if (entry->invalid)
> +			continue;
> +		if (entry->task == task1)
> +			dentry1 = entry->dentry;
> +		if (entry->task == task2)
> +			dentry2 = entry->dentry;
> +	}
> +	if (dentry1 && (dentry1 == dentry2)) {
> +		rc = 1;
> +		pr_info("HCRT: pids %d and %d have the same root\n",
> +				task_pid_nr(task1), task_pid_nr(task2));
> +	}

>You're comparing dentries, not paths? So using the same directory through
different bind mounts as root directories will still count as "same root"
even though for chroot breakouts, it's like having different roots?
Bind mounts are tricky in this case. From one side yes, it is yes like
having two different roots, but then how to handle legitimate cases when
people bind mount things into chroot to be used there?
Blocking them would make bind mounts not supported inside chroot, which is
not smth that should be done IMO.


> +/**
> + * hardchroot_path_chroot - validate chroot entry
> + * @path contains the path structure.
> + *
> + * Returns 0 if chroot is allowed, -ve on error.
> + */
> +static int hardchroot_path_chroot(const struct path *path) {
> +	int rc = 0;
> +	struct task_struct *myself = current;
> +
> +	pr_info("HCRT, chroot: inode %lu and pid %d\n",
> +			d_backing_inode(path->dentry)->i_ino,
> +			task_pid_nr(myself));
> +
> +	get_task_struct(myself);
> +	if (is_process_chrooted(myself) &&
> +		!is_inside_chroot(path->dentry, path->mnt)) {
> +		put_task_struct(myself);
> +		pr_info("HCRT, chroot denied: for inode %lu and pid %d\n",
> +				d_backing_inode(path->dentry)->i_ino,
> +				task_pid_nr(myself));
> +		return -EACCES;
> +	}
> +
> +	if (task_pid_nr(myself) > 1 &&
> +		path->dentry != init_task.fs->root.dentry &&
> +		path->dentry != myself->nsproxy->mnt_ns->root->mnt.mnt_root)
{
> +		/* task is attempting to chroot, add it to the list */
> +		rc = hardchroot_info_add(myself, path->dentry);
> +		pr_info("HCRT, chroot: adding %d to chrooted task list\n",
> +			task_pid_nr(myself));
> +	}
> +
> +	/* set the current working directory of all newly-chrooted
> +	 * processes to the the root directory of the chroot
> +	 */
> +	set_fs_pwd(myself->fs, path);

>That seems pretty broken, considering the possibility of open file
descriptors pointing to the old root or so.
Again, the main use case here is that we have a good process that enters
chroot to minimize the damage in case it gets exploited some later time in
life. 
A good process is supposed to close its file descriptors pointing to the old
root before entering chroot, isn't it?

>And again, chroot only works if the current process is privileged.
>If it is, there are probably other ways to break out.

You need to have privilege to *enter* chroot,  but if you are good process,
you will drop the privilege after you did your setup. 


> +/**
> + * hardchroot_task_unshare - check if process is
> + * allowed to unshare its namespaces
> + * @unshare_flags flags
> + * @new_fs contains the new fs_struct if created.
> + * @new_fd contains the new files_struct if created.
> + * @new_creds contains the new cred if created.
> + * @new_nsproxy contains the new nsproxy if created.
> + *
> + * Returns 0 if unshare is allowed, -ve on error.
> + */
> +static int hardchroot_task_unshare(unsigned long unshare_flags,
> +		const struct fs_struct *new_fs,
> +		const struct files_struct *new_fd,
> +		const struct cred *new_cred,
> +		const struct nsproxy *new_nsproxy)
> +{
> +	int rc = 0;
> +	struct task_struct *myself = current;
> +	const struct nsproxy *tnsproxy = new_nsproxy;
> +
> +	pr_info("HCRT, unshare: unshare_flags %lu and pid %d\n",
> +			unshare_flags, task_pid_nr(myself));
> +	if (new_fs)
> +		pr_info("HCRT, unshare: new_fs->root.dentry inode%lu\n",
> +			d_backing_inode(new_fs->root.dentry)->i_ino);
> +
> +	if (!tnsproxy)
> +		tnsproxy = myself->nsproxy;
> +
> +	if (new_fs && task_pid_nr(myself) > 1 &&
> +		new_fs->root.dentry != init_task.fs->root.dentry &&

>Hm? You're accessing the fs_struct of init_task without any explicit
locking?
I am quite new to the kernel development and locking. I was reading a lot
recently about locking to understand where it is needed and where not, but
mistakes are obviously still coming. 
I guess I would need to do smth like: task_lock(&init_task) before and
task_unlock(&init_task) after. 

> +/**
> + * hardchroot_shm_shmat - check if shmat is allowed
> + * @shp contains the shared memory structure to be modified.
> + * @shmaddr contains the address to attach memory region to.
> + * @shmflg contains the operational flags.
> + *
> + * Return 0 if allowed, -ve on error.
> + */
> +int hardchroot_shm_shmat(struct shmid_kernel *shp, char __user *shmaddr,
> +			int shmflg)
> +{
> +	int rc = 0;
> +	struct task_struct *p;
> +	struct task_struct *myself = current;
> +	u64 st;
> +	time_t ct;
> +
> +	pr_info("HCRT, shmat: shp %d, shmflg %d, pid %d\n",
> +			shp->shm_perm.id, shmflg,
> +			task_pid_nr(myself));
> +
> +	if (likely(!is_process_chrooted(myself)))
> +		return rc;
> +
> +	rcu_read_lock();
> +	read_lock(&tasklist_lock);
> +
> +	p = find_task_by_vpid(shp->shm_cprid);

>Not exactly your fault, but this is broken when PID namespaces are
involved.
Yes, this has been bothering me also...  Grsecurity has their own version of
this function (find_task_by_vpid_unrestricted) instead that tried to do
global search, I guess I should think on doing smth similar, but I am
wondering if this can be done better. 
I will look into this for the next version.