linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [heads-up] binderfs bugs galore...
@ 2019-01-17 22:40 Al Viro
  2019-01-17 22:46 ` Christian Brauner
  2019-01-18  7:57 ` Christian Brauner
  0 siblings, 2 replies; 4+ messages in thread
From: Al Viro @ 2019-01-17 22:40 UTC (permalink / raw)
  To: Christian Brauner; +Cc: Greg Kroah-Hartman, linux-fsdevel

	1) d_make_root() does iput() on failure.  IOW, if you hit that case,
you get double iput() in err_without_dentry.

	2) final dput() does iput() (unsurprisingly).  IOW, anyone hitting
err_with_dentry will get double iput() as well.

	3) iput(NULL) is (fortunately) a no-op.  And that is the only case
when iput() in there does not cause immediate trouble.

	4) failure in binderfs_fill_super() leads to a call of
deactivate_locked_super().  Which calls binderfs_kill_super(),
which does put_ipc_ns(info->ipc_ns).  IOW,
        put_ipc_ns(ipc_ns);
in failure exits of binderfs_fill_super() is also a double-put...

	5) ... if, that is, the memory that used to hold sb->s_fs_info
until the same failure exit has kfreed it has already been stomped onto,
in which case you get put_ipc_ns() done on arbitrary address that might've
never held a struct ipc_namespace instance.

	What I really don't get here is why do you go to such contortions
in the first place.  Every single thing done in err_without_dentry is wrong.
And the only thing done in err_with_dentry before it hits err_without_dentry
is pointless; deactivate_locked_super() will do that dput() just fine.
Whereas the simple return ret; (or return -ENOMEM in most of the cases)
would've worked correctly...

	Actually, looking at that a bit more,
	6) initially you have ->s_fs_info contain a pointer to
struct ipc_namespace.  Then binderfs_fill_super() tries to allocate
a struct binderfs_info instance and stick *that* in place of
->s_fs_info, moving the original value into info->ipc_ns.  That's
Not Nice(tm) towards the poor binderfs_kill_super(), which has no
fscking way to tell which kind of pointer is it going to find
there.  As it is, it assumes that replacement has been done and
drops ->ipc_ns it finds in there.  Guess what happens if you
fail on attempt to allocate struct binderfs_info in the first
place?

	7) binderfs_test_super() also assumes that all superblocks it
is asked to look at will have ->s_fs_info pointing to struct binderfs_info.
binderfs_set_super(), OTOH, sets it to struct ipc_namespace pointer it
(ultimately - sget_userns()) has been passed as 'data'.  So if two mounts
race, you'll get a false negative out of binderfs_test_super().

	8) binderfs_binder_ctl_create() is a no-op on subsequent
calls (if there had been any).  And the first call is done before
we unlock the superblock, so locking the root is completely pointless.

	9) the one thing ->s_fs_info can't be in the whole thing is NULL.
There are two assignments - one in binderfs_set_super(), where we set
it to the last argument of sget_userns() and another in binderfs_fill_super()
where we set it to 'info'.  In both cases we have that pointer dereferenced
a little bit earlier - while evaluating the next-to-last argument of the
same sget_userns() call or in setting info->root_uid in the other place.
Both would've obviously oopsed before storing NULL there.  IOW, checking
whether it's NULL is pointless both in binderfs_test_super() and in
binderfs_kill_super().

	10) your ->unlink() checks if the victim is ->control_dentry.
Your ->rename(), OTOH, does not, which renders the check in ->unlink()
trivial to bypass.

	11) speaking of the tests that are there in ->rename(),
        /* binderfs doesn't support directories. */
        if (d_is_dir(old_dentry))
                return -EPERM;
is... interesting.  If the comment is correct, the check is pointless.
And AFAICS the comment *is* correct...  Another check in there
        if (!simple_empty(new_dentry))
                return -ENOTEMPTY;
is equally pointless, for the same reason...

	12) what is the comment in binderfs_unlink() supposed to mean?
In ->unlink() we *already* have parent locked...

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [heads-up] binderfs bugs galore...
  2019-01-17 22:40 [heads-up] binderfs bugs galore Al Viro
@ 2019-01-17 22:46 ` Christian Brauner
  2019-01-18  7:57 ` Christian Brauner
  1 sibling, 0 replies; 4+ messages in thread
From: Christian Brauner @ 2019-01-17 22:46 UTC (permalink / raw)
  To: Al Viro; +Cc: Greg Kroah-Hartman, linux-fsdevel

On Thu, Jan 17, 2019 at 10:40:11PM +0000, Al Viro wrote:
<snip>

Thanks for all the comments they seem really helpful. I'm currently
travelling but I will try to address them all within the next week! Does
that sound ok to you?

Thanks!
Christian

> 	1) d_make_root() does iput() on failure.  IOW, if you hit that case,
> you get double iput() in err_without_dentry.
> 
> 	2) final dput() does iput() (unsurprisingly).  IOW, anyone hitting
> err_with_dentry will get double iput() as well.
> 
> 	3) iput(NULL) is (fortunately) a no-op.  And that is the only case
> when iput() in there does not cause immediate trouble.
> 
> 	4) failure in binderfs_fill_super() leads to a call of
> deactivate_locked_super().  Which calls binderfs_kill_super(),
> which does put_ipc_ns(info->ipc_ns).  IOW,
>         put_ipc_ns(ipc_ns);
> in failure exits of binderfs_fill_super() is also a double-put...
> 
> 	5) ... if, that is, the memory that used to hold sb->s_fs_info
> until the same failure exit has kfreed it has already been stomped onto,
> in which case you get put_ipc_ns() done on arbitrary address that might've
> never held a struct ipc_namespace instance.
> 
> 	What I really don't get here is why do you go to such contortions
> in the first place.  Every single thing done in err_without_dentry is wrong.
> And the only thing done in err_with_dentry before it hits err_without_dentry
> is pointless; deactivate_locked_super() will do that dput() just fine.
> Whereas the simple return ret; (or return -ENOMEM in most of the cases)
> would've worked correctly...
> 
> 	Actually, looking at that a bit more,
> 	6) initially you have ->s_fs_info contain a pointer to
> struct ipc_namespace.  Then binderfs_fill_super() tries to allocate
> a struct binderfs_info instance and stick *that* in place of
> ->s_fs_info, moving the original value into info->ipc_ns.  That's
> Not Nice(tm) towards the poor binderfs_kill_super(), which has no
> fscking way to tell which kind of pointer is it going to find
> there.  As it is, it assumes that replacement has been done and
> drops ->ipc_ns it finds in there.  Guess what happens if you
> fail on attempt to allocate struct binderfs_info in the first
> place?
> 
> 	7) binderfs_test_super() also assumes that all superblocks it
> is asked to look at will have ->s_fs_info pointing to struct binderfs_info.
> binderfs_set_super(), OTOH, sets it to struct ipc_namespace pointer it
> (ultimately - sget_userns()) has been passed as 'data'.  So if two mounts
> race, you'll get a false negative out of binderfs_test_super().
> 
> 	8) binderfs_binder_ctl_create() is a no-op on subsequent
> calls (if there had been any).  And the first call is done before
> we unlock the superblock, so locking the root is completely pointless.
> 
> 	9) the one thing ->s_fs_info can't be in the whole thing is NULL.
> There are two assignments - one in binderfs_set_super(), where we set
> it to the last argument of sget_userns() and another in binderfs_fill_super()
> where we set it to 'info'.  In both cases we have that pointer dereferenced
> a little bit earlier - while evaluating the next-to-last argument of the
> same sget_userns() call or in setting info->root_uid in the other place.
> Both would've obviously oopsed before storing NULL there.  IOW, checking
> whether it's NULL is pointless both in binderfs_test_super() and in
> binderfs_kill_super().
> 
> 	10) your ->unlink() checks if the victim is ->control_dentry.
> Your ->rename(), OTOH, does not, which renders the check in ->unlink()
> trivial to bypass.
> 
> 	11) speaking of the tests that are there in ->rename(),
>         /* binderfs doesn't support directories. */
>         if (d_is_dir(old_dentry))
>                 return -EPERM;
> is... interesting.  If the comment is correct, the check is pointless.
> And AFAICS the comment *is* correct...  Another check in there
>         if (!simple_empty(new_dentry))
>                 return -ENOTEMPTY;
> is equally pointless, for the same reason...
> 
> 	12) what is the comment in binderfs_unlink() supposed to mean?
> In ->unlink() we *already* have parent locked...

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [heads-up] binderfs bugs galore...
  2019-01-17 22:40 [heads-up] binderfs bugs galore Al Viro
  2019-01-17 22:46 ` Christian Brauner
@ 2019-01-18  7:57 ` Christian Brauner
  2019-01-18 20:02   ` Al Viro
  1 sibling, 1 reply; 4+ messages in thread
From: Christian Brauner @ 2019-01-18  7:57 UTC (permalink / raw)
  To: Al Viro; +Cc: Greg Kroah-Hartman, linux-fsdevel

Looking at some of the comments it seems you have been looking at the
code that is currently in git master.
Some of the bugs you're pointing at are already fixed and patches for it
are sitting in Greg's tree. :)

The most crucial source of bugs binderfs_mount() is completely gone in
Greg's tree already because I gutted it to just do:

static struct dentry *binderfs_mount(struct file_system_type *fs_type,
				     int flags, const char *dev_name,
				     void *data)
{
	return mount_nodev(fs_type, flags, data, binderfs_fill_super);
}

since we want to have each binderfs mount to be a new instance.
That means binderfs_set_super() and binderfs_test_super are gone.
So there's no weird mucking with sget_userns() and s_fs_info anymore
going on in binderfs_mount().
That means your points from 5) to 9) should be taken care of.

I have already a patch that remove unnecessary checks and an invalid
comment that you pointed out int 10) to 12).

I'm currently fixing the double put issues 1) to 5) based on your
suggestions from your private mail. Thanks for that!

Thanks, this was overall really helpful.
Christian

On Thu, Jan 17, 2019 at 10:40:11PM +0000, Al Viro wrote:
> 	1) d_make_root() does iput() on failure.  IOW, if you hit that case,
> you get double iput() in err_without_dentry.
> 
> 	2) final dput() does iput() (unsurprisingly).  IOW, anyone hitting
> err_with_dentry will get double iput() as well.
> 
> 	3) iput(NULL) is (fortunately) a no-op.  And that is the only case
> when iput() in there does not cause immediate trouble.
> 
> 	4) failure in binderfs_fill_super() leads to a call of
> deactivate_locked_super().  Which calls binderfs_kill_super(),
> which does put_ipc_ns(info->ipc_ns).  IOW,
>         put_ipc_ns(ipc_ns);
> in failure exits of binderfs_fill_super() is also a double-put...
> 
> 	5) ... if, that is, the memory that used to hold sb->s_fs_info
> until the same failure exit has kfreed it has already been stomped onto,
> in which case you get put_ipc_ns() done on arbitrary address that might've
> never held a struct ipc_namespace instance.
> 
> 	What I really don't get here is why do you go to such contortions
> in the first place.  Every single thing done in err_without_dentry is wrong.
> And the only thing done in err_with_dentry before it hits err_without_dentry
> is pointless; deactivate_locked_super() will do that dput() just fine.
> Whereas the simple return ret; (or return -ENOMEM in most of the cases)
> would've worked correctly...
> 
> 	Actually, looking at that a bit more,
> 	6) initially you have ->s_fs_info contain a pointer to
> struct ipc_namespace.  Then binderfs_fill_super() tries to allocate
> a struct binderfs_info instance and stick *that* in place of
> ->s_fs_info, moving the original value into info->ipc_ns.  That's
> Not Nice(tm) towards the poor binderfs_kill_super(), which has no
> fscking way to tell which kind of pointer is it going to find
> there.  As it is, it assumes that replacement has been done and
> drops ->ipc_ns it finds in there.  Guess what happens if you
> fail on attempt to allocate struct binderfs_info in the first
> place?
> 
> 	7) binderfs_test_super() also assumes that all superblocks it
> is asked to look at will have ->s_fs_info pointing to struct binderfs_info.
> binderfs_set_super(), OTOH, sets it to struct ipc_namespace pointer it
> (ultimately - sget_userns()) has been passed as 'data'.  So if two mounts
> race, you'll get a false negative out of binderfs_test_super().
> 
> 	8) binderfs_binder_ctl_create() is a no-op on subsequent
> calls (if there had been any).  And the first call is done before
> we unlock the superblock, so locking the root is completely pointless.
> 
> 	9) the one thing ->s_fs_info can't be in the whole thing is NULL.
> There are two assignments - one in binderfs_set_super(), where we set
> it to the last argument of sget_userns() and another in binderfs_fill_super()
> where we set it to 'info'.  In both cases we have that pointer dereferenced
> a little bit earlier - while evaluating the next-to-last argument of the
> same sget_userns() call or in setting info->root_uid in the other place.
> Both would've obviously oopsed before storing NULL there.  IOW, checking
> whether it's NULL is pointless both in binderfs_test_super() and in
> binderfs_kill_super().
> 
> 	10) your ->unlink() checks if the victim is ->control_dentry.
> Your ->rename(), OTOH, does not, which renders the check in ->unlink()
> trivial to bypass.

Thanks for catching that. I will prevent the rename for control_dentry
too.

> 
> 	11) speaking of the tests that are there in ->rename(),
>         /* binderfs doesn't support directories. */
>         if (d_is_dir(old_dentry))
>                 return -EPERM;
> is... interesting.  If the comment is correct, the check is pointless.
> And AFAICS the comment *is* correct...  Another check in there
>         if (!simple_empty(new_dentry))
>                 return -ENOTEMPTY;
> is equally pointless, for the same reason...

Thanks. The checks are removed now.

> 
> 	12) what is the comment in binderfs_unlink() supposed to mean?
> In ->unlink() we *already* have parent locked...

Comment is removed and leftover from when this did something more
"interesting". I'll remove it.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [heads-up] binderfs bugs galore...
  2019-01-18  7:57 ` Christian Brauner
@ 2019-01-18 20:02   ` Al Viro
  0 siblings, 0 replies; 4+ messages in thread
From: Al Viro @ 2019-01-18 20:02 UTC (permalink / raw)
  To: Christian Brauner; +Cc: Greg Kroah-Hartman, linux-fsdevel

On Fri, Jan 18, 2019 at 08:57:14AM +0100, Christian Brauner wrote:

> since we want to have each binderfs mount to be a new instance.
> That means binderfs_set_super() and binderfs_test_super are gone.
> So there's no weird mucking with sget_userns() and s_fs_info anymore
> going on in binderfs_mount().
> That means your points from 5) to 9) should be taken care of.
> 
> I have already a patch that remove unnecessary checks and an invalid
> comment that you pointed out int 10) to 12).
> 
> I'm currently fixing the double put issues 1) to 5) based on your
> suggestions from your private mail. Thanks for that!

OK...  As for the cleanups on failure exits, the basic idea is to
(re)use the cleanups ->kill_sb() has to do anyway.  And AFAICS for
binderfs it's as simple as
	* grab ipcns reference right after you've allocated ->s_fs_info
and stick it in there right away.
	* make all failure exits in binderfs_fill_super() plain returns
	* make binderfs_kill_super() pick ->s_fs_info, do kill_litter_super(),
then, if info wasn't NULL, drop ipcns ref and free info.
All there is to it...

We used to have tons of code duplication between the failure exits of
mount and normal behaviour on umount; that caused a lot of rarely
tested boilerplate code.  One can still follow that model (stick the
teardown on umount into ->put_super(), duplicate its parts on failure
exits in ->mount()), but it can be more convenient to use an explicit
->kill_sb() instead.  That's the reason why ->kill_sb() is always
called for anything that made it out of sget(), whether fully
set up or not.

You still need to be careful about doing initialization and teardown in
compatible orders, but usually that's not hard to do.  Sure, it's
not guaranteed to turn all failure exits into plain returns (e.g.
if one does a temporary allocation that is freed in the end on setup,
->kill_sb() won't be freeing it, so any failure exits will have
to take care of that), but the amount and complexity of cleanups
is less that way.

The same considerations, BTW, had been the reason for calling conventions
of d_make_root() - simpler cleanup on failures that way...

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-01-18 20:02 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-17 22:40 [heads-up] binderfs bugs galore Al Viro
2019-01-17 22:46 ` Christian Brauner
2019-01-18  7:57 ` Christian Brauner
2019-01-18 20:02   ` Al Viro

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).