From: Linus Torvalds <torvalds@linux-foundation.org>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Dave Chinner <david@fromorbit.com>,
	LSM List <linux-security-module@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Mimi Zohar <zohar@linux.vnet.ibm.com>,
	Christoph Hellwig <hch@infradead.org>,
	"Theodore Ts'o" <tytso@mit.edu>, Jan Kara <jack@suse.cz>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-integrity@vger.kernel.org
Subject: Re: [RFC PATCH 3/3] fs: detect that the i_rwsem has already been taken exclusively
Date: Sun, 1 Oct 2017 15:20:04 -0700
Message-ID: <CA+55aFwD=+5trJbmuc2SL-DYcdmt2p4gq1uPBp8mznj1JYSVTg@mail.gmail.com>
In-Reply-To: <87shf2jzfr.fsf@xmission.com>

On Sun, Oct 1, 2017 at 3:06 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Unless I misread something it was being pointed out there are some vfs
> operations today on which ima writes an ima xattr as a side effect.  And
> those operations hold the i_sem.  So perhaps I am misunderstanding
> things or writing the ima xattr needs to happen at some point.  Which
> implies something like queued work.

So the issue is indeed the inode semaphore, as it is used by IMA. But
all these IMA patches to work around the issue are just horribly ugly.
One adds a VFS-layer filesystem method that most filesystems end up
not really needing (it's the same as the regular read), and other
filesystems then end up with hacks along the lines of "oh, I don't
need to take this lock because it was already taken by the caller".

The second patch attempt avoided the need for a new filesystem method,
but added a flag in an annoying place (for the same basic logic). The
advantage is that now most filesystems don't actually need to care any
more (and the filesystems that used to care now check that flag).

There was discussion about moving the flag to a more convenient spot,
which would have made it a lot less intrusive.

But the basic issue is that almost always when you see lock
inversions, the problem can just be fixed by doing the locking
differently instead.

And that's what I was/am pushing for.

There really are two totally different issues:

 - the integrity _measurement_.

   This one wants to be serialized, so that you don't have multiple
concurrent measurements, and the serialization fundamentally has to be
around all the IO, so this lock pretty much has to be outside the
i_sem.

 - the integrity invalidation on certain operations.

   This one fundamentally had to be inside the i_sem, since some of
the operations that cause this end up already holding the i_sem at a
VFS layer.

So you had these two different requirements (inside _and_ outside),
and the IMA approach was basically to avoid the problem by making
i_sem *the* lock, and then making the IO routines aware of it already
being held. That does solve the inside/outside issue.

But the simpler way to fix it is to simply use two locks that nest
inside each other, with i_sem nesting in the middle.  That just avoids
the problem entirely, and doesn't require anybody to ever care about
i_sem semantic changes, because i_sem semantics simply didn't change
at all.
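
To make the nesting concrete, here is a minimal userspace sketch of the
lock ordering described above. It is not IMA code and not a proposed
patch: pthread mutexes stand in for the kernel locks, a byte sum stands
in for the hash, and every name (demo_inode, measure_lock,
invalidate_lock, demo_measure, demo_setxattr) is made up for
illustration. The generation counter is also an assumption added here,
so that a measurement racing with an invalidation is not marked valid.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode; lock order is
 * measure_lock -> i_sem -> invalidate_lock. */
struct demo_inode {
	pthread_mutex_t measure_lock;    /* outer: serializes measurements */
	pthread_mutex_t i_sem;           /* middle: unchanged inode lock */
	pthread_mutex_t invalidate_lock; /* inner: protects the fields below */
	bool valid;
	unsigned long generation;        /* bumped on every invalidation */
	unsigned long digest;
	const unsigned char *data;
	size_t len;
};

static void demo_inode_init(struct demo_inode *ino,
			    const unsigned char *data, size_t len)
{
	pthread_mutex_init(&ino->measure_lock, NULL);
	pthread_mutex_init(&ino->i_sem, NULL);
	pthread_mutex_init(&ino->invalidate_lock, NULL);
	ino->valid = false;
	ino->generation = 0;
	ino->digest = 0;
	ino->data = data;
	ino->len = len;
}

/* Ordinary read path: takes i_sem itself and knows nothing about
 * measurements or about who called it. */
static unsigned long demo_read_sum(struct demo_inode *ino)
{
	unsigned long sum = 0;

	pthread_mutex_lock(&ino->i_sem);
	for (size_t i = 0; i < ino->len; i++)
		sum += ino->data[i];
	pthread_mutex_unlock(&ino->i_sem);
	return sum;
}

/* Invalidate only, never re-measure. Uses just the innermost lock,
 * so it is safe to call with i_sem already held. */
static void demo_invalidate(struct demo_inode *ino)
{
	pthread_mutex_lock(&ino->invalidate_lock);
	ino->valid = false;
	ino->generation++;
	pthread_mutex_unlock(&ino->invalidate_lock);
}

/* Stand-in for an operation where the VFS layer already holds i_sem. */
static void demo_setxattr(struct demo_inode *ino)
{
	pthread_mutex_lock(&ino->i_sem);
	demo_invalidate(ino);
	pthread_mutex_unlock(&ino->i_sem);
}

/* Measurement: the outer lock is held around all of the IO. */
static unsigned long demo_measure(struct demo_inode *ino)
{
	unsigned long gen, sum;
	bool cached;

	pthread_mutex_lock(&ino->measure_lock);

	pthread_mutex_lock(&ino->invalidate_lock);
	cached = ino->valid;
	sum = ino->digest;
	gen = ino->generation;
	pthread_mutex_unlock(&ino->invalidate_lock);

	if (!cached) {
		sum = demo_read_sum(ino); /* takes and drops i_sem */

		pthread_mutex_lock(&ino->invalidate_lock);
		if (ino->generation == gen) { /* nothing invalidated meanwhile */
			ino->digest = sum;
			ino->valid = true;
		}
		pthread_mutex_unlock(&ino->invalidate_lock);
	}

	pthread_mutex_unlock(&ino->measure_lock);
	return sum;
}
```

Note that demo_read_sum() takes i_sem exactly as it would if no
measurement code existed, and demo_setxattr() never touches
measure_lock, so neither path needs to know whether a caller already
holds a lock.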

So that's the approach I'm pushing. I admittedly haven't actually
looked at the IMA details, but from a high-level standpoint you can
basically describe it (as above) without having to care too much about
exactly what IMA even wants.

The two-lock approach does require that the operations that invalidate
the integrity measurements always only invalidate it, and don't try to
re-compute it. But I suspect that would be entirely insane anyway
(imagine a world where "setxattr" would have to read the whole file
contents in order to revalidate the integrity measurement - even if
there is nobody who even *cares*).

           Linus


