From: Linus Torvalds <torvalds@linux-foundation.org>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org, James Morris <jmorris@namei.org>,
	linux-security-module@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	David Safford <safford@linux.vnet.ibm.com>,
	Dmitry Kasatkin <dmitry.kasatkin@intel.com>,
	Mimi Zohar <zohar@linux.vnet.ibm.com>,
	David Miller <davem@davemloft.net>
Subject: Re: [RFC] situation with fput() locking (was Re: [PULL REQUEST] : ima-appraisal patches)
Date: Fri, 20 Apr 2012 10:21:35 -0700
Message-ID: <CA+55aFzuTspDyyLaOA-g-dTWydaUeeWo9uVGR+rZ=ZJzPW_Ocw@mail.gmail.com>
In-Reply-To: <20120420164239.GH6871@ZenIV.linux.org.uk>

On Fri, Apr 20, 2012 at 9:42 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Actually, I like the per-CPU spinlock variant better; the thing is,
> with that scheme we get normal fput() (i.e. non-nodefer variant)
> non-blocking.  How about this:

What's the advantage of a per-cpu lock?

If you make the work be per-cpu, then you're better with no locking at
all: just disable interrupts (which you do anyway).

And if you want to use a spinlock, don't bother with the percpu side.

The thing I do not like about the schedule_work approach is that it
(a) totally hides the real cost  - which is the scheduling - and (b)
it is so asynchronous that it will happen potentially long after the
task dropped the reference.

And seriously - that is user-visible behavior.

For example, think about this *common* pattern:

  open+mmap+close+unlink+munmap

which would trigger the whole deferred fput, but also triggers the
actual real unlink() at fput time.

Right now, you can have that kind of thing in a program and
immediately unmount the filesystem afterwards (replace "unmount" with
"cannot see silly-renamed files" etc).
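
A minimal userspace sketch of that pattern, assuming Linux and a
scratch path chosen by the caller (the function name and the path are
illustrative, not from the thread). After close() and unlink() the
mapping is the only thing keeping the file alive, so the munmap() is
what drops the last reference:

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: runs open+mmap+close+unlink+munmap on `path`.
 * Returns 0 on success, -1 on any failure.
 */
int mmap_close_unlink_munmap(const char *path)
{
    const size_t len = 4096;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        unlink(path);
        return -1;
    }

    /* The descriptor goes away, but the mapping still pins the file. */
    close(fd);

    /* The name goes away; the inode survives while the mapping exists. */
    if (unlink(path) != 0) {
        munmap(p, len);
        return -1;
    }

    /* The mapping is still fully usable at this point. */
    memcpy(p, "hello", 5);
    int ok = (memcmp(p, "hello", 5) == 0);

    /* Last reference dropped here: this is where the final fput happens. */
    if (munmap(p, len) != 0)
        return -1;

    return ok ? 0 : -1;
}
```

Calling it as mmap_close_unlink_munmap("/tmp/fput_demo.tmp") from a
small main() is enough to exercise it. A program written this way
expects the file to be really gone once munmap() returns.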

The "totally asynchronous deferral" literally *breaks*semantics*.

Sure, it won't be noticeable in 99.99% of all cases, and I doubt you
can trigger much of a test for it. But it's potential real breakage,
and it's going to be hard to ever see. And then when it *does* happen,
it's going to be totally impossible to debug.

It's not just the "last unlink" thing that gets delayed. It's things
like file locking. It's "drop_file_write_access()". It's whatever
random thing that file does at "release()". It's a ton of things like
that. Delaying them has user-visible effects.
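
The file-locking case can be observed from userspace. The sketch below
assumes Linux, a local filesystem, and a caller-chosen scratch path;
the function name is illustrative. It relies on flock() locks
belonging to the open file description, so the lock is dropped when
the last reference to that description is released:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 0 if a flock() held through one open
 * file description is available to another one as soon as close()
 * on the holder has returned, -1 otherwise.
 */
int flock_released_by_close(const char *path)
{
    int rc = -1;
    /* Two separate open() calls give two open file descriptions. */
    int a = open(path, O_RDWR | O_CREAT, 0600);
    int b = open(path, O_RDWR);

    if (a < 0 || b < 0)
        goto out;

    /* `a` takes the exclusive lock. */
    if (flock(a, LOCK_EX) != 0)
        goto out;

    /* While `a` holds it, a non-blocking attempt via `b` must fail. */
    if (flock(b, LOCK_EX | LOCK_NB) == 0 || errno != EWOULDBLOCK)
        goto out;

    /* Dropping the only reference to `a` releases its lock. */
    close(a);
    a = -1;

    /* The program expects the lock to be free right away. */
    if (flock(b, LOCK_EX | LOCK_NB) != 0)
        goto out;

    rc = 0;
out:
    if (a >= 0)
        close(a);
    if (b >= 0)
        close(b);
    unlink(path);
    return rc;
}
```

If the release work ran at some arbitrary later time, the second
non-blocking flock() could fail spuriously, which is the kind of
user-visible change described above.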

That's a whole can of complexities and worries outside of the kernel
interface that you are completely ignoring - just because you are
trying to solve the *simple* complexity of locking interaction
entirely within the kernel.

I think that's a bit myopic. We don't even *know* what the problems
with the async approach might be. Your "simple" solution is simple
only inside the kernel.

This is why I suggested you look at Oleg's patches. If we guarantee
that things won't be delayed past re-entering user mode, all those
issues go away.
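
To make that guarantee concrete, here is a toy userspace model, not
kernel code. Every name in it (task_model, deferred_work, defer_work,
return_to_user, and the demo helpers) is made up for illustration and
says nothing about how Oleg's patches are actually written. The idea
it models: work queued during a "syscall" is attached to the task and
drained on the way back to user mode, so user code never runs while
the work is still pending.

```c
#include <stddef.h>

/* One unit of deferred work (hypothetical, not a kernel type). */
struct deferred_work {
    struct deferred_work *next;
    void (*func)(struct deferred_work *);
};

/* A modelled task with its own list of pending work. */
struct task_model {
    struct deferred_work *pending;
};

/* Queue work on the task instead of doing it immediately. */
void defer_work(struct task_model *t, struct deferred_work *w,
                void (*func)(struct deferred_work *))
{
    w->func = func;
    w->next = t->pending;   /* push on the front of the list */
    t->pending = w;
}

/* Modelled syscall exit path: drain everything, return how many ran. */
int return_to_user(struct task_model *t)
{
    int ran = 0;
    while (t->pending != NULL) {
        struct deferred_work *w = t->pending;
        t->pending = w->next;
        w->func(w);
        ran++;
    }
    return ran;
}

/* Demo state: counts how many modelled releases have run. */
static int released_count;

static void release_file(struct deferred_work *w)
{
    (void)w;
    released_count++;
}

/*
 * Demo: a modelled close() defers the release, and the release has
 * run by the time the modelled task is back in user mode.
 * Returns 1 if that ordering held, 0 otherwise.
 */
int demo_release_before_user_mode(void)
{
    struct task_model task = { NULL };
    struct deferred_work work;

    released_count = 0;
    defer_work(&task, &work, release_file);   /* inside the "syscall" */
    int still_pending = (released_count == 0);
    int ran = return_to_user(&task);          /* on the way out */

    return still_pending && ran == 1 && released_count == 1 &&
           task.pending == NULL;
}
```

The list here is last-in-first-out purely to keep the sketch short;
the ordering of pending work is not something the message specifies.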

                     Linus


