linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Luis R. Rodriguez" <mcgrof@kernel.org>
To: Theodore Ts'o <tytso@mit.edu>, colyli@suse.com
Cc: "Jan Kara" <jack@suse.cz>, "Aleksa Sarai" <asarai@suse.de>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	"Dave Chinner" <david@fromorbit.com>,
	"Михаил Гаврилов" <mikhail.v.gavrilov@gmail.com>,
	"Christoph Hellwig" <hch@infradead.org>,
	"Jan Blunck" <jblunck@infradead.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"Oscar Salvador" <osalvador@suse.com>,
	"Hannes Reinecke" <hare@suse.de>, xfs <linux-xfs@vger.kernel.org>,
	"Luis R. Rodriguez" <mcgrof@kernel.org>
Subject: Re: kernel BUG at fs/xfs/xfs_aops.c:853! in kernel 4.13 rc6
Date: Mon, 6 Nov 2017 11:25:34 -0800	[thread overview]
Message-ID: <CAB=NE6UK3463JfiZQFHUiMj=v6HDG0k+uEE-2OvRMsW7i1EMhA@mail.gmail.com> (raw)
In-Reply-To: <20171017141233.l3avshagrv7fr7xt@thunk.org>

On Tue, Oct 17, 2017 at 7:12 AM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Tue, Oct 17, 2017 at 11:20:17AM +0200, Jan Kara wrote:
>> The operation we are speaking about here is different. It is more along the
>> lines of "release this device".  And in the current world of containers,
>> mount namespaces, etc. it is not trivial for userspace to implement this
>> using umount(2) as Ted points out. I believe we could do that by walking
>> through all mount points of a superblock and unmounting them (and I don't
>> want to get into a discussion how to efficiently implement that now but in
>> principle the kernel has all the necessary information).
>
> Yes, this is what I want.  And regardless of how efficiently or not
> the kernel can implement such an operatoin, by definition it will be
> more efficient than if we have to do it in userspace.

It seems most folks agree we could all benefit from this, to help
userspace with a sane implementation.

>> What I'm a bit concerned about is the "release device reference" part - for
>> a block device to stop looking busy we have to do that however then the
>> block device can go away and the filesystem isn't prepared to that - we
>> reference sb->s_bdev in lots of places, we have buffer heads which are part
>> of bdev page cache, and probably other indirect assumptions I forgot about
>> now.

Is this new operation really the only place where such type of work
could be useful for, or are there existing uses cases this sort of
functionality could also be used for?

For instance I don't think we do something similar to revokefs(2) (as
described below) when a devices has been removed from a system, you
seem to suggest we remove the dev from gendisk leaving it dangling and
invisible. But other than this, it would seem its up to the filesystem
to get anything else implemented correctly?

> This all doesn't have to be a single system call.  Perhaps it would
> make sense for first and second step to be one system call --- call it
> revokefs(2), perhaps.  And then the last step could be another system
> call --- maybe umountall(2).

Wouldn't *some* part of this also help *enhance* filesystem suspend /
thaw be used on system suspend / resume as well?

If I may, if we split these up, into two, say revokefs(2) and
umountall(2), how about:

a) revokefs(2): ensures all file descriptors for the fs are closed
   - blocks access attempts high up in VFS
   - point any file descriptor to a revoked null struct file
   - redirect any task struct CWD's so as if the directory had rmmdir'd
   - munmap any mapped regions

Of these only the first one seems useful for fs suspend?

b) umountall(2): properly unmounts filesystem from all namespaces
   - May need to verify if revokefs(2) was called, if so, now that all
file descriptors should
     be closed, do syncfs() to force out any dirty pages
   - unmount() in all namespaces, this takes care of any buffer or page
     cache reference once the ref count of the struct super block goes to
     to zero

  Luis

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-11-06 19:25 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CABXGCsOL+_OgC0dpO1+Zeg=iu7ryZRZT4S7k-io8EGB0ZRgZGw@mail.gmail.com>
2017-09-03  7:43 ` kernel BUG at fs/xfs/xfs_aops.c:853! in kernel 4.13 rc6 Christoph Hellwig
2017-09-03 14:08   ` Михаил Гаврилов
2017-09-04 12:30     ` Jan Kara
2017-10-07  8:10       ` Михаил Гаврилов
2017-10-07  9:22         ` Михаил Гаврилов
2017-10-09  0:05         ` Dave Chinner
2017-10-09 18:31           ` Luis R. Rodriguez
2017-10-09 19:02             ` Eric W. Biederman
2017-10-15  8:53               ` Aleksa Sarai
2017-10-15 13:06                 ` Theodore Ts'o
2017-10-15 22:14                   ` Eric W. Biederman
2017-10-15 23:22                     ` Dave Chinner
2017-10-16 17:44                       ` Eric W. Biederman
2017-10-16 21:38                         ` Dave Chinner
2017-10-16  1:13                     ` Theodore Ts'o
2017-10-16 17:53                       ` Eric W. Biederman
2017-10-16 18:50                         ` Theodore Ts'o
2017-10-16 22:00                       ` Dave Chinner
2017-10-17  1:34                         ` Theodore Ts'o
2017-10-17  0:59                       ` Aleksa Sarai
2017-10-17  9:20                         ` Jan Kara
2017-10-17 14:12                           ` Theodore Ts'o
2017-11-06 19:25                             ` Luis R. Rodriguez [this message]
2017-11-07 15:26                               ` Jan Kara
2017-10-09 22:28             ` Dave Chinner
2017-10-10  7:57               ` Jan Kara
2017-09-04  1:43   ` Dave Chinner
2017-09-04  2:20     ` Darrick J. Wong
2017-09-04 12:14       ` Jan Kara
2017-09-04 22:36         ` Dave Chinner
2017-09-05 16:17           ` Jan Kara
2017-09-05 23:42             ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAB=NE6UK3463JfiZQFHUiMj=v6HDG0k+uEE-2OvRMsW7i1EMhA@mail.gmail.com' \
    --to=mcgrof@kernel.org \
    --cc=asarai@suse.de \
    --cc=colyli@suse.com \
    --cc=david@fromorbit.com \
    --cc=ebiederm@xmission.com \
    --cc=hare@suse.de \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=jblunck@infradead.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=mikhail.v.gavrilov@gmail.com \
    --cc=osalvador@suse.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).