All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Christoph Hellwig <hch@lst.de>
Cc: Christian Brauner <brauner@kernel.org>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	Reinette Chatre <reinette.chatre@intel.com>,
	Miquel Raynal <miquel.raynal@bootlin.com>,
	Richard Weinberger <richard@nod.at>,
	Vignesh Raghavendra <vigneshr@ti.com>,
	Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>,
	Tejun Heo <tj@kernel.org>,
	Trond Myklebust <trond.myklebust@hammerspace.com>,
	Anna Schumaker <anna@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Damien Le Moal <dlemoal@kernel.org>,
	Naohiro Aota <naohiro.aota@wdc.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-nfs@vger.kernel.org, linux-hardening@vger.kernel.org,
	cgroups@vger.kernel.org
Subject: Re: [PATCH 11/19] fs: add new shutdown_sb and free_sb methods
Date: Thu, 14 Sep 2023 03:07:28 +0100	[thread overview]
Message-ID: <20230914020728.GF800259@ZenIV> (raw)
In-Reply-To: <20230913111013.77623-12-hch@lst.de>

On Wed, Sep 13, 2023 at 08:10:05AM -0300, Christoph Hellwig wrote:
> Currently super_blocks are shut down using the ->kill_sb method, which
> must call generic_shutdown_super, but allows the file system to
> add extra work before or after the call to generic_shutdown_super.
> 
> File systems tend to get rather confused by this, so add an alternative
> shutdown sequence where generic_shutdown_super is called by the core
> code, and there are extra ->shutdown_sb and ->free_sb hooks before and
> after it.  To remove the amount of boilerplate code ->shutdown_sb is only
> called if the super has finished initialization and ->d_root is set.

The last sentence doesn't match the patchset.  That aside, there
is an issue with method names.

->shutdown_sb() is... odd.  ->begin_shutdown_sb(), perhaps?  For the
majority of filesystems it's NULL, after all...

Worse, ->free_sb() is seriously misguiding - the name implies that
we are, well, freeing a superblock passed to it.  Which is not what is
happening here - superblock itself is freed only when all passive
references go away.  It's asking for trouble down the road.

We already have more than enough confusion in the area.  Note, BTW,
that there's a delicate issue around RCU accesses and freeing stuff -
->d_compare() can bloody well be called when superblock is getting
shut down.  For anything that might be needed by it (or by other
RCU'd methods) we must arrange for RCU-delayed destruction.
E.g. in case of fatfs we have sbi freeing done via call_rcu() (from
fat_put_super(), called by generic_shutdown_super()).

<checks the current tree>

Oh, bugger...  AFAICS, exfat has a problem - exfat_free_sbi() is called
directly from exfat_kill_sb(), without any concern for this:
static int exfat_utf8_d_cmp(const struct dentry *dentry, unsigned int len,
                const char *str, const struct qstr *name)
{
        struct super_block *sb = dentry->d_sb;
        unsigned int alen = exfat_striptail_len(name->len, name->name,
                                EXFAT_SB(sb)->options.keep_last_dots);

That kfree() needs to be RCU-delayed...  While we are at it, there's
this:
static int exfat_d_hash(const struct dentry *dentry, struct qstr *qstr)
{
        struct super_block *sb = dentry->d_sb;
        struct nls_table *t = EXFAT_SB(sb)->nls_io;
and we need this
        unload_nls(sbi->nls_io);
in exfat_put_super() RCU-delayed as well.  And I suspect that
        exfat_free_upcase_table(sbi);
right after it needs the same treatment.

AFFS: similar problem, wants ->s_fs_info freeing RCU-delayed.

hfsplus: similar, including non-delayed unlock_nls() calls.

ntfs3:
        /*
         * Try slow way with current upcase table
         */
        sbi = dentry->d_sb->s_fs_info;
        uni1 = __getname();
        if (!uni1)
                return -ENOMEM;
__getname().  "Give me a page and you might block, while you are
at it".  Done from ->d_compare().  Called under dentry->d_lock
and rcu_read_lock().  OK, any further investigation of that
one is... probably not worth bothering with at that point.

Other in-tree instances appear to be correct.  I'll push fixes for
those (well, ntfs3 aside) out tomorrow.

WARNING: multiple messages have this Message-ID (diff)
From: Al Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
To: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Cc: Christian Brauner
	<brauner-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Heiko Carstens <hca-tEXmvtCZX7AybS5Ee8rs3A@public.gmane.org>,
	Vasily Gorbik <gor-tEXmvtCZX7AybS5Ee8rs3A@public.gmane.org>,
	Alexander Gordeev
	<agordeev-tEXmvtCZX7AybS5Ee8rs3A@public.gmane.org>,
	Fenghua Yu <fenghua.yu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Reinette Chatre
	<reinette.chatre-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Miquel Raynal
	<miquel.raynal-LDxbnhwyfcJBDgjK7y7TUQ@public.gmane.org>,
	Richard Weinberger <richard-/L3Ra7n9ekc@public.gmane.org>,
	Vignesh Raghavendra <vigneshr-l0cyMroinI0@public.gmane.org>,
	Dennis Dalessandro
	<dennis.dalessandro-ntyVByD3zXaTtA8H5PvdGFaTQe2KTcn/@public.gmane.org>,
	Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Trond Myklebust
	<trond.myklebust-F/q8l9xzQnoyLce1RVWEUA@public.gmane.org>,
	Anna Schumaker <anna-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	Damien Le Moal <dlemoal-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Naohiro Aota <naohiro.aota-Sjgp3cTcYWE@public.gmane.org>,
	Greg Kroah-Hartman
	<gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org>,
	linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-s390-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-hardening-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	cgroups-u79uwXL29TZNg+MwTxZMZA@public.gmane.org
Subject: Re: [PATCH 11/19] fs: add new shutdown_sb and free_sb methods
Date: Thu, 14 Sep 2023 03:07:28 +0100	[thread overview]
Message-ID: <20230914020728.GF800259@ZenIV> (raw)
In-Reply-To: <20230913111013.77623-12-hch-jcswGhMUV9g@public.gmane.org>

On Wed, Sep 13, 2023 at 08:10:05AM -0300, Christoph Hellwig wrote:
> Currently super_blocks are shut down using the ->kill_sb method, which
> must call generic_shutdown_super, but allows the file system to
> add extra work before or after the call to generic_shutdown_super.
> 
> File systems tend to get rather confused by this, so add an alternative
> shutdown sequence where generic_shutdown_super is called by the core
> code, and there are extra ->shutdown_sb and ->free_sb hooks before and
> after it.  To remove the amount of boilerplate code ->shutdown_sb is only
> called if the super has finished initialization and ->d_root is set.

The last sentence doesn't match the patchset.  That aside, there
is an issue with method names.

->shutdown_sb() is... odd.  ->begin_shutdown_sb(), perhaps?  For the
majority of filesystems it's NULL, after all...

Worse, ->free_sb() is seriously misguiding - the name implies that
we are, well, freeing a superblock passed to it.  Which is not what is
happening here - superblock itself is freed only when all passive
references go away.  It's asking for trouble down the road.

We already have more than enough confusion in the area.  Note, BTW,
that there's a delicate issue around RCU accesses and freeing stuff -
->d_compare() can bloody well be called when superblock is getting
shut down.  For anything that might be needed by it (or by other
RCU'd methods) we must arrange for RCU-delayed destruction.
E.g. in case of fatfs we have sbi freeing done via call_rcu() (from
fat_put_super(), called by generic_shutdown_super()).

<checks the current tree>

Oh, bugger...  AFAICS, exfat has a problem - exfat_free_sbi() is called
directly from exfat_kill_sb(), without any concern for this:
static int exfat_utf8_d_cmp(const struct dentry *dentry, unsigned int len,
                const char *str, const struct qstr *name)
{
        struct super_block *sb = dentry->d_sb;
        unsigned int alen = exfat_striptail_len(name->len, name->name,
                                EXFAT_SB(sb)->options.keep_last_dots);

That kfree() needs to be RCU-delayed...  While we are at it, there's
this:
static int exfat_d_hash(const struct dentry *dentry, struct qstr *qstr)
{
        struct super_block *sb = dentry->d_sb;
        struct nls_table *t = EXFAT_SB(sb)->nls_io;
and we need this
        unload_nls(sbi->nls_io);
in exfat_put_super() RCU-delayed as well.  And I suspect that
        exfat_free_upcase_table(sbi);
right after it needs the same treatment.

AFFS: similar problem, wants ->s_fs_info freeing RCU-delayed.

hfsplus: similar, including non-delayed unlock_nls() calls.

ntfs3:
        /*
         * Try slow way with current upcase table
         */
        sbi = dentry->d_sb->s_fs_info;
        uni1 = __getname();
        if (!uni1)
                return -ENOMEM;
__getname().  "Give me a page and you might block, while you are
at it".  Done from ->d_compare().  Called under dentry->d_lock
and rcu_read_lock().  OK, any further investigation of that
one is... probably not worth bothering with at that point.

Other in-tree instances appear to be correct.  I'll push fixes for
those (well, ntfs3 aside) out tomorrow.


  reply	other threads:[~2023-09-14  2:09 UTC|newest]

Thread overview: 106+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-13 11:09 split up ->kill_sb Christoph Hellwig
2023-09-13 11:09 ` Christoph Hellwig
2023-09-13 11:09 ` [PATCH 01/19] fs: reflow deactivate_locked_super Christoph Hellwig
2023-09-13 11:09   ` Christoph Hellwig
2023-09-13 16:35   ` Christian Brauner
2023-09-13 16:35     ` Christian Brauner
2023-09-26  9:24     ` Christoph Hellwig
2023-09-26  9:24       ` Christoph Hellwig
2023-09-13 11:09 ` [PATCH 02/19] fs: make ->kill_sb optional Christoph Hellwig
2023-09-13 11:09   ` Christoph Hellwig
2023-09-13 11:09 ` [PATCH 03/19] fs: release anon dev_t in deactivate_locked_super Christoph Hellwig
2023-09-13 11:09   ` Christoph Hellwig
2023-09-13 23:27   ` Al Viro
2023-09-13 23:27     ` Al Viro
2023-09-14  2:37     ` Al Viro
2023-09-14  2:37       ` Al Viro
2023-09-14  5:38       ` Al Viro
2023-09-14  5:38         ` Al Viro
2023-09-14  7:56         ` Christian Brauner
2023-09-14  7:56           ` Christian Brauner
2023-09-26  9:31           ` Christoph Hellwig
2023-09-26  9:31             ` Christoph Hellwig
2023-09-14 14:02         ` Christian Brauner
2023-09-14 14:02           ` Christian Brauner
2023-09-14 16:58           ` Al Viro
2023-09-14 16:58             ` Al Viro
2023-09-14 19:23             ` Al Viro
2023-09-14 19:23               ` Al Viro
2023-09-15  7:40               ` Christian Brauner
2023-09-15  7:40                 ` Christian Brauner
2023-09-15  9:44             ` Christian Brauner
2023-09-15  9:44               ` Christian Brauner
2023-09-15 14:12               ` Christian Brauner
2023-09-15 14:12                 ` Christian Brauner
2023-09-15 14:28                 ` Al Viro
2023-09-15 14:28                   ` Al Viro
2023-09-15 14:33                   ` Al Viro
2023-09-15 14:33                     ` Al Viro
2023-09-15 14:40                   ` Christian Brauner
2023-09-15 14:40                     ` Christian Brauner
2023-09-26  9:41         ` Christoph Hellwig
2023-09-26  9:41           ` Christoph Hellwig
2023-09-26  9:38     ` Christoph Hellwig
2023-09-26  9:38       ` Christoph Hellwig
2023-09-26 21:25       ` Al Viro
2023-09-27 22:29         ` Al Viro
2023-10-02  6:46         ` Christoph Hellwig
2023-10-09 21:57           ` Al Viro
2023-10-10  8:44             ` Christian Brauner
2023-10-17 19:50               ` Al Viro
2023-09-13 11:09 ` [PATCH 04/19] NFS: remove the s_dev field from struct nfs_server Christoph Hellwig
2023-09-13 11:09   ` Christoph Hellwig
2023-09-13 11:09 ` [PATCH 05/19] fs: assign an anon dev_t in common code Christoph Hellwig
2023-09-13 11:09   ` Christoph Hellwig
2023-09-14  0:34   ` Al Viro
2023-09-14  0:34     ` Al Viro
2023-09-13 11:10 ` [PATCH 06/19] qibfs: use simple_release_fs Christoph Hellwig
2023-09-13 11:10   ` Christoph Hellwig
2023-09-18 11:41   ` Leon Romanovsky
2023-09-18 11:41     ` Leon Romanovsky
2023-09-13 11:10 ` [PATCH 07/19] hypfs: use d_genocide to kill fs entries Christoph Hellwig
2023-09-13 11:10   ` Christoph Hellwig
2023-09-13 11:10 ` [PATCH 08/19] pstore: shrink the pstore_sb_lock critical section in pstore_kill_sb Christoph Hellwig
2023-09-13 11:10   ` Christoph Hellwig
2023-09-13 22:07   ` Kees Cook
2023-09-13 22:07     ` Kees Cook
2023-09-13 11:10 ` [PATCH 09/19] zonefs: remove duplicate cleanup in zonefs_fill_super Christoph Hellwig
2023-09-13 11:10   ` Christoph Hellwig
2023-09-14  0:33   ` Damien Le Moal
2023-09-14  0:33     ` Damien Le Moal
2023-09-14  0:49   ` Al Viro
2023-09-14  0:49     ` Al Viro
2023-09-13 11:10 ` [PATCH 10/19] USB: gadget/legacy: remove sb_mutex Christoph Hellwig
2023-09-13 11:10   ` Christoph Hellwig
2023-09-13 16:10   ` Alan Stern
2023-09-13 16:10     ` Alan Stern
2023-09-26  9:24     ` Christoph Hellwig
2023-09-26  9:24       ` Christoph Hellwig
2023-09-14 10:22   ` Sergey Shtylyov
2023-09-14 10:22     ` Sergey Shtylyov
2023-09-13 11:10 ` [PATCH 11/19] fs: add new shutdown_sb and free_sb methods Christoph Hellwig
2023-09-13 11:10   ` Christoph Hellwig
2023-09-14  2:07   ` Al Viro [this message]
2023-09-14  2:07     ` Al Viro
2023-09-13 11:10 ` [PATCH 12/19] fs: convert kill_litter_super to litter_shutdown_sb Christoph Hellwig
2023-09-13 11:10   ` Christoph Hellwig
2023-09-13 22:07   ` Kees Cook
2023-09-13 22:07     ` Kees Cook
2023-09-13 11:10 ` [PATCH 13/19] fs: convert kill_block_super to block_free_sb Christoph Hellwig
2023-09-13 11:10   ` Christoph Hellwig
2023-09-14  2:29   ` Al Viro
2023-09-14  2:29     ` Al Viro
2023-09-13 11:10 ` [PATCH 14/19] jffs2: convert to ->shutdown_sb and ->free_sb Christoph Hellwig
2023-09-13 11:10   ` Christoph Hellwig
2023-09-13 11:10 ` [PATCH 15/19] kernfs: split ->kill_sb Christoph Hellwig
2023-09-13 11:10   ` Christoph Hellwig
2023-09-18 15:24   ` Michal Koutný
2023-09-18 15:24     ` Michal Koutný
2023-09-13 11:10 ` [PATCH 16/19] x86/resctrl: release rdtgroup_mutex and the CPU hotplug lock in rdt_shutdown_sb Christoph Hellwig
2023-09-13 11:10   ` Christoph Hellwig
2023-09-13 11:10 ` [PATCH 17/19] NFS: move nfs_kill_super to fs_context.c Christoph Hellwig
2023-09-13 11:10   ` Christoph Hellwig
2023-09-13 11:10 ` [PATCH 18/19] fs: simple ->shutdown_sb and ->free_sb conversions Christoph Hellwig
2023-09-13 11:10   ` Christoph Hellwig
2023-09-13 11:10 ` [PATCH 19/19] fs: remove ->kill_sb Christoph Hellwig
2023-09-13 11:10   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230914020728.GF800259@ZenIV \
    --to=viro@zeniv.linux.org.uk \
    --cc=agordeev@linux.ibm.com \
    --cc=anna@kernel.org \
    --cc=brauner@kernel.org \
    --cc=cgroups@vger.kernel.org \
    --cc=dennis.dalessandro@cornelisnetworks.com \
    --cc=dlemoal@kernel.org \
    --cc=fenghua.yu@intel.com \
    --cc=gor@linux.ibm.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hca@linux.ibm.com \
    --cc=hch@lst.de \
    --cc=keescook@chromium.org \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=miquel.raynal@bootlin.com \
    --cc=naohiro.aota@wdc.com \
    --cc=reinette.chatre@intel.com \
    --cc=richard@nod.at \
    --cc=tj@kernel.org \
    --cc=trond.myklebust@hammerspace.com \
    --cc=vigneshr@ti.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.