All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Tejun Heo <tj@kernel.org>
Cc: Imran Khan <imran.f.khan@oracle.com>,
	gregkh@linuxfoundation.org, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RESEND PATCH v7 7/8] kernfs: Replace per-fs rwsem with hashed rwsems.
Date: Tue, 22 Mar 2022 02:40:54 +0000	[thread overview]
Message-ID: <Yjk3Nqft/U6vDvd1@zeniv-ca.linux.org.uk> (raw)
In-Reply-To: <YjjP5ldCCGYqD+UV@slm.duckdns.org>

On Mon, Mar 21, 2022 at 09:20:06AM -1000, Tejun Heo wrote:
> Hello,
> 
> On Mon, Mar 21, 2022 at 05:55:53PM +0000, Al Viro wrote:
> > Why bother with rwsem, when we don't need anything blocking under it?
> > DEFINE_RWLOCK instead of DEFINE_SPINLOCK and don't make it static.
> 
> Oh I mean, in case the common readers get way too hot, percpu_rwsem is a
> relatively easy way to shift the burder from the readers to the writers. I
> doubt we'll need that.
> 
> > kernfs_walk_ns() - this is fucking insane; on the surface, it needs to
> > be exclusive due to the use of the same static buffer.  It uses that
> > buffer to generate a pathname, *THEN* walks over it with strsep().
> > That's an... interesting approach, for the lack of other printable
> > terms - we walk the chain of ancestors, concatenating their names
> > into a buffer and separating those names with slashes, then we walk
> > that buffer, searching for slashes...  WTF?
> 
> It takes the @parent to walk string @path from. Where does it generate the
> pathname?

Sorry, misread that thing - the reason it copies the damn thing at all is
the use of strsep().  Yecch...  Rule of the thumb regarding strsep() use,
be it in kernel or in the userland: don't.  It's almost never the right
primitive to use.

Lookups should use qstr; it has both the length and place for hash.
Switch kernfs_find_ns() to that (and lift the calculation of length
into the callers that do not have it - note that kernfs_iop_lookup()
does) and you don't need the strsep() shite (or copying) anymore.

That would allow for kernfs_walk_ns() to take kernfs_rename_lock shared.

HOWEVER, that's not the only lock needed there and this patchset is
broken in that respect - it locks the starting node, then walks the
path.  Complete with lookups in rbtrees of children in the descendents
of that node and those are *not* locked.

> > Wait a sec; what happens if e.g. kernfs_path_from_node() races with
> > __kernfs_remove()?  We do _not_ clear ->parent, but we do drop references
> > that used to pin what it used to point to, unless I'm misreading that
> > code...  Or is it somehow prevented by drain-related logics?  Seeing
> > that it seems to be possible to have kernfs_path_from_node() called from
> > an interrupt context, that could be delicate...
> 
> kernfs_remove() is akin to freeing of the node and all its descendants. The
> caller shouldn't be racing that against any other operations in the subtree.

That's interesting...  My impression had been that some of these functions
could be called from interrupt contexts (judging by the spin_lock_irqsave()
in there).  What kind of async contexts those are, and what do you use to
make sure they don't leak into overlap with kernfs_remove()?

In particular, if you ever use those from RCU callbacks, you need to make
sure that you have a grace period somewhere; the wait_event() you have in
kernfs_drain() won't do it.

I've tried to track the callchains that could lead there, but it gets
hairy fast.

  reply	other threads:[~2022-03-22  2:41 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-17  7:26 [RESEND PATCH v7 0/8] kernfs: Introduce interface to access global kernfs_open_file_mutex Imran Khan
2022-03-17  7:26 ` [RESEND PATCH v7 1/8] " Imran Khan
2022-03-17 21:34   ` Al Viro
2022-04-05  5:36     ` Imran Khan
2022-04-05 14:24       ` Al Viro
2022-04-06  4:54         ` Imran Khan
2022-04-06 14:54           ` Al Viro
2022-04-06 15:18             ` Tejun Heo
2022-04-14  0:01           ` Imran Khan
2022-03-18 17:10   ` Eric W. Biederman
2022-03-21  0:10     ` Imran Khan
2022-03-17  7:26 ` [RESEND PATCH v7 2/8] kernfs: Replace global kernfs_open_file_mutex with hashed mutexes Imran Khan
2022-03-17  7:26 ` [RESEND PATCH v7 3/8] kernfs: Introduce interface to access kernfs_open_node_lock Imran Khan
2022-03-17  7:26 ` [RESEND PATCH v7 4/8] kernfs: Replace global kernfs_open_node_lock with hashed spinlocks Imran Khan
2022-03-17  7:26 ` [RESEND PATCH v7 5/8] kernfs: Use a per-fs rwsem to protect per-fs list of kernfs_super_info Imran Khan
2022-03-17  7:26 ` [RESEND PATCH v7 6/8] kernfs: Introduce interface to access per-fs rwsem Imran Khan
2022-03-17  7:26 ` [RESEND PATCH v7 7/8] kernfs: Replace per-fs rwsem with hashed rwsems Imran Khan
2022-03-18  0:07   ` Al Viro
2022-03-21  1:57     ` Imran Khan
2022-03-21  7:29       ` Al Viro
2022-03-21 16:46         ` Tejun Heo
2022-03-21 17:55           ` Al Viro
2022-03-21 19:20             ` Tejun Heo
2022-03-22  2:40               ` Al Viro [this message]
2022-03-22 17:08                 ` Tejun Heo
2022-03-22 20:26                   ` Al Viro
2022-03-22 21:20                     ` Tejun Heo
2022-03-28  0:15                 ` Imran Khan
2022-03-28 17:30                   ` Tejun Heo
2022-03-30  2:23                 ` Imran Khan
2022-03-17  7:26 ` [RESEND PATCH v7 8/8] kernfs: Add a document to describe hashed locks used in kernfs Imran Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yjk3Nqft/U6vDvd1@zeniv-ca.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=akpm@linux-foundation.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=imran.f.khan@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.