linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Steven Rostedt <rostedt@goodmis.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Linux Trace Devel <linux-trace-devel@vger.kernel.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Christian Brauner <brauner@kernel.org>,
	Ajay Kaher <ajay.kaher@broadcom.com>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH] eventfs: Have inodes have unique inode numbers
Date: Fri, 26 Jan 2024 17:14:12 -0500	[thread overview]
Message-ID: <8547159a-0b28-4d75-af02-47fc450785fa@efficios.com> (raw)
In-Reply-To: <CAHk-=whNfNti-mn6vhL-v-WZnn0i7ZAbwSf_wNULJeyanhPOgg@mail.gmail.com>

On 2024-01-26 16:49, Linus Torvalds wrote:
> On Fri, 26 Jan 2024 at 13:36, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
[...]
> So please try to look at things to *fix* and simplify, not at things
> to mess around with and make more complicated.

Hi Linus,

I'm all aboard with making things as simple as possible and
making sure no complexity is added for the sake of micro-optimization
of slow-paths.

I do however have a concern with the approach of using the same
inode number for various files on the same filesystem: AFAIU it
breaks userspace ABI expectations. See inode(7) for instance:

        Inode number
               stat.st_ino; statx.stx_ino

               Each  file in a filesystem has a unique inode number.  Inode numbers
               are guaranteed to be unique only within a filesystem (i.e., the same
               inode  numbers  may  be  used by different filesystems, which is the
               reason that hard links may not cross filesystem  boundaries).   This
               field contains the file's inode number.

So user-space expecting inode numbers to be unique within a filesystem
is not "legacy" in any way. Userspace is allowed to expect this from the
ABI.

I think that a safe approach to prevent ABI regressions, and just to prevent
adding more ABI-corner cases that userspace will have to work-around, would
be to issue unique numbers to files within eventfs, but in the
simplest/obviously correct implementation possible. It is, after all, a slow
path.

The issue with the atomic_add_return without any kinds of checks is the
scenarios of a userspace loop that would create/delete directories endlessly,
thus causing inode re-use. This approach is simple, but it's unfortunately
not obviously correct. Because eventfs allows userspace to do mkdir/rmdir,
this is unfortunately possible. It would be OK if only the kernel had control
over directory creation/removal, but it's not the case here.

I would suggest this straightforward solution to this:

a) define a EVENTFS_MAX_INODES (e.g. 4096 * 8),

b) keep track of inode allocation in a bitmap (within a single page),

c) disallow allocating more than "EVENTFS_MAX_INODES" in eventfs.

This way even the mkdir/rmdir loop will work fine, but it will prevent
keeping too many inodes alive at any given time. The cost is a single
page (4K) per eventfs instance.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


  parent reply	other threads:[~2024-01-26 22:14 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-26 20:02 [PATCH] eventfs: Have inodes have unique inode numbers Steven Rostedt
2024-01-26 20:25 ` Linus Torvalds
2024-01-26 21:26   ` Steven Rostedt
2024-01-26 21:31     ` Linus Torvalds
2024-01-26 21:43       ` Steven Rostedt
2024-01-26 21:36     ` Linus Torvalds
2024-01-26 21:42       ` Steven Rostedt
2024-01-26 21:49       ` Linus Torvalds
2024-01-26 22:08         ` Steven Rostedt
2024-01-26 22:26           ` Linus Torvalds
2024-01-27 14:47             ` Steven Rostedt
2024-01-28 14:42               ` Steven Rostedt
2024-01-26 22:14         ` Mathieu Desnoyers [this message]
2024-01-26 22:29           ` Linus Torvalds
2024-01-26 22:41             ` Mathieu Desnoyers
2024-01-26 22:49               ` Linus Torvalds
2024-01-29 16:00                 ` Mathieu Desnoyers
2024-01-29 18:58                   ` Linus Torvalds
2024-01-26 22:34           ` Matthew Wilcox
2024-01-26 22:40             ` Mathieu Desnoyers
2024-01-26 22:48             ` Linus Torvalds
2024-01-26 23:04               ` Matthew Wilcox
2024-01-26 23:11                 ` Linus Torvalds
2024-01-26 23:17                   ` Linus Torvalds
2024-01-27  9:36                     ` Andreas Schwab
2024-01-27 21:47         ` Linus Torvalds
2024-01-28 20:15           ` Steven Rostedt
2024-01-28 20:53             ` Linus Torvalds
2024-01-28 21:08               ` Linus Torvalds
2024-01-28 22:01                 ` Steven Rostedt
2024-01-28 22:17                   ` Linus Torvalds
2024-01-28 22:26                     ` Steven Rostedt
2024-01-28 21:11               ` Steven Rostedt
2024-01-28 21:19               ` Steven Rostedt
2024-01-28 21:43                 ` Linus Torvalds
2024-01-28 22:07                   ` Linus Torvalds
2024-01-28 22:17                     ` Steven Rostedt
2024-01-28 22:25                       ` Linus Torvalds
2024-01-28 22:51           ` Steven Rostedt
2024-01-28 23:24             ` Linus Torvalds
2024-01-28 23:59               ` Steven Rostedt
2024-01-29  0:21                 ` Steven Rostedt
2024-01-29  1:00                   ` Linus Torvalds
2024-01-29  1:42                     ` Linus Torvalds
2024-01-29  2:32                       ` Steven Rostedt
2024-01-29  3:40                         ` Steven Rostedt
2024-01-29  4:01                           ` Linus Torvalds
2024-01-29  2:09                     ` Steven Rostedt
2024-01-29  6:44                       ` Amir Goldstein
2024-01-29  9:32                         ` Steven Rostedt
2024-01-27 15:26       ` David Laight
2024-01-27 20:01         ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8547159a-0b28-4d75-af02-47fc450785fa@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=ajay.kaher@broadcom.com \
    --cc=brauner@kernel.org \
    --cc=geert@linux-m68k.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-devel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).