linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: John Groves <John@groves.net>
To: Amir Goldstein <amir73il@gmail.com>
Cc: John Groves <jgroves@micron.com>,
	Jonathan Corbet <corbet@lwn.net>,
	 Dan Williams <dan.j.williams@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	 Dave Jiang <dave.jiang@intel.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	 Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	Matthew Wilcox <willy@infradead.org>,
	 linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-doc@vger.kernel.org,  linux-kernel@vger.kernel.org,
	nvdimm@lists.linux.dev, john@jagalactic.com,
	 Dave Chinner <david@fromorbit.com>,
	Christoph Hellwig <hch@infradead.org>,
	 dave.hansen@linux.intel.com, gregory.price@memverge.com,
	Miklos Szeredi <miklos@szeredi.hu>,
	 Vivek Goyal <vgoyal@redhat.com>
Subject: Re: [RFC PATCH 00/20] Introduce the famfs shared-memory file system
Date: Thu, 29 Feb 2024 16:16:33 -0600	[thread overview]
Message-ID: <3jwluwrqj6rwsxdsksfvdeo5uccgmnkh7rgefaeyxf2gu75344@ybhwncywkftx> (raw)
In-Reply-To: <CAOQ4uxiPc5ciD_zm3jp5sVQaP4ndb40mApw5hx2DL+8BZNd==A@mail.gmail.com>

On 24/02/29 08:52AM, Amir Goldstein wrote:
> On Fri, Feb 23, 2024 at 7:42 PM John Groves <John@groves.net> wrote:
> >
> > This patch set introduces famfs[1] - a special-purpose fs-dax file system
> > for sharable disaggregated or fabric-attached memory (FAM). Famfs is not
> > CXL-specific in anyway way.
> >
> > * Famfs creates a simple access method for storing and sharing data in
> >   sharable memory. The memory is exposed and accessed as memory-mappable
> >   dax files.
> > * Famfs supports multiple hosts mounting the same file system from the
> >   same memory (something existing fs-dax file systems don't do).
> > * A famfs file system can be created on either a /dev/pmem device in fs-dax
> >   mode, or a /dev/dax device in devdax mode (the latter depending on
> >   patches 2-6 of this series).
> >
> > The famfs kernel file system is part the famfs framework; additional
> > components in user space[2] handle metadata and direct the famfs kernel
> > module to instantiate files that map to specific memory. The famfs user
> > space has documentation and a reasonably thorough test suite.
> >
> 
> So can we say that Famfs is Fuse specialized for DAX?
> 
> I am asking because you seem to have asked it first:
> https://lore.kernel.org/linux-fsdevel/0100018b2439ebf3-a442db6f-f685-4bc4-b4b0-28dc333f6712-000000@email.amazonses.com/
> I guess that you did not get your answers to your questions before or at LPC?

Thanks for paying attention Amir. I think there is some validity to thinking
of famfs as Fuse for DAX. Administration / metadata originating in user space
is similar (but doing it this way also helps reduce RAS exposure to memory 
that might have a more complex connection path).

One way it differs from fuse is that famfs is very much aimed at use
cases that require performance. *Accessing* files must run at full
memory speeds.

> 
> I did not see your question back in October.
> Let me try to answer your questions and we can discuss later if a new dedicated
> kernel driver + userspace API is really needed, or if FUSE could be used as is
> extended for your needs.
> 
> You wrote:
> "...My naive reading of the existence of some sort of fuse/dax support
> for virtiofs
> suggested that there might be a way of doing this - but I may be wrong
> about that."
> 
> I'm not virtiofs expert, but I don't think that you are wrong about this.
> IIUC, virtiofsd could map arbitrary memory region to any fuse file mmaped
> by virtiofs client.
> 
> So what are the gaps between virtiofs and famfs that justify a new filesystem
> driver and new userspace API?

I have a lot of thoughts here, and an actual conversation might be good
sooner rather than later. I hope to be at LSFMM to discuss this - if you agree,
put in a vote for my topic ;). But if you want to talk sooner than that, I'm
interested.

I think one piece of evidence that this isn't possible with Fuse today is that
I had to plumb the iomap interface for /dev/dax in this patch set. That is the
way that fs-dax file systems communicate with the dax layer for fault 
resolution. If fuse/virtiofs handles dax somehow without the iomap interface,
I suspect it's doing something somehow simpler, /and/ that might need to get 
reconciled with the fs-dax methodology. Or maybe I don't know what I'm talking
about (in which case, please help :D).

I think one thing that might make sense would be to bring up this functionality
as a standalone file system, and then consider merging it into fuse when &
if the time seems right. 

Famfs doesn't currently have any up-calls. User space plays the log and tells
the kmod to instantiate files with extent lists to dax. Access happens with
zero user space involvement.

The important thing, the thing I'm currently paid for, is making it
practical to use disaggregated shared memory - it's ultimately not important 
which mechanism is used to enable a filesystem access method for memory.

But caching metadata in the kernel for efficient fault handling is the
only way to get it to perform at "memory speeds" so that appears critical.

One final observation: famfs has significantly more code in user space than
in kernel space, and it's the user side that is likely to grow over time.
That logic is at least theoretically independent of the kernel ABI.

> 
> Thanks,
> Amir.

Thanks!
John


      reply	other threads:[~2024-02-29 22:16 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-23 17:41 [RFC PATCH 00/20] Introduce the famfs shared-memory file system John Groves
2024-02-23 17:41 ` [RFC PATCH 01/20] famfs: Documentation John Groves
2024-02-23 17:41 ` [RFC PATCH 02/20] dev_dax_iomap: Add fs_dax_get() func to prepare dax for fs-dax usage John Groves
2024-02-26 12:05   ` Jonathan Cameron
2024-02-26 15:00     ` John Groves
2024-02-23 17:41 ` [RFC PATCH 03/20] dev_dax_iomap: Move dax_pgoff_to_phys from device.c to bus.c since both need it now John Groves
2024-02-26 12:10   ` Jonathan Cameron
2024-02-26 15:13     ` John Groves
2024-02-23 17:41 ` [RFC PATCH 04/20] dev_dax_iomap: Save the kva from memremap John Groves
2024-02-26 12:21   ` Jonathan Cameron
2024-02-26 15:48     ` John Groves
2024-02-23 17:41 ` [RFC PATCH 05/20] dev_dax_iomap: Add dax_operations for use by fs-dax on devdax John Groves
2024-02-26 12:32   ` Jonathan Cameron
2024-02-26 16:09     ` John Groves
2024-02-23 17:41 ` [RFC PATCH 06/20] dev_dax_iomap: Add CONFIG_DEV_DAX_IOMAP kernel build parameter John Groves
2024-02-26 12:34   ` Jonathan Cameron
2024-02-26 16:12     ` John Groves
2024-02-23 17:41 ` [RFC PATCH 07/20] famfs: Add include/linux/famfs_ioctl.h John Groves
2024-02-24  1:39   ` Randy Dunlap
2024-02-24  2:23     ` John Groves
2024-02-24  3:27       ` Randy Dunlap
2024-02-24 23:32         ` John Groves
2024-02-24 23:40           ` Randy Dunlap
2024-02-26 12:39   ` Jonathan Cameron
2024-02-26 16:44     ` John Groves
2024-02-26 16:56       ` Jonathan Cameron
2024-02-26 18:04         ` John Groves
2024-02-23 17:41 ` [RFC PATCH 08/20] famfs: Add famfs_internal.h John Groves
2024-02-26 12:48   ` Jonathan Cameron
2024-02-26 17:35     ` John Groves
2024-02-27 10:28       ` Jonathan Cameron
2024-02-28  1:06         ` John Groves
2024-02-27 13:38   ` Christian Brauner
2024-02-27 14:12     ` John Groves
2024-02-23 17:41 ` [RFC PATCH 09/20] famfs: Add super_operations John Groves
2024-02-26 12:51   ` Jonathan Cameron
2024-02-26 21:47     ` John Groves
2024-02-27 10:34       ` Jonathan Cameron
2024-02-27 17:48     ` John Groves
2024-02-23 17:41 ` [RFC PATCH 10/20] famfs: famfs_open_device() & dax_holder_operations John Groves
2024-02-26 12:56   ` Jonathan Cameron
2024-02-26 22:22     ` John Groves
2024-02-27 13:39   ` Christian Brauner
2024-02-27 18:38     ` John Groves
2024-02-23 17:41 ` [RFC PATCH 11/20] famfs: Add fs_context_operations John Groves
2024-02-26 13:20   ` Jonathan Cameron
2024-02-26 22:43     ` John Groves
2024-02-27 13:41   ` Christian Brauner
2024-02-28  0:59     ` John Groves
2024-02-28  1:49       ` Randy Dunlap
2024-02-28  8:17         ` Christian Brauner
2024-02-28 10:07       ` Christian Brauner
2024-02-28 12:01         ` Christian Brauner
2024-02-23 17:41 ` [RFC PATCH 12/20] famfs: Add inode_operations and file_system_type John Groves
2024-02-26 13:25   ` Jonathan Cameron
2024-02-26 22:53     ` John Groves
2024-02-23 17:41 ` [RFC PATCH 13/20] famfs: Add iomap_ops John Groves
2024-02-26 13:30   ` Jonathan Cameron
2024-02-26 23:00     ` John Groves
2024-02-23 17:41 ` [RFC PATCH 14/20] famfs: Add struct file_operations John Groves
2024-02-26 13:32   ` Jonathan Cameron
2024-02-26 23:09     ` John Groves
2024-02-23 17:41 ` [RFC PATCH 15/20] famfs: Add ioctl to file_operations John Groves
2024-02-26 13:44   ` Jonathan Cameron
2024-02-23 17:42 ` [RFC PATCH 16/20] famfs: Add fault counters John Groves
2024-02-23 18:23   ` Dave Hansen
2024-02-23 19:56     ` John Groves
2024-02-23 20:04       ` Dan Williams
2024-02-23 20:39         ` John Groves
2024-02-23 21:19           ` Dave Hansen
2024-02-23 23:50             ` Dan Williams
2024-02-24  3:59               ` Matthew Wilcox
2024-02-24  4:30                 ` Dan Williams
2024-02-23 17:42 ` [RFC PATCH 17/20] famfs: Add module stuff John Groves
2024-02-26 13:47   ` Jonathan Cameron
2024-02-27 22:15     ` John Groves
2024-02-23 17:42 ` [RFC PATCH 18/20] famfs: Support character dax via the dev_dax_iomap patch John Groves
2024-02-26 13:52   ` Jonathan Cameron
2024-02-27 22:27     ` John Groves
2024-02-23 17:42 ` [RFC PATCH 19/20] famfs: Update MAINTAINERS file John Groves
2024-02-23 17:42 ` [RFC PATCH 20/20] famfs: Add Kconfig and Makefile plumbing John Groves
2024-02-24  1:50   ` Randy Dunlap
2024-02-24  2:24     ` John Groves
2024-02-24  0:07 ` [RFC PATCH 00/20] Introduce the famfs shared-memory file system Luis Chamberlain
2024-02-26 13:27   ` John Groves
2024-02-26 15:53     ` Luis Chamberlain
2024-02-26 21:16       ` John Groves
2024-02-27  0:58         ` Luis Chamberlain
2024-02-27  2:05           ` John Groves
2024-02-29  2:15             ` Dave Chinner
2024-02-29 14:52               ` John Groves
2024-03-11  1:29                 ` Dave Chinner
2024-02-29  6:52 ` Amir Goldstein
2024-02-29 22:16   ` John Groves [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3jwluwrqj6rwsxdsksfvdeo5uccgmnkh7rgefaeyxf2gu75344@ybhwncywkftx \
    --to=john@groves.net \
    --cc=amir73il@gmail.com \
    --cc=brauner@kernel.org \
    --cc=corbet@lwn.net \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dave.jiang@intel.com \
    --cc=david@fromorbit.com \
    --cc=gregory.price@memverge.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=jgroves@micron.com \
    --cc=john@jagalactic.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=nvdimm@lists.linux.dev \
    --cc=vgoyal@redhat.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=vishal.l.verma@intel.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).