linux-fsdevel.vger.kernel.org archive mirror
From: "John Groves" <john@jagalactic.com>
To: "Miklos Szeredi" <miklos@szeredi.hu>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"jgroves@micron.com" <jgroves@micron.com>,
	"Amir Goldstein" <amir73il@gmail.com>,
	"fuse-devel@lists.sourceforge.net"
	<fuse-devel@lists.sourceforge.net>
Subject: Re: Question about fuse dax support
Date: Thu, 12 Oct 2023 14:10:57 +0000
Message-ID: <0100018b2439ebf3-a442db6f-f685-4bc4-b4b0-28dc333f6712-000000@email.amazonses.com>
In-Reply-To: <CAJfpegsvhbmAYD22Y981BiV8ut7QfZbRZMvGY7Vs-hCM2L+=dQ@mail.gmail.com>

On 23/10/10 04:06PM, Miklos Szeredi wrote:
> On Fri, 6 Oct 2023 at 20:12, John Groves <john@jagalactic.com> wrote:
> >
> > I see that there is some limited support for dax mapping of fuse files, but
> > it seems to be specifically for virtiofs. I admit I barely understand that
> > use case, but there is another fuse/dax use case that I’d like to explore.
> > I would appreciate feedback on this, including pointers to RTFM material,
> > etc.
> >
> > I’m interested in creating a file system interface to fabric-attached shared
> > memory (cxl). Think of a fuse file system that receives metadata (how MD is
> > distributed is orthogonal) and instantiates files that are backed by dax
> > memory (S_DAX files), such that the same ‘data sets’ can be visible as
> > mmap-able files on more than one server. I’d like feedback as to whether
> > this is (or could be) doable via fuse.
> >
> > Here is the main rub though. For this to perform adequately, I don’t think
> > it would be acceptable for each fault to call up to user space to resolve
> > the dax device & offset. So the kernel side of fuse would need to cache a
> > dax extent list for each file to resolve TLB/page-table misses.
> >
> > I would appreciate any questions, pointers or feedback.
> 
> I think the passthrough patches should take care of this use case as well:
> 
> https://lore.kernel.org/all/20230519125705.598234-1-amir73il@gmail.com/
>
> Thanks,
> Miklos

Thanks for the reply Miklos.

I've looked over that patch set, and I'm pretty sure it's not what is needed
for my use case. I can see how my statement above "backed by dax memory
(S_DAX files)" could have implied that there is an S_DAX backing file that
fuse could use - but there is not already a backing file, just a dax device.

So it is the fuse file that would need to have the S_DAX flag and handle
mapping to an extent list from the dax device (rather than referring to a
backing file that already does this).

This is a performance-sensitive use case - S_DAX files are for direct access
to memory (duh). POSIX read/write are supported, but the main use case for
S_DAX files is performant mmap. So I think this can only be viable if:

1) The fuse kernel module supports files with the S_DAX flag, and performs the
   appropriate mapping in conjunction with the dax driver (iomap, etc.)
2) The fuse kernel module caches the extent list of the backing memory
   (extents of the form [offset, length] or [device, offset, length]) so that
   page faults (page-table misses) can be resolved in the kernel without
   calling out to the user space handler.
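
To make point 2 concrete, here is a rough userspace sketch of the kind of
lookup a cached extent list would allow. Everything in it is an assumption
for illustration: the struct names, the fields, and resolve_file_offset()
are hypothetical and are not an existing fuse or dax interface. It only
shows how a file offset could be translated to [device, offset, length]
without a round trip to user space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent descriptor: [device, offset, length]. */
struct dax_extent {
	uint32_t dev_id;     /* which dax device backs this extent */
	uint64_t dev_offset; /* byte offset of the extent on that device */
	uint64_t length;     /* extent length in bytes */
};

/* Hypothetical per-file cache; extents are stored in file order and
 * are assumed to cover the file contiguously starting at offset 0. */
struct file_extent_list {
	size_t nr_extents;
	const struct dax_extent *extents;
};

/* Result of resolving a file offset against the cached list. */
struct dax_target {
	uint32_t dev_id;
	uint64_t dev_offset;
	uint64_t contig_len; /* bytes left in the extent from this point */
};

/* Translate a file offset to a device offset using only cached data.
 * Returns false if the offset lies beyond the last extent. */
bool resolve_file_offset(const struct file_extent_list *list,
			 uint64_t file_offset, struct dax_target *out)
{
	uint64_t extent_start = 0; /* file offset where extent i begins */

	assert(list != NULL && out != NULL);

	for (size_t i = 0; i < list->nr_extents; i++) {
		const struct dax_extent *e = &list->extents[i];
		uint64_t delta = file_offset - extent_start;

		if (delta < e->length) {
			out->dev_id = e->dev_id;
			out->dev_offset = e->dev_offset + delta;
			out->contig_len = e->length - delta;
			return true;
		}
		extent_start += e->length;
	}
	return false;
}
```

A linear walk is only reasonable for short extent lists; a file with many
extents would presumably want a sorted array with binary search or a tree.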

The existence of some fuse/dax support for virtiofs suggested to me, perhaps
naively, that there might be a way of doing this - but I may be wrong about
that.

Please let me know your thoughts on this.

Also: I will be at Linux Plumbers, and I'm speaking about this use case in
the cxl microconference. If you will be there, perhaps we can discuss it.
Any others interested in discussing it, here or at Plumbers, please ping me.

Thanks,
John Groves
Micron


